Sharpened Quasi-Newton Methods with Explicit Superlinear Rates
Though the Newton method is known to achieve superlinear rates, the non-asymptotic rate of convergence of quasi-Newton methods remained an open question till 2019. In a series of papers published by Nesterov in 2020, such non-asymptotic rates for various quasi-Newton methods was established, and since then, a number of works have been able to compare different quasi-Newton methods on the basis of their radius of convergence as well as convergence rate. To begin with, several recent works have established a non-asymptotic superlinear rate of O(exp(t*log(t))) for the classic BFGS method by exploiting the fact that its Newton direction approximation error approaches zero. Of more interest however is the Greedy BFGS method, which accelerates its convergence by directly approximating the Hessian, instead of the Newton direction, and achieves a fast local quadratic convergence rate. Alas, the local quadratic convergence of Greedy-BFGS requires way more updates compared to the number of iterations that BFGS requires for a local superlinear rate. This is due to the fact that in Greedy-BFGS, the Hessian is directly approximated and the Newton direction approximation may not be as accurate as the one for BFGS. We have proposed the Sharpened BFGS method, which closes the gap by leveraging the approximation ideas of both BFGS and Greedy-BFGS to properly approximate the Newton direction and the Hessian matrix simultaneously. Our theoretical results show that our method out-performs both BFGS and Greedy-BFGS in terms of convergence rate, while it reaches its quadratic convergence rate with fewer steps compared to Greedy-BFGS. Numerical experiments on various datasets also confirm our theoretical findings.
Ketan Rajawat (S’06–M’12) received his B.Tech and M.Tech degrees in Electrical Engineering from the Indian Institute of Technology (IIT) Kanpur, India, in 2007, and his Ph.D. degree in Electrical and Computer Engineering from the University of Minnesota, Minneapolis, MN, USA, in 2012. He is currently an Associate Professor in the Department of Electrical Engineering, IIT Kanpur. His research interests are in the broad areas of signal processing, robotics, and communications networks, with particular emphasis on distributed optimization and online learning. His current research focuses on the development and analysis of distributed and asynchronous optimization algorithms, online convex optimization algorithms, stochastic optimization algorithms, and the application of these algorithms to problems in machine learning, communications, signal processing, and smart grid systems. He is currently serving as an Associate Editor with the IEEE Transactions on Signal Processing. He is also the recipient of the 2018 INSA Medal for Young Scientists and the2019 INAE Young Engineer Award.