The (L-)BFGS Algorithm

The BFGS (Broyden-Fletcher-Goldfarb-Shanno) Algorithm is a Quasi Newton Method. It is based on the Newton Method which iteratively determines a root of an N-dimensional function (based on the first order Taylor approximation of this function). Since at a minimum of a (cost-)function the gradient is equal to zero, the method can also be used as a minimization algorithm. The algorithm needs to be able to evaluate the gradient of the function in any N-dimensional parameter. Since the inverse of the Hessian is also needed but hard to find, a certain approximation of this inverse is determined at each iteration (hence the term Quasi). A user should provide an initial parameter guess (and possibly an initial approximation of the inverse of the Hessian) and the method determines iteratively better approximations of the actual minimum, after updating the approximation of the inverse of the Hessian. This is done by reducing the multi-dimensional problem into a sequence of line minimizations. The line minimizations are done by Brent's method, which determines the minimum of the (cost-)function on a line with starting point the current parameters and direction the so-called search direction. The search direction is determined by the approximation of the inverse Hessian. The iteration is performed until one of the stopping criteria is satisfied.

The L-BFGS (Limited Memory-BFGS) Algorithm does not compute and store an entire matrix but instead uses a number of vectors of updates of the position and of the gradient, which represent the approximation of the inverse Hessian implicitly.