function [x, k] = steepest_descent(g, x0, step_size, tolerance, max_iterations)
  %
  % An implementation of the steepest-descent algorithm, with the
  % search direction d_{k} = -\nabla f(x_{k}).
  %
  % We terminate when either,
  %
  %   a) The 2-norm of the gradient at x_{k} falls below
  %      ``tolerance``.
  %
  %   b) We have performed ``max_iterations`` iterations.
  %
  % INPUT:
  %
  %   * ``g`` - the gradient of the function ``f`` we're minimizing.
  %
  %   * ``x0`` - the starting point for the search.
  %
  %   * ``step_size`` - a function of x which returns the optimal step
  %     size alpha_{k}. This will generally require more information
  %     than just x; for example it might need the function ``f`` or
  %     the gradient ``g``. However, our caller has this information,
  %     so it should be incorporated into the step size algorithm that
  %     we receive.
  %
  %   * ``tolerance`` - the stopping tolerance. We stop when the norm
  %     of the gradient falls below this value.
  %
  %   * ``max_iterations`` - a safety net; we return x_{k}
  %     unconditionally if we exceed this number of iterations.
  %
  % OUTPUT:
  %
  %   * ``x`` - the solution at the final value of x_{k}.
  %
  %   * ``k`` - the value of k when we stop; i.e. the number of
  %     iterations performed.
  %
  % NOTES:
  %
  % A specialized implementation for solving e.g. Qx=b can avoid one
  % matrix-vector multiplication below.
  %

  % The initial gradient at x_{0} is not supplied, so we compute it
  % here and begin the loop at k=0.
  k = 0;
  xk = x0;
  gk = g(xk);

  while (k <= max_iterations)
    % Loop until either of our stopping conditions is met. If the
    % loop finishes, we have implicitly met the second stopping
    % condition (number of iterations).
    if (norm(gk) < tolerance) % This catches the k=0 case, too.
      x = xk;
      return;
    end

    % Take a step of length alpha_{k} in the steepest-descent
    % direction d_{k} = -g_{k}, then recompute the gradient at the
    % new iterate.
    dk = -gk;
    alpha_k = step_size(xk);
    xk = xk + (alpha_k * dk);
    gk = g(xk);

    % We potentially just performed one more iteration than necessary
    % in order to simplify the loop. Note that due to the structure of
    % our loop, we will have k > max_iterations when we fail to
    % converge.
    k = k + 1;
  end

  x = xk;
end
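
% EXAMPLE (a minimal usage sketch, not part of the function above):
% minimizing the quadratic f(x) = (1/2)*x'*Q*x - b'*x, whose gradient
% is g(x) = Q*x - b. For this f, the exact line-search step size has
% the closed form alpha_{k} = (g'*g)/(g'*Q*g). The names ``Q``, ``b``,
% and ``alpha`` below are illustrative, not required by the interface.
%
%   Q = [2, 0; 0, 4];   % symmetric positive-definite
%   b = [1; 1];
%   g = @(x) Q*x - b;
%   alpha = @(x) (g(x)' * g(x)) / (g(x)' * Q * g(x));
%   [x, k] = steepest_descent(g, [0; 0], alpha, 1e-8, 1000);
%   % x approximates Q \ b = [0.5; 0.25].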
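
% Below is a sketch of the specialization mentioned in the NOTES, for
% solving Qx=b with Q symmetric positive-definite (equivalently,
% minimizing f(x) = (1/2)*x'*Q*x - b'*x). Since the gradient satisfies
% g_{k+1} = g_{k} - alpha_{k}*(Q*g_{k}), we can reuse the product
% Q*g_{k} already needed for the exact step size, avoiding one
% matrix-vector multiplication per iteration. This function and its
% name are illustrative; they are not part of the interface above.
function [x, k] = steepest_descent_qb(Q, b, x0, tolerance, max_iterations)
  k = 0;
  xk = x0;
  gk = Q*xk - b;

  while (k <= max_iterations)
    if (norm(gk) < tolerance)
      x = xk;
      return;
    end

    Qgk = Q*gk;                         % the only matrix-vector product
    alpha_k = (gk' * gk) / (gk' * Qgk); % exact line-search step size
    xk = xk - (alpha_k * gk);           % step along d_{k} = -g_{k}
    gk = gk - (alpha_k * Qgk);          % gradient update reusing Q*g_{k}

    k = k + 1;
  end

  x = xk;
end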