[octave.git] / optimization / steepest_descent.m
function [x, k] = steepest_descent(g, ...
                                   x0, ...
                                   step_size, ...
                                   tolerance, ...
                                   max_iterations)
  %
  % An implementation of the steepest-descent algorithm, with the
  % search direction d_{k} = -\nabla f(x_{k}).
  %
  % We terminate when either,
  %
  %   a) The infinity-norm of the gradient at x_{k} falls to
  %      ``tolerance`` or below, or
  %
  %   b) We have exceeded ``max_iterations`` iterations.
  %
  % INPUT:
  %
  % * ``g`` - the gradient of the function ``f`` we're minimizing.
  %
  % * ``x0`` - the starting point for the search.
  %
  % * ``step_size`` - a function of x which returns the optimal step
  %   size alpha_{k}. This will generally require more information
  %   than just x; for example, it might need the function ``f`` or
  %   the gradient ``g``. However, our caller has this information,
  %   so it should be incorporated into the step-size algorithm that
  %   we receive.
  %
  % * ``tolerance`` - the stopping tolerance. We stop when the
  %   infinity-norm of the gradient falls to this value or below.
  %
  % * ``max_iterations`` - a safety net; we return x_{k}
  %   unconditionally if we exceed this number of iterations.
  %
  % OUTPUT:
  %
  % * ``x`` - the solution at the final value of x_{k}.
  %
  % * ``k`` - the value of k when we stop; i.e. the number of
  %   iterations performed.
  %
  % NOTES:
  %
  % A specialized implementation for solving e.g. Qx=b can avoid one
  % matrix-vector multiplication below: the gradient there is
  % g(x) = Qx - b, so g_{k+1} = g_{k} + alpha_{k}*Q*d_{k}, and the
  % product Q*d_{k} is already needed to compute the exact step size.
  %

  % The initial gradient at x_{0} is not supplied, so we compute it
  % here and begin the loop at k=0.
  k = 0;
  xk = x0;
  gk = g(xk);

  while (k <= max_iterations && norm(gk, 'inf') > tolerance)
    % Loop until either of our stopping conditions is met. This also
    % works when x0 is already a satisfactory point.

    dk = -gk;
    alpha_k = step_size(xk);
    xk = xk + (alpha_k * dk);
    gk = g(xk);

    % We potentially just performed one more iteration than necessary
    % in order to simplify the loop. Note that, due to the structure
    % of our loop, we will have k > max_iterations when we fail to
    % converge.
    k = k + 1;
  end

  x = xk;
end
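As a usage sketch (not part of the file): the quadratic f(x) = x'*Q*x/2 - b'*x has gradient g(x) = Q*x - b, minimizer Q \ b, and a closed-form exact line search alpha_{k} = (g'*g)/(g'*Q*g), which makes it a convenient smoke test. The function body is repeated inline (after the `1;` script marker) so the snippet runs standalone in Octave; the particular `Q` and `b` are arbitrary illustrative data, not anything from the repository.

```octave
1;  % mark this file as a script, not a function file

function [x, k] = steepest_descent(g, x0, step_size, tolerance, max_iterations)
  k = 0;
  xk = x0;
  gk = g(xk);
  while (k <= max_iterations && norm(gk, 'inf') > tolerance)
    dk = -gk;
    alpha_k = step_size(xk);
    xk = xk + (alpha_k * dk);
    gk = g(xk);
    k = k + 1;
  end
  x = xk;
end

% Minimize f(x) = x'*Q*x/2 - b'*x, whose gradient is g(x) = Q*x - b.
Q = [2, 0; 0, 4];
b = [1; 1];
g = @(x) Q*x - b;

% Exact line search for this quadratic: alpha_{k} = (g'*g)/(g'*Q*g).
step_size = @(x) (g(x)' * g(x)) / (g(x)' * Q * g(x));

[x, k] = steepest_descent(g, [0; 0], step_size, 1e-10, 1000);
disp(x')  % approximately Q \ b = [0.5; 0.25]
```

With the condition number kappa = 2 here, the gradient norm contracts by roughly (kappa-1)/(kappa+1) = 1/3 per iteration, so convergence to the 1e-10 tolerance takes a few dozen iterations at most.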