Add the first working version of the preconditioned CGM.
[octave.git] / optimization / preconditioned_conjugate_gradient_method.m
function [x, k] = preconditioned_conjugate_gradient_method(A,
                                                           M,
                                                           b,
                                                           x0,
                                                           tolerance,
                                                           max_iterations)
  %
  % Solve,
  %
  %   Ax = b
  %
  % or equivalently,
  %
  %   min [phi(x) = (1/2)*<Ax,x> - <b,x>]
  %
  % using the preconditioned conjugate gradient method (14.56 in
  % Guler). If ``M`` is the identity matrix, we use the slightly
  % faster implementation in conjugate_gradient_method.m.
  %
  % INPUT:
  %
  %   - ``A`` -- The coefficient matrix of the system to solve. Must
  %     be symmetric positive definite.
  %
  %   - ``M`` -- The preconditioning matrix. If the actual matrix used
  %     to precondition ``A`` is called ``C``, i.e. ``C^(-1) * A *
  %     C^(-T) == \bar{A}``, then M=CC^T. However, the matrix ``C`` is
  %     never itself needed. This is explained in Guler, section 14.9.
  %
  %   - ``b`` -- The right-hand side of the system to solve.
  %
  %   - ``x0`` -- The starting point for the search.
  %
  %   - ``tolerance`` -- We stop once the norm of the residual
  %     ``Ax - b`` falls below this value.
  %
  %   - ``max_iterations`` -- The maximum number of iterations to
  %     perform.
  %
  % OUTPUT:
  %
  %   - ``x`` -- The approximate solution to Ax=b.
  %
  %   - ``k`` -- The ending value of k; that is, the number of
  %     iterations that were performed.
  %
  % NOTES:
  %
  % All vectors are assumed to be *column* vectors.
  %
  % The cited algorithm contains a typo; in "The Preconditioned
  % Conjugate-Gradient Method", we are supposed to define
  % d_{0} = -z_{0}, not -r_{0} as written.
  %
  % REFERENCES:
  %
  %   1. Guler, Osman. Foundations of Optimization. New York, Springer,
  %      2010.
  %
  n = length(x0);

  if (isequal(M, eye(n)))
    [x, k] = conjugate_gradient_method(A, b, x0, tolerance, max_iterations);
    return;
  end

  k = 0;
  x = x0;        % Eschew the 'k' suffix on 'x' for simplicity.
  rk = A*x - b;  % The first residual must be computed the hard way.
  zk = M \ rk;   % Preconditioned residual; solves M*zk = rk.
  dk = -zk;      % First search direction (see NOTES).

  % Stop on convergence, or after max_iterations updates.
  while (k < max_iterations && norm(rk) >= tolerance)
    % Unfortunately, since we don't know the matrix ``C``, it isn't
    % easy to compute alpha_k with an existing step size function.
    Adk = A*dk;  % Computed once; used for both alpha_k and r_next.
    alpha_k = (rk' * zk)/(dk' * Adk);
    x_next = x + alpha_k*dk;
    r_next = rk + alpha_k*Adk;
    z_next = M \ r_next;
    beta_next = (r_next' * z_next)/(rk' * zk);
    d_next = -z_next + beta_next*dk;

    k = k + 1;
    x = x_next;
    rk = r_next;
    zk = z_next;
    dk = d_next;
  end
end
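
A usage sketch, assuming preconditioned_conjugate_gradient_method.m is on the Octave load path. The 2x2 system, the Jacobi (diagonal) preconditioner, and the tolerance are illustrative choices and are not part of the commit above.

```octave
% Hypothetical example data: a small symmetric positive definite system.
A = [4, 1;
     1, 3];
b = [1; 2];

% Jacobi preconditioner (an assumption for illustration): the diagonal
% of A. It is not the identity, so the preconditioned code path is used
% rather than the fallback to conjugate_gradient_method.m.
M = diag(diag(A));

x0 = zeros(2, 1);       % Column vector, as the function expects.
tolerance = 1e-10;
max_iterations = 100;

[x, k] = preconditioned_conjugate_gradient_method(A, M, b, x0, ...
                                                  tolerance, ...
                                                  max_iterations);

% Compare against Octave's direct solver as a sanity check.
x_direct = A \ b;
printf("iterations: %d\n", k);
printf("residual norm: %g\n", norm(A*x - b));
printf("distance from A\\b: %g\n", norm(x - x_direct));
```

For a system this small the loop should stop after only a few iterations; the direct solve is there only to confirm that the two answers agree.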