function [x, k] = preconditioned_conjugate_gradient_method(A,
                                                            M,
                                                            b,
                                                            x0,
                                                            tolerance,
                                                            max_iterations)
  %
  % Solve,
  %
  %   Ax = b
  %
  % or equivalently,
  %
  %   min [phi(x) = (1/2)*<Ax,x> - <b,x>]
  %
  % using the preconditioned conjugate gradient method (14.56 in
  % Guler). If ``M`` is the identity matrix, we use the slightly
  % faster implementation in conjugate_gradient_method.m.
  %
  % INPUT:
  %
  %   - ``A`` -- The coefficient matrix of the system to solve. Must
  %     be positive definite.
  %
  %   - ``M`` -- The preconditioning matrix. If the actual matrix used
  %     to precondition ``A`` is called ``C``, i.e. ``C^(-1) * A *
  %     C^(-T) == \bar{A}``, then M=CC^T. However, the matrix ``C`` is
  %     never itself needed. This is explained in Guler, section 14.9.
  %
  %   - ``b`` -- The right-hand side of the system to solve.
  %
  %   - ``x0`` -- The starting point for the search.
  %
  %   - ``tolerance`` -- How close ``Ax`` has to be to ``b`` (in
  %     magnitude) before we stop.
  %
  %   - ``max_iterations`` -- The maximum number of iterations to
  %     perform.
  %
  % OUTPUT:
  %
  %   - ``x`` -- The solution to Ax=b.
  %
  %   - ``k`` -- The ending value of k; that is, the number of
  %     iterations that were performed.
  %
  % NOTES:
  %
  %   All vectors are assumed to be *column* vectors.
  %
  %   The cited algorithm contains a typo; in "The Preconditioned
  %   Conjugate-Gradient Method", we are supposed to define
  %   d_{0} = -z_{0}, not -r_{0} as written.
  %
  % REFERENCES:
  %
  %   1. Guler, Osman. Foundations of Optimization. New York, Springer,
  %      2010.
  %
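  % EXAMPLE:
  %
  %   A minimal usage sketch (not from the reference; the small SPD
  %   matrix, tolerance, and Jacobi-style preconditioner
  %   M = diag(diag(A)) are made-up values for illustration):
  %
  %     A  = [5, 1, 0; 1, 4, 1; 0, 1, 3];
  %     M  = diag(diag(A));
  %     b  = [1; 2; 3];
  %     x0 = zeros(3, 1);
  %     [x, k] = preconditioned_conjugate_gradient_method(A, M, b, x0, 1e-10, 100);
  %     norm(A*x - b) % should be below the tolerance
  %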
  n = length(x0);

  if (isequal(M, eye(n)))
    [x, k] = conjugate_gradient_method(A, b, x0, tolerance, max_iterations);
    return;
  end

  zero_vector = zeros(n, 1);

  k = 0;
  x = x0; % Eschew the 'k' suffix on 'x' for simplicity.
  rk = A*x - b; % The first residual must be computed the hard way.
  zk = M \ rk;
  dk = -zk;

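  % Each pass of the loop below performs one PCG iteration: step along
  % the current search direction dk, update the residual, apply the
  % preconditioner, and build the next search direction.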
  for k = [ 0 : max_iterations ]
    if (norm(rk) < tolerance)
      % Success.
      return;
    end

    % Unfortunately, since we don't know the matrix ``C``, it isn't
    % easy to compute alpha_k with an existing step size function.
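    % In exact arithmetic, this formula is the exact line-search step:
    % the alpha_k that minimizes phi along the ray x + alpha*dk.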
    alpha_k = (rk' * zk)/(dk' * A * dk);
    x_next = x + alpha_k*dk;
    r_next = rk + alpha_k*A*dk;
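    % By linearity, r_next equals A*x_next - b without forming it from
    % scratch; z_next then applies the preconditioner by solving
    % M*z_next = r_next.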
    z_next = M \ r_next;
    beta_next = (r_next' * z_next)/(rk' * zk);
    d_next = -z_next + beta_next*dk;
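    % The beta_next correction keeps d_next A-conjugate to the previous
    % search direction (in exact arithmetic); without it this would be
    % plain preconditioned steepest descent.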

    k = k + 1;
    x = x_next;
    rk = r_next;
    zk = z_next;
    dk = d_next;
  end
end
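
% A small self-test in Octave's %!test format, added as an illustrative
% sanity check; the system, preconditioner, starting point, and
% tolerance below are made-up values and not from the reference.
%!test
%! A = [2, 1; 1, 3];
%! M = diag(diag(A));  % Jacobi (diagonal) preconditioner.
%! b = [1; 2];
%! x0 = [0; 0];
%! [x, k] = preconditioned_conjugate_gradient_method(A, M, b, x0, 1e-11, 100);
%! assert(norm(A*x - b) < 1e-11);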