optimization/preconditioned_conjugate_gradient_method.m

function [x, k] = preconditioned_conjugate_gradient_method(Q, ...
                                                           M, ...
                                                           b, ...
                                                           x0, ...
                                                           tolerance, ...
                                                           max_iterations)
  %
  % Solve
  %
  %   Qx = b
  %
  % or equivalently,
  %
  %   min [phi(x) = (1/2)*<Qx,x> - <b,x>]
  %
  % using the preconditioned conjugate gradient method (14.56 in
  % Guler). If ``M`` is the identity matrix, this reduces to the
  % ordinary conjugate gradient method, and the slightly faster
  % implementation in conjugate_gradient_method.m can be used instead.
  %
  % INPUT:
  %
  %   - ``Q`` -- The coefficient matrix of the system to solve. Must
  %     be symmetric positive definite.
  %
  %   - ``M`` -- The preconditioning matrix. If the actual matrix used
  %     to precondition ``Q`` is called ``C``, i.e. ``C^(-1) * Q *
  %     C^(-T) == \bar{Q}``, then ``M = C*C^T``. However, the matrix
  %     ``C`` itself is never needed. This is explained in Guler,
  %     section 14.9.
  %
  %   - ``b`` -- The right-hand side of the system to solve.
  %
  %   - ``x0`` -- The starting point for the search.
  %
  %   - ``tolerance`` -- We stop once the infinity norm of the
  %     residual ``Qx - b`` is no greater than ``tolerance``.
  %
  %   - ``max_iterations`` -- The iteration limit. The loop runs while
  %     ``k <= max_iterations``, so at most ``max_iterations + 1``
  %     iterations are performed.
  %
  % OUTPUT:
  %
  %   - ``x`` -- The computed solution to Qx=b.
  %
  %   - ``k`` -- The ending value of k; that is, the number of
  %     iterations that were performed. A value of
  %     ``k > max_iterations`` means that we hit the iteration limit
  %     before convergence was detected.
  %
  % NOTES:
  %
  % All vectors are assumed to be *column* vectors.
  %
  % The cited algorithm contains a typo; in "The Preconditioned
  % Conjugate-Gradient Method", we are supposed to define
  % d_{0} = -z_{0}, not -r_{0} as written.
  %
  % The rather verbose name of this function was chosen to avoid
  % conflicts with other implementations.
  %
  % REFERENCES:
  %
  %   1. Guler, Osman. Foundations of Optimization. New York, Springer,
  %      2010.
  %
  %   2. Shewchuk, Jonathan Richard. An Introduction to the Conjugate
  %      Gradient Method Without the Agonizing Pain, Edition 1.25.
  %      August 4, 1994.
  %

  % We use this in the main loop.
  n = length(x0);
  sqrt_n = floor(sqrt(n));

  % Set k=0 first, so that the references to xk,rk,zk,dk which
  % immediately follow correspond (semantically) to x0,r0,z0,d0.
  k = 0;

  xk = x0;
  rk = Q*xk - b;
  zk = M \ rk;
  dk = -zk;

  while (k <= max_iterations && norm(rk, 'inf') > tolerance)
    % Used twice, avoid recomputation.
    rkzk = rk' * zk;

    % Both alpha_k*dk and Q*dk appear twice in the update formulas,
    % but we can only reuse one of them, so we precompute the more
    % expensive product.
    Qdk = Q * dk;

    % We're going to divide by this quantity...
    dkQdk = dk' * Qdk;

    % So if it's too close to zero, we replace it with something
    % comparable but non-zero.
    if (dkQdk < eps)
      dkQdk = eps;
    end

    alpha_k = rkzk/dkQdk;
    x_next = xk + (alpha_k * dk);

    % The recursive definition of r_next is prone to accumulate
    % roundoff error. When floor(sqrt(n)) divides k, we recompute the
    % residual from scratch to minimize this error. This modification
    % was suggested by the second reference.
    if (mod(k, sqrt_n) == 0)
      r_next = Q*x_next - b;
    else
      r_next = rk + (alpha_k * Qdk);
    end

    z_next = M \ r_next;
    beta_next = (r_next' * z_next)/rkzk;
    d_next = -z_next + beta_next*dk;

    % We may have just done more work than necessary (z_next and
    % d_next are not needed if r_next already meets the tolerance),
    % but this keeps the loop simple. Note that due to the structure
    % of our loop, we will have k > max_iterations when we fail to
    % converge.
    k = k + 1;
    xk = x_next;
    rk = r_next;
    zk = z_next;
    dk = d_next;
  end

  % If we make it here, one of the two stopping conditions was met.
  x = xk;
end
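
% EXAMPLES:
%
% A minimal sketch of how the function above might be exercised,
% assuming the file is run under GNU Octave, whose ``test`` command
% executes ``%!test`` blocks (MATLAB treats them as ordinary
% comments). The 2x2 system, the starting point, and the Jacobi
% preconditioner M = diag(diag(Q)) are illustrative choices, not taken
% from the references.
%
%!test
%! % Jacobi-preconditioned solve, checked against a direct solve.
%! Q = [4, 1; 1, 3];
%! b = [1; 2];
%! x0 = [2; 1];
%! M = diag(diag(Q));
%! [x, k] = preconditioned_conjugate_gradient_method(Q, M, b, x0, ...
%!                                                   1e-10, 100);
%! assert(k <= 100);
%! assert(norm(Q*x - b, 'inf') <= 1e-10);
%! assert(x, Q \ b, 1e-8);
%
%!test
%! % With M equal to the identity we get ordinary conjugate gradient.
%! Q = [4, 1; 1, 3];
%! b = [1; 2];
%! x0 = [2; 1];
%! [x, k] = preconditioned_conjugate_gradient_method(Q, eye(2), b, ...
%!                                                   x0, 1e-10, 100);
%! assert(k <= 100);
%! assert(x, Q \ b, 1e-8);
%
%!test
%! % The loop runs while k <= max_iterations, so with a limit of zero
%! % and an x0 that is not yet a solution, the body runs once and k
%! % ends at 1, which is greater than max_iterations.
%! Q = [4, 1; 1, 3];
%! b = [1; 2];
%! x0 = [2; 1];
%! M = diag(diag(Q));
%! [x, k] = preconditioned_conjugate_gradient_method(Q, M, b, x0, ...
%!                                                   1e-10, 0);
%! assert(k, 1);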