optimization/preconditioned_conjugate_gradient_method.m

function [x, k] = preconditioned_conjugate_gradient_method(Q, ...
                                                           M, ...
                                                           b, ...
                                                           x0, ...
                                                           tolerance, ...
                                                           max_iterations)
  %
  % Solve,
  %
  %   Qx = b
  %
  % or equivalently,
  %
  %   min [phi(x) = (1/2)*<Qx,x> - <b,x>]
  %
  % using the preconditioned conjugate gradient method (14.56 in
  % Guler). If ``M`` is the identity matrix, this reduces to the
  % ordinary conjugate gradient method, for which
  % conjugate_gradient_method.m provides a slightly faster
  % implementation.
  %
  % INPUT:
  %
  %   - ``Q`` -- The coefficient matrix of the system to solve. Must
  %     be symmetric positive definite.
  %
  %   - ``M`` -- The preconditioning matrix. If the actual matrix used
  %     to precondition ``Q`` is called ``C``, i.e. ``C^(-1) * Q *
  %     C^(-T) == \bar{Q}``, then M=CC^T. However, the matrix ``C``
  %     itself is never needed; we only ever solve systems with ``M``.
  %     This is explained in Guler, section 14.9.
  %
  %   - ``b`` -- The right-hand side of the system to solve.
  %
  %   - ``x0`` -- The starting point for the search.
  %
  %   - ``tolerance`` -- We stop once the Euclidean norm of the
  %     residual ``Qx - b`` falls below this value.
  %
  %   - ``max_iterations`` -- The maximum number of iterations to
  %     perform.
  %
  % OUTPUT:
  %
  %   - ``x`` -- The approximate solution to Qx=b.
  %
  %   - ``k`` -- The ending value of k; that is, the number of
  %     iterations that were performed.
  %
  % NOTES:
  %
  % All vectors are assumed to be *column* vectors.
  %
  % The cited algorithm contains a typo; in "The Preconditioned
  % Conjugate-Gradient Method", we are supposed to define
  % d_{0} = -z_{0}, not -r_{0} as written.
  %
  % REFERENCES:
  %
  %   1. Guler, Osman. Foundations of Optimization. New York, Springer,
  %   2010.
  %

  % Set k=0 first, so that the references to xk,rk,zk,dk which
  % immediately follow correspond to x0,r0,z0,d0 respectively.
  k = 0;

  xk = x0;
  rk = Q*xk - b;
  zk = M \ rk;
  dk = -zk;

  % Stop as soon as the residual is small enough, or once
  % max_iterations iterations have been performed. A while loop is
  % used because reassigning the counter of a for loop has no effect
  % on the iteration, which allowed one step too many.
  while (k < max_iterations && norm(rk) >= tolerance)
    % Used twice, avoid recomputation.
    rkzk = rk' * zk;

    % x_next and r_next could share alpha_k*dk, or they could share
    % Q*dk, but not both. We precompute the more expensive
    % matrix-vector product, which alpha_k needs anyway.
    Qdk = Q * dk;

    alpha_k = rkzk/(dk' * Qdk);
    x_next = xk + (alpha_k * dk);
    r_next = rk + (alpha_k * Qdk);
    z_next = M \ r_next;
    beta_next = (r_next' * z_next)/rkzk;
    d_next = -z_next + beta_next*dk;

    k = k + 1;
    xk = x_next;
    rk = r_next;
    zk = z_next;
    dk = d_next;
  end

  x = xk;
end
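
% EXAMPLE:
%
% A minimal sketch, not part of the original file, written as a GNU
% Octave self-test block. Assuming Octave's test harness, it can be
% run with "test preconditioned_conjugate_gradient_method"; under
% MATLAB these lines are ordinary comments. The 2x2 system and the
% Jacobi (diagonal) preconditioner are illustrative assumptions,
% chosen because the exact solution [1/11; 7/11] is easy to check
% by hand.
%
%!test
%! Q = [4, 1; 1, 3];      % symmetric positive definite
%! M = diag(diag(Q));     % Jacobi preconditioner (an assumption)
%! b = [1; 2];
%! x0 = zeros(2, 1);
%! [x, k] = preconditioned_conjugate_gradient_method(Q, M, b, ...
%!                                                   x0, 1e-10, 100);
%! assert(x, [1/11; 7/11], 1e-8);   % absolute tolerance
%! assert(k <= 100);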