Implement conjugate_gradient_method() in terms of the preconditioned CGM.
Add the slow, simple preconditioned CGM as simple_preconditioned_cgm().
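
A rough usage sketch (illustration only; it assumes the functions in this
patch, plus preconditioned_conjugate_gradient_method, are on the Octave
load path, and the diagonal preconditioner below is just an arbitrary
example):

    A  = [5, 1; 1, 3];   % a small symmetric positive-definite system
    b  = [1; 2];
    x0 = zeros(2, 1);

    % The standard CGM is now a thin wrapper around the preconditioned
    % CGM with the identity preconditioner.
    [x1, k1] = conjugate_gradient_method(A, b, x0, 1e-10, 100);

    % The slow-but-simple variant takes an explicit preconditioner M;
    % here we use the diagonal (Jacobi) choice as one example.
    [x2, k2] = simple_preconditioned_cgm(A, diag(diag(A)), b, x0, 1e-10, 100);

    % Both should agree with the direct solve up to (roughly) the tolerance.
    disp(norm(x1 - A\b));
    disp(norm(x2 - A\b));
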
%
  %   min [phi(x) = (1/2)*<Ax,x> - <b,x>]
%
- % using the conjugate_gradient_method (Algorithm 5.2 in Nocedal and
- % Wright).
+ % using the conjugate_gradient_method.
%
% INPUT:
%
%
% All vectors are assumed to be *column* vectors.
%
- zero_vector = zeros(length(x0), 1);
+ n = length(x0);
+ M = eye(n);
- k = 0;
- x = x0; % Eschew the 'k' suffix on 'x' for simplicity.
- rk = A*x - b; % The first residual must be computed the hard way.
- pk = -rk;
-
- for k = [ 0 : max_iterations ]
- if (norm(rk) < tolerance)
- % Success.
- return;
- end
-
- alpha_k = step_length_cgm(rk, A, pk);
- x_next = x + alpha_k*pk;
- r_next = rk + alpha_k*A*pk;
- beta_next = (r_next' * r_next)/(rk' * rk);
- p_next = -r_next + beta_next*pk;
-
- k = k + 1;
- x = x_next;
- rk = r_next;
- pk = p_next;
- end
+  % The standard CGM is equivalent to the preconditioned CGM if you
+  % use the identity matrix as your preconditioner.
+ [x, k] = preconditioned_conjugate_gradient_method(A,
+ M,
+ b,
+ x0,
+ tolerance,
+ max_iterations);
end
--- /dev/null
+function [x, k] = simple_preconditioned_cgm(Q,
+ M,
+ b,
+ x0,
+ tolerance,
+ max_iterations)
+ %
+ % Solve,
+ %
+ % Qx = b
+ %
+ % or equivalently,
+ %
+  %   min [phi(x) = (1/2)*<Qx,x> - <b,x>]
+ %
+ % using the preconditioned conjugate gradient method (14.54 in
+ % Guler).
+ %
+ % INPUT:
+ %
+ % - ``Q`` -- The coefficient matrix of the system to solve. Must
+ % be positive definite.
+ %
+ % - ``M`` -- The preconditioning matrix. If the actual matrix used
+ % to precondition ``Q`` is called ``C``, i.e. ``C^(-1) * Q *
+ % C^(-T) == \bar{Q}``, then M=CC^T. Must be symmetric positive-
+ % definite. See for example Golub and Van Loan.
+ %
+ % - ``b`` -- The right-hand-side of the system to solve.
+ %
+ % - ``x0`` -- The starting point for the search.
+ %
+ % - ``tolerance`` -- How close ``Qx`` has to be to ``b`` (in
+ % magnitude) before we stop.
+ %
+ % - ``max_iterations`` -- The maximum number of iterations to
+ % perform.
+ %
+ % OUTPUT:
+ %
+ % - ``x`` - The solution to Qx=b.
+ %
+ % - ``k`` - The ending value of k; that is, the number of
+ % iterations that were performed.
+ %
+ % NOTES:
+ %
+ % All vectors are assumed to be *column* vectors.
+ %
+ % REFERENCES:
+ %
+ % 1. Guler, Osman. Foundations of Optimization. New York, Springer,
+ % 2010.
+ %
+
+  % This isn't great in practice: explicitly factoring M and inverting
+  % the factors defeats the point, since the CGM is usually used on
+  % huge sparse systems.
+ Ct = chol(M);
+ C = Ct';
+ C_inv = inv(C);
+ Ct_inv = inv(Ct);
+
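+  % With the change of variables x_bar := Ct*x, the system Q*x == b is
+  % equivalent to Q_bar*x_bar == b_bar, and Q_bar is again symmetric
+  % positive-definite.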
+ Q_bar = C_inv * Q * Ct_inv;
+ b_bar = C_inv * b;
+
+ % But it sure is easy.
+ [x_bar, k] = vanilla_cgm(Q_bar, b_bar, x0, tolerance, max_iterations);
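+  % Note that x0 is passed through unchanged; the CGM converges from
+  % any starting point, so there is no need to map it into the
+  % transformed coordinates first.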
+
+  % The solution x_bar of Q_bar*x_bar == b_bar satisfies x_bar == Ct*x,
+  % so we undo the change of variables to recover x.
+ x = Ct_inv * x_bar;
+end
--- /dev/null
+function [x, k] = vanilla_cgm(A, b, x0, tolerance, max_iterations)
+ %
+ % Solve,
+ %
+ % Ax = b
+ %
+ % or equivalently,
+ %
+  %   min [phi(x) = (1/2)*<Ax,x> - <b,x>]
+ %
+  % using the conjugate gradient method (Algorithm 5.2 in Nocedal and
+ % Wright).
+ %
+ % INPUT:
+ %
+ % - ``A`` -- The coefficient matrix of the system to solve. Must
+ % be positive definite.
+ %
+ % - ``b`` -- The right-hand-side of the system to solve.
+ %
+ % - ``x0`` -- The starting point for the search.
+ %
+ % - ``tolerance`` -- How close ``Ax`` has to be to ``b`` (in
+ % magnitude) before we stop.
+ %
+ % - ``max_iterations`` -- The maximum number of iterations to perform.
+ %
+ % OUTPUT:
+ %
+ % - ``x`` - The solution to Ax=b.
+ %
+ % - ``k`` - The ending value of k; that is, the number of iterations that
+ % were performed.
+ %
+ % NOTES:
+ %
+ % All vectors are assumed to be *column* vectors.
+ %
+ zero_vector = zeros(length(x0), 1);
+
+ k = 0;
+ x = x0; % Eschew the 'k' suffix on 'x' for simplicity.
+ rk = A*x - b; % The first residual must be computed the hard way.
+ pk = -rk;
+
+ for k = [ 0 : max_iterations ]
+ if (norm(rk) < tolerance)
+ % Success.
+ return;
+ end
+
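+      % Exact line search along pk; in Algorithm 5.2 the step length is
+      % alpha_k = (rk'*rk)/(pk'*A*pk).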
+ alpha_k = step_length_cgm(rk, A, pk);
+ x_next = x + alpha_k*pk;
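+      % Since x_next = x + alpha_k*pk, the new residual can be updated
+      % incrementally: r_next = A*x_next - b = rk + alpha_k*A*pk.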
+ r_next = rk + alpha_k*A*pk;
+ beta_next = (r_next' * r_next)/(rk' * rk);
+ p_next = -r_next + beta_next*pk;
+
+ k = k + 1;
+ x = x_next;
+ rk = r_next;
+ pk = p_next;
+ end
+end