MATLAB Templates

The templates are ready for immediate use or as a point of departure
for generalization, e.g., to problems over multiple variables with
orthogonality constraints, or to code optimizations.
In the simplest mode, the user needs only to supply a function to
minimize in `F.m` (and, optionally, its first and second derivatives
in `dF.m` and `ddF.m`) and an initial guess `Y0`. The calling sequence
is then a single call to `sg_min` (named in honor of Stiefel and
Grassmann):

```matlab
[fopt, yopt] = sg_min(Y0);
```

For example, if the function `F.m` has the form

```matlab
function f = F(Y)
  f = trace( Y' * diag(1:10) * Y * diag(1:3) );
```

we can call `sg_min(rand(10,3))`, which specifies a random
starting point.
We strongly recommend also providing first derivative information:

```matlab
function df = dF(Y)
  df = 2 * diag(1:10) * Y * diag(1:3);
```

The code can do finite differences, but it is very slow and problematic. Second derivative information can also be provided by the user (this is not nearly as crucial for speed as providing first derivative information, but may improve accuracy):

```matlab
function ddf = ddF(Y,H)
  ddf = 2 * diag(1:10) * H * diag(1:3);
```

A sample test call, where `F` is known to have optimal value 10, is

```
>> rand('state',0);  % initialize random number generator
>> fmin = sg_min(rand(10,3))
 iter     grad          F(Y)         flops  step type
    0  1.877773e+01  3.132748e+01     2470  none
invdgrad: Hessian not positive definite, CG terminating early
    1  1.342879e+01  2.011465e+01   163639  Newton step
invdgrad: Hessian not positive definite, CG terminating early
    2  1.130915e+01  1.368044e+01   344198  Newton step
invdgrad: Hessian not positive definite, CG terminating early
    3  5.974819e+00  1.063045e+01   515919  Newton step
invdgrad: max iterations reached inverting the Hessian by CG
    4  1.135417e+00  1.006835e+01   789520  Newton step
    5  5.526359e-02  1.000009e+01  1049594  Newton step
    6  5.072118e-05  1.000000e+01  1337540  Newton step
    7  1.276025e-06  1.000000e+01  1762455  Newton step

fmin =

   10.0000
```
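The arithmetic behind this example can be checked independently of the toolbox. Below is a NumPy transcription (an illustrative sketch, not part of the sg_min package) of this example's `F` and `dF`; it confirms that `dF` matches a finite-difference directional derivative, and that the minimum over coordinate frames is 10, matching the `fmin` reported above.

```python
import numpy as np
from itertools import permutations

A = np.diag(np.arange(1.0, 11.0))   # diag(1:10)
N = np.diag(np.arange(1.0, 4.0))    # diag(1:3)

def F(Y):
    # f = trace(Y' * diag(1:10) * Y * diag(1:3))
    return np.trace(Y.T @ A @ Y @ N)

def dF(Y):
    # unconstrained first derivative: A and N are symmetric,
    # so d/dY trace(Y' A Y N) = 2 A Y N
    return 2 * A @ Y @ N

rng = np.random.default_rng(0)
Y = rng.standard_normal((10, 3))
H = rng.standard_normal((10, 3))

# Central finite difference along H should match <dF(Y), H>.
eps = 1e-6
fd = (F(Y + eps * H) - F(Y - eps * H)) / (2 * eps)
exact = np.sum(dF(Y) * H)
assert abs(fd - exact) <= 1e-5 * max(1.0, abs(exact))

# Among Y whose columns are distinct coordinate vectors, the minimum
# of F is attained by pairing the largest weight in N with the
# smallest diagonal entry of A.
I = np.eye(10)
best = min(F(I[:, list(p)]) for p in permutations(range(10), 3))
assert np.isclose(best, 10.0)
```

Pairing weight 3 with eigenvalue 1, weight 2 with eigenvalue 2, and weight 1 with eigenvalue 3 gives 3 + 4 + 3 = 10, consistent with the converged value in the run above.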

The full calling sequence to `sg_min` is

```matlab
[fopt, yopt] = sg_min(Y0,rc,mode,metric,motion,verbose,...
                      gradtol,ftol,partition)
```

where `Y0` is required and the optional arguments are as follows
(see Table 9.1):

| Argument | Description |
|---|---|
| `rc` | {'real', 'complex'} |
| `mode` | {'newton', 'dog', 'prcg', 'frcg'} |
| `metric` | {'flat', 'euclidean', 'canonical'} |
| `motion` | {'approximate', 'exact'} |
| `verbose` | {'verbose', 'quiet'} |
| `ftol` | First convergence tolerance |
| `gradtol` | Second convergence tolerance |
| `partition` | Partition of `1:p` indicating symmetries of `F` |

- `rc` specifies whether the matrices will have complex entries or
  not. Although most of the code is insensitive to this issue, `rc` is
  vital for counting the dimension of the problem correctly. When
  omitted, `sg_min` guesses based on whether `Y0` has a nonzero
  imaginary part.
- `mode` selects the optimization method. 'newton' selects Newton's
  method with a conjugate gradient-based inversion of the Hessian.
  'dog' selects a dog-leg step algorithm which interpolates between a
  steepest descent step and a Newton's method step. 'frcg' selects a
  Fletcher-Reeves conjugate gradient. Lastly, 'prcg' selects a
  Polak-Ribiere conjugate gradient, which has the advantage that it
  does not require a very accurate Hessian (and thus is the safest of
  these methods if one uses a finite difference approximation to
  implement `ddF.m`). While 'newton' is the default, this is by no
  means our endorsement of it over the other methods, which might be
  more accurate and less expensive for some problems.
- `metric` selects the kind of geometry with which to endow the
  constraint surface. This ultimately affects the definition of the
  covariant Hessian. 'flat' projects the result of applying the
  unconstrained Hessian onto the tangent space, while 'euclidean' and
  'canonical' add on *connection* terms specific to their geometries.
  'euclidean' is the default.
- `motion` selects whether movement along the manifold follows the
  analytic solution of the geodesic equations of motion for the
  selected metric ('exact') or a computationally less expensive
  approximation to that solution ('approximate', the default). (For a
  'flat' metric there is no geodesic equation, so this argument has no
  effect in that case.)
- `verbose` determines whether the function will display a report on
  each iteration while it executes. When this argument is 'verbose'
  (the default), convergence data will be displayed and also recorded
  in the global variable `SGdata`. When this argument is 'quiet', no
  convergence data is displayed or recorded.
- `gradtol` and `ftol`: we declare convergence if both of two
  conditions are true: `grad/gradinit < gradtol` (default `1e-7`) and
  `(f-fold)/f < ftol` (default `1e-10`), where `gradinit` is the
  initial magnitude of the gradient and `fold` is the value of `F` at
  the last iteration.
- `partition` is a cell array whose elements are vectors of indices
  that represent a disjoint partition of `1:p`. If `F` has no
  symmetry, then the partition is `num2cell(1:p)`. If `F(Y*Q) = F(Y)`
  for all orthogonal p-by-p `Q`, then the partition is `{1:p}`. The
  partition is `{1:2,3:5}` if the symmetry of `F` is `F(Y*Q) = F(Y)`
  for orthogonal `Q` with the block-diagonal sparsity structure

  ```
  [ x x 0 0 0
    x x 0 0 0
    0 0 x x x
    0 0 x x x
    0 0 x x x ]
  ```

  The user could equally well specify `{3:5,1:2}` or `{[5 3 4],[2 1]}`
  for the partition; i.e., a partition is a set of sets. The purpose
  of the partition is to project away the null space of the Hessian
  resulting from any block-diagonal orthogonal symmetries of `F`. If
  the argument is omitted, our code will pick a partition by
  determining the symmetries of `F` (using its `partition.m`
  function). However, the user should be aware that a wrong partition
  can destroy convergence.
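To make the symmetry condition concrete, here is a NumPy sketch (illustrative only; the weights `N = diag([2 2 1])` are a hypothetical variant of the earlier example, chosen so that a symmetry actually appears). With two equal weights, F(Y*Q) = F(Y) for any Q = diag(Q1, 1) with Q1 a 2-by-2 orthogonal block, so the appropriate partition in this case would be `{1:2, 3}`:

```python
import numpy as np

A = np.diag(np.arange(1.0, 11.0))
N = np.diag([2.0, 2.0, 1.0])        # two equal weights => block symmetry

def F(Y):
    return np.trace(Y.T @ A @ Y @ N)

rng = np.random.default_rng(1)
Y = rng.standard_normal((10, 3))

# Q = diag(Q1, 1): a 2x2 rotation on the repeated-weight block,
# a 1x1 identity block on the rest.
t = 0.7
Q1 = np.array([[np.cos(t), -np.sin(t)],
               [np.sin(t),  np.cos(t)]])
Q = np.block([[Q1,               np.zeros((2, 1))],
              [np.zeros((1, 2)), np.eye(1)]])

# N commutes with Q, so F is invariant under Y -> Y @ Q.
assert np.isclose(F(Y @ Q), F(Y))

# A generic orthogonal matrix mixing all three columns breaks
# the invariance.
G = np.linalg.qr(rng.standard_normal((3, 3)))[0]
assert abs(F(Y @ G) - F(Y)) > 1e-8
```

The invariant directions `Y @ Q` form the null space of the Hessian that the partition argument tells `sg_min` to project away.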