Many optimization problems take the form of minimizing
a sum of squares of a set of functions. Specialized algorithms
have been developed to take advantage of the structure
of such problems.
The Levenberg-Marquardt algorithm
The classic algorithm for solving nonlinear least squares problems was first developed
by Kenneth Levenberg, later refined by Donald Marquardt, and put into its current form
by Jorge Moré.
It is a trust region method.
A quadratic approximation to the objective function is used
in a region around the current point called the trust region.
If the approximation is good, then the trust region is expanded;
otherwise it is contracted.
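In outline, a trust region iteration for a least squares problem solves a subproblem of the
standard textbook form (this is the general formulation, not a description of this particular
implementation):

$$\min_{p}\; \tfrac{1}{2}\,\bigl\| f(x_k) + J(x_k)\,p \bigr\|^2 \quad \text{subject to} \quad \lVert p \rVert \le \Delta_k,$$

where f is the vector of component functions, J is its Jacobian, and Δ_k is the current trust region radius.
The ratio of the actual reduction in the objective to the reduction predicted by this model
determines whether the step is accepted and whether the radius is expanded or contracted.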
Trust Region Reflective Method
The trust region reflective algorithm is an enhancement of the
Levenberg-Marquardt algorithm. The basic idea is to adjust the shape
of the trust region so that it fits within the bounds on the variables.
Furthermore, when the proposed step is out of bounds,
a search direction reflected from the bound is considered.
Because of these enhancements, this algorithm tends to work
somewhat better than the Levenberg-Marquardt algorithm
for constrained problems.
Defining the objective function
The function for which an extremum is to be found is called the
objective function. In this case, we are minimizing
the sum of the squares of a set of functions.
Or, equivalently, we are minimizing the norm of a vector function.
This vector function is the objective function for least squares optimizers.
The function is accessed through the
ObjectiveFunction
property, which is a delegate of type Func<T1, T2, TResult>.
The first argument is a Vector<T>
that contains the unknowns. The second argument is a vector that is to hold the result.
This is also the return value, unless the second argument is null, in which case
a new vector should be created and returned.
The following code demonstrates how to write methods to implement objective functions.
The Rosenbrock function,
f(x₁, x₂) = (1 - x₁)² + 105 (x₂ - x₁²)²,
is a famous test function for optimization. It is the sum
of the squares of two functions, and so can be minimized using
a least squares optimizer. The objective function would be written as:
Vector<double> Rosenbrock(Vector<double> x, Vector<double> f)
{
    if (f == null)
        f = Vector.Create<double>(2);
    f[0] = 1 - x[0];
    f[1] = Math.Sqrt(105) * (x[1] - x[0] * x[0]);
    return f;
}
Function Rosenbrock(x As Vector(Of Double), f As Vector(Of Double)) As Vector(Of Double)
    If f Is Nothing Then f = Vector.Create(Of Double)(2)
    f(0) = 1 - x(0)
    f(1) = Math.Sqrt(105) * (x(1) - x(0) * x(0))
    Return f
End Function
let Rosenbrock (x : Vector<float>) (f : Vector<float>) =
    let f = if (f = null) then Vector.Create<float>(2) :> Vector<float> else f
    f.[0] <- 1.0 - x.[0]
    f.[1] <- sqrt(105.0) * (x.[1] - x.[0] * x.[0])
    f
The least squares optimizers also require the Jacobian of the objective function.
This is a matrix whose rows contain the gradient (vector of partial derivatives)
of each corresponding component of the objective function.
The JacobianFunction
property is a Func<T1, T2, TResult> delegate,
with a definition similar to the objective function. The first argument is
a Vector<T>.
The second argument is a Matrix<T>
that is to hold the result. The return value is also a reference to this matrix.
The following code demonstrates how to write a method that calculates
the Jacobian of an objective function. Once again,
we use the Rosenbrock function as our example:
Matrix<double> RosenbrockJacobian(Vector<double> x, Matrix<double> J)
{
    if (J == null) J = Matrix.Create<double>(2, 2);
    J[0, 0] = -1;
    J[0, 1] = 0;
    J[1, 0] = -2 * Math.Sqrt(105) * x[0];
    J[1, 1] = Math.Sqrt(105);
    return J;
}
Function RosenbrockJacobian(x As Vector(Of Double), J As Matrix(Of Double)) As Matrix(Of Double)
    If (J Is Nothing) Then J = Matrix.Create(Of Double)(2, 2)
    J(0, 0) = -1
    J(0, 1) = 0
    J(1, 0) = -2 * Math.Sqrt(105) * x(0)
    J(1, 1) = Math.Sqrt(105)
    Return J
End Function
let RosenbrockJacobian (x : Vector<float>) (J : Matrix<float>) =
    let J = if (J = null) then Matrix.Create<float>(2, 2) :> Matrix<float> else J
    J.[0, 0] <- -1.0
    J.[0, 1] <- 0.0
    J.[1, 0] <- -2.0 * Math.Sqrt(105.0) * x.[0]
    J.[1, 1] <- Math.Sqrt(105.0)
    J
If the Jacobian is not available, or if it is very expensive to calculate,
it may be preferable to either approximate it numerically, or use a method
that does not require the gradient function.
When using .NET 4.0,
it is possible to have the Jacobian computed using Automatic Differentiation.
To do so, specify the objective function by calling the
SetSymbolicObjectiveFunction(IList<Expression<Func<Vector<Double>, Double>>>)
method. The argument can be a list of lambda expressions that map
a Vector<T>
to its value, or a parameter array of such expressions.
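For example, the Rosenbrock residuals could be specified symbolically along the following
lines (a sketch only; it assumes the parameter array form described above, and the variable
name lmAd is arbitrary):
var lmAd = new LevenbergMarquardtOptimizer();
// One lambda expression per component of the residual vector.
// The Jacobian is then obtained through automatic differentiation,
// so no JacobianFunction needs to be supplied.
lmAd.SetSymbolicObjectiveFunction(
    x => 1 - x[0],
    x => Math.Sqrt(105) * (x[1] - x[0] * x[0]));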
The next example puts this all together. It creates a Levenberg-Marquardt
optimizer and sets it up to minimize the Rosenbrock
function:
LeastSquaresOptimizer lm = new LevenbergMarquardtOptimizer();
lm.ObjectiveFunction = Rosenbrock;
lm.JacobianFunction = RosenbrockJacobian;
Dim lm = New LevenbergMarquardtOptimizer()
lm.ObjectiveFunction = AddressOf Rosenbrock
lm.JacobianFunction = AddressOf RosenbrockJacobian
let lm = LevenbergMarquardtOptimizer()
lm.ObjectiveFunction <- Func<_,_,_>(Rosenbrock)
lm.JacobianFunction <- Func<_,_,_>(RosenbrockJacobian)
Once the algorithm is set up, the minimum can be calculated.
There are two more issues that must be dealt with.
Defining the initial guess
In all cases, a starting point for the algorithm is required.
The InitialGuess
property must be set to a suitable value.
Once this is done, the
FindMinimum
method can be called to perform the minimization:
lm.InitialGuess = Vector.Create(-1.2, 1.0);
lm.FindMinimum();
lm.InitialGuess = Vector.Create(-1.2, 1.0)
lm.FindMinimum()
lm.InitialGuess <- Vector.Create(-1.2, 1.0)
let x = lm.FindMinimum()
Defining bounds on the variables
The LowerBounds and
UpperBounds
properties let you set bounds on the variables.
These are vectors that contain the bounds for each variable.
If a variable is unbounded, then a value of
NegativeInfinity
should be used for the lower bound, and
PositiveInfinity
for the upper bound.
The example below sets up a Trust Region Reflective optimizer for the Rosenbrock
function, and sets bounds on the variables:
LeastSquaresOptimizer trr = new TrustRegionReflectiveOptimizer();
trr.ObjectiveFunction = Rosenbrock;
trr.JacobianFunction = RosenbrockJacobian;
trr.InitialGuess = Vector.Create(-1.2, 1.0);
trr.LowerBounds = Vector.Create(-2.0, -2.0);
trr.UpperBounds = Vector.Create(0.5, 2.0);
trr.FindMinimum();
Dim trr = New TrustRegionReflectiveOptimizer()
trr.ObjectiveFunction = AddressOf Rosenbrock
trr.JacobianFunction = AddressOf RosenbrockJacobian
trr.InitialGuess = Vector.Create(-1.2, 1.0)
trr.LowerBounds = Vector.Create(-2.0, -2.0)
trr.UpperBounds = Vector.Create(0.5, 2.0)
trr.FindMinimum()
let trr = TrustRegionReflectiveOptimizer()
trr.ObjectiveFunction <- Func<_,_,_>(Rosenbrock)
trr.JacobianFunction <- Func<_,_,_>(RosenbrockJacobian)
trr.InitialGuess <- Vector.Create(-1.2, 1.0)
trr.LowerBounds <- Vector.Create(-2.0, -2.0)
trr.UpperBounds <- Vector.Create(0.5, 2.0)
let x = trr.FindMinimum()
Optimization algorithms in multiple dimensions commonly have two or three criteria
that may each signal that an extremum has been found.
The ValueTest property
returns a SimpleConvergenceTest object that represents
the convergence test based on the value of the objective function. The test is successful if the change in the value
of the objective function is less than the tolerance. The test returns a value of Divergent if the value
of the objective function is infinite, and BadFunction when the value is NaN.
There is some danger in using this test, since some algorithms may not return a new best estimate for the extremum
on each iteration. For this reason, the test is not active by default. To make it active, its Enabled property must be set to true.
The SolutionTest
property returns a VectorConvergenceTest<T> object that
represents the convergence test based on the estimated extremum. The test is successful if the change in the
approximate extremum is less than the tolerance. The test returns a value of Divergent if one of the
elements of the solution is infinite, and BadFunction when one of the values is NaN.
The VectorConvergenceTest<T> class has many options to specify
the details of the convergence test. By default, the error is the maximum over all elements
of the change in the approximate extremum since the previous iteration, relative to
the corresponding element of the extremum.
The GradientTest property returns a
VectorConvergenceTest<T> based on the gradient of the
objective function. The test succeeds if the norm of the gradient vector is less than the tolerance. This test is
inactive for algorithms that do not use gradients.
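For example, the tests on the lm optimizer created earlier could be tightened as follows
(a sketch; it assumes each test exposes a Tolerance property, as the descriptions above imply):
// Enable the value test, which is off by default, and set tolerances.
lm.ValueTest.Enabled = true;
lm.ValueTest.Tolerance = 1e-10;
lm.SolutionTest.Tolerance = 1e-8;
lm.GradientTest.Tolerance = 1e-8;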
After the FindMinimum method returns, a number
of properties give more details about the algorithm. The
Status property returns an AlgorithmStatus
value that indicates how the algorithm terminated.
The possible values are listed below:
AlgorithmStatus values
Value | Description
---|---
NoResult | The algorithm has not been executed.
Busy | The algorithm is still executing.
Converged | The algorithm ended normally. The desired accuracy has been achieved.
IterationLimitExceeded | The number of iterations needed to achieve the desired accuracy is greater than MaxIterations.
EvaluationLimitExceeded | The number of function evaluations needed to achieve the desired accuracy is greater than MaxEvaluations.
RoundOffError | Round-off error prevented the algorithm from achieving the desired accuracy.
BadFunction | Bad behavior of the objective function prevented the algorithm from achieving the desired accuracy.
Divergent | The algorithm diverged. The objective function appears to be unbounded.
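For example, the outcome of the earlier Levenberg-Marquardt run can be inspected as follows:
if (lm.Status == AlgorithmStatus.Converged)
    Console.WriteLine("Converged to the requested accuracy.");
else
    Console.WriteLine("Algorithm terminated early: {0}", lm.Status);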
Each of the convergence test objects discussed earlier exposes an
Error
property that returns the estimated error used in the test.
The IterationsNeeded
and EvaluationsNeeded
properties return the number of iterations and the number of evaluations
of the objective function, respectively. The
JacobianEvaluationsNeeded
property counts the number of times the full Jacobian was evaluated.
The final example prints a summary of the results of running the
Levenberg-Marquardt algorithm on the Rosenbrock function:
Console.WriteLine("Levenberg-Marquardt Method:");
Console.WriteLine(" Solution: {0}", lm.Minimum);
Console.WriteLine(" Estimated error: {0}", lm.SolutionTest.Error);
Console.WriteLine(" Gradient residual: {0}", lm.GradientTest.Error);
Console.WriteLine(" # iterations: {0}", lm.IterationsNeeded);
Console.WriteLine(" # function evaluations: {0}",
lm.EvaluationsNeeded);
Console.WriteLine(" # Jacobian evaluations: {0}",
lm.JacobianEvaluationsNeeded);
Console.WriteLine("Levenberg-Marquardt Method:")
Console.WriteLine(" Solution: {0}", lm.Minimum)
Console.WriteLine(" Estimated error: {0}", lm.SolutionTest.Error)
Console.WriteLine(" Gradient residual: {0}", lm.GradientTest.Error)
Console.WriteLine(" # iterations: {0}", lm.IterationsNeeded)
Console.WriteLine(" # function evaluations: {0}",
lm.EvaluationsNeeded)
Console.WriteLine(" # Jacobian evaluations: {0}",
lm.JacobianEvaluationsNeeded)
printfn "Levenberg-Marquardt Method:"
printfn " Solution: %O" lm.Minimum
printfn " Estimated error: %f" lm.SolutionTest.Error
printfn " Gradient residual: %f" lm.GradientTest.Error
printfn " # iterations: %d" lm.IterationsNeeded
printfn " # function evaluations: %d" lm.EvaluationsNeeded
printfn " # Jacobian evaluations: %d" lm.JacobianEvaluationsNeeded
Branch, M.A., T.F. Coleman, and Y. Li,
A Subspace, Interior, and Conjugate Gradient Method
for Large-Scale Bound-Constrained Minimization Problems,
SIAM Journal on Scientific Computing, Vol. 21, No. 1, pp. 1-23, 1999.
Levenberg, K. A method for the solution
of certain nonlinear problems in least squares.
Quart. Appl. Math. 2, 164-168, 1944.
Marquardt, D. W. An algorithm for least squares
estimation of nonlinear parameters,
SIAM J. Appl. Math. 11, 431-441, 1963.
Moré, J. J., The Levenberg-Marquardt Algorithm:
Implementation and Theory, in Numerical Analysis, Lecture Notes
in Mathematics 630, pp. 105-116, 1978.