Extreme Optimization >
User's Guide >
Mathematics Library >
Optimization >
Multidimensional Optimization
Extreme Optimization User's Guide
User's Guide
Up: Optimization Next: Linear Programming Previous: One-Dimensional Optimization Contents
Multi-Dimensional Optimization
Finding the minimum or maximum of a function in many variables is one of the most common problems
in numerical computing. It is not surprising that a wide range of techniques exist, each with its advantages
and disadvantages. The Extreme Optimization Mathematics Library for .NET implements
a large number of these techniques, to offer solutions for the broadest possible range of applications.
Available Algorithms
The Downhill Simplex Method of Nelder and Mead
The Nelder-Mead algorithm, often also called the downhill simplex method, is a simple algorithm that
produces reasonable results when no derivatives are available.
A simplex is a generalization of a triangle (2 dimensional) or tetrahedron (3 dimensional) to arbitrary
dimensions. A simplex in n dimensions has n+1 vertices. This is the smallest number of points
needed to define a region in n-dimensional space.
The algorithm works by letting this simplex 'travel' downhill on the surface defined by the objective function through
reflection. If no lower point can be found, the simplex is contracted towards the lowest point.
The algorithm defines its own versions of the
ValueTest
and SolutionTest
properties. The value test succeeds if the difference of the objective function at the best and at the worst point in the simplex
is less than the tolerance. The solution test succeeds if the distance between the best and the worst point in the simplex
is less than the tolerance. The gradient test does not apply.
Convergence of the Nelder-Mead method is linear at best. It can fail in certain pathological situations.
If derivatives are available, and the derivatives are continuous, then one of the other methods should be used.
Conjugate Gradient Optimizers
Two algorithms are available that search successively in a direction from a well-defined set of directions.
For each direction, a one-dimensional search for an extremum is performed along the direction.
The conjugate gradient method, implemented by the
ConjugateGradientOptimizer
class, uses a series of search directions that are optimal for a quadratic objective function.
The method requires that the gradient of the objective function be available.
There are three common variants of the formula to generate a new search direction, giving rise to the Fletcher-Reeves
and Polak-Ribière methods. The variant is selected by passing the appropriate ConjugateGradientMethod
to the constructor, or setting the Method
property.
It is found that, in general, the Polak-Ribière method tends to converge somewhat faster.
In addition, the positive variant of this method has proven convergence properties.
It is the default method.
Powell's method, implemented by the
PowellOptimizer
class, is a variation of the conjugate gradient method that does not require knowledge of the gradient of the objective
function. Instead, the directions are generated from the geometrical properties of the successive approximations.
A drawback of Powell's method is that it usually requires more function evaluations, as the line searches need to be exact
for the method to converge.
Quasi-Newton Methods
The full Newton method requires the calculation of second derivatives
and the solution of a system of equations in every iteration. This can be very expensive and may not lead to great results,
especially when the current point is far from the actual extremum.
Quasi-Newton methods use a numerical approximation to the inverse of the Hessian matrix that is maintained
through each iteration. Different variations of the method use different ways to keep the approximation up to date.
There are two main variations: the Davidon-Fletcher-Powell method (commonly abbreviated to DFP)
and the Broyden-Fletcher-Goldfard-Shanno method (BFGS).
The method is selected by passing the appropriate
QuasiNewtonMethod
to the constructor, or setting the Method
property. The BFGS method is usually somewhat superior to the DFP method. It is the default.
Defining the objective function
The function for which an extremum is to be found is called the objective function.
It is accessed through the ObjectiveFunction
property, which is a delegate of type MultivariateRealFunction.
This is a method that takes a Vector
argument and returns a real number.
The following code demonstrates how to write methods to implement objective functions. The objective function
implemented here is the Rosenbrock function,
f(x1,x2) = (1 - x1)2 + 105(x2-x12)2
which is a famous test function for optimization.
double Rosenbrock(Vector x)
{
double a = 1 - x[0];
double b = (x[1] - x[0]*x[0]);
return a*a + 105*b*b;
}
Function Rosenbrock(x As Vector) As Double
Dim a As Double = 1 - x(0)
Dim b As Double = (x(1) - x(0)*x(0))
Return a*a + 105*b*b
End Function
|
| C# | VB.NET | |
Some algorithms require the gradient (vector of partial derivatives) of the objective function.
There are two ways to supply this function. The first is through the
GradientFunction
property. This property is of type MultivariateVectorFunction.
It is a method that takes a Vector
argument and also returns a Vector.
The drawback of using this property is that a new Vector instance is created on every call.
The following code demonstrates how to write a method that calculates the gradient of an objective function.
Once again, we use the Rosenbrock function as our example:
Vector RosenbrockGradient(Vector x)
{
Vector f = new GeneralVector(2);
double a = 1 - x[0];
double b = (x[1] - x[0]*x[0]);
f[0] = -2*a - 420*x[0]*b;
f[1] = 210*b;
return f;
}
Function RosenbrockGradient(x As Vector) As Vector
Dim f As Vector = new GeneralVector(1)
Dim a As Double = 1 - x(0)
Dim b As Double = (x(1) - x(0)*x(0))
f(0) = -2*a - 420*x(0)*b
f(1) = 210*b
Return f
End Function
|
| C# | VB.NET | |
Alternatively, the gradient can be specified using the
FastGradientFunction
property. This property is of type FastMultivariateVectorFunction.
It is a method that takes two Vector
arguments. The first is the input vector. The second is the Vector that is to hold the result.
The return value is a reference to this result vector.
The following code demonstrates how to write a method that calculates the gradient of an objective function
as a FastMultivariateVectorFunction.
Vector RosenbrockFastGradient(Vector x, Vector f)
{
if (f == null)
f = new GeneralVector(2);
double a = 1 - x[0];
double b = (x[1] - x[0]*x[0]);
f[0] = -2*a - 420*x[0]*b;
f[1] = 210*b;
return f;
}
Function RosenbrockFastGradient(x As Vector, f As Vector) As Vector
If (f Is Nothing) Then
f = new GeneralVector(2)
End If
Dim a As Double = 1 - x(0)
Dim b As Double = (x(1) - x(0)*x(0))
f(0) = -2*a - 420*x(0)*b
f(1) = 210*b
Return f
End Function
|
| C# | VB.NET | |
If the gradient function is not available, or if it is very expensive to calculate, it may be preferable
to either approximate it numerically, or use a method that does not require the gradient function.
The next example puts this all together. It creates a BFGS optimizer and sets it up to minimize the Rosenbrock function:
QuasiNewtonOptimizer bfgs =
new QuasiNewtonOptimizer(QuasiNewtonMethod.Bfgs);
bfgs.ObjectiveFunction = new MultivariateRealFunction(Rosenbrock);
bfgs.FastGradientFunction =
new FastMultivariateVectorFunction(RosenbrockFastGradient);
Dim bfgs As QuasiNewtonOptimizer = _
New QuasiNewtonOptimizer(QuasiNewtonMethod.Bfgs)
bfgs.ObjectiveFunction = New MultivariateRealFunction(AddressOf Rosenbrock)
bfgs.FastGradientFunction = _
New FastMultivariateVectorFunction(AddressOf RosenbrockFastGradient)
|
| C# | VB.NET | |
Termination Criteria
Optimization algorithms in multiple dimensions commonly have two or three criteria that may each signal
that an extremum has been found.
The ValueTest
property returns a SimpleConvergenceTest
object that represents the convergence test based on the value of the objective function. The test is successful
if the change in the value of the objective function is less than the tolerance. The test returns a value of
Divergent if the value of the objective function is infinite, and BadFunction
when the value is NaN.
There is some danger in using this test, since some algorithms may not return a new best estimate for the extremum
on each iteration. For this reason, the test is not active by default. To make it active, its
Active
property must be set to true.
The SolutionTest
property returns a VectorConvergenceTest
object that represents the convergence test based on the estimated extremum. The test is successful
if the change in the approximate extremum is less than the tolerance. The test returns a value of
Divergent if one of the components of the solution is infinite, and BadFunction
when one of the values is NaN.
The VectorConvergenceTest has many options to specify the details of the convergence test.
By default, the error is calculated as the maximum of the errors in each of the components of the change of the extremum
from the previous iteration compared to the corresponding components of the extremum.
Methods that use gradients have a third test available. The GradientTest
property returns a VectorConvergenceTest
based on the gradient of the objective function.
The test succeeds if the norm of the gradient vector is less than the tolerance.
This test is inactive for algorithms that do not use gradients.
Specific algorithms may supply their own convergence tests, or may modify the exact meaning of the tests
discussed here. For example, the Nelder-Mead method uses slight variations of the value and solution tests
that are more convenient for that particular algorithm.
The example below illustrates some of the options mentioned here. It sets the tolerance of the gradient test
to 100 times the machine precision,
which will succeed if the sum of the absolute values is less than the tolerance. It also explicitly specifies that the value test
is not to be used.
bfgs.ValueTest.Active = false;
bfgs.GradientTest.Norm = VectorConvergenceNorm.SumOfAbsoluteValues;
bfgs.GradientTest.Tolerance = 100 * MachineConstants.Epsilon;
bfgs.ValueTest.Active = False
bfgs.GradientTest.Norm = VectorConvergenceNorm.SumOfAbsoluteValues;
bfgs.GradientTest.Tolerance = 100 * MachineConstants.Epsilon;
|
| C# | VB.NET | |
Line searches
Many algorithms for Multidimensional optimization work by successively solving a one-dimensional optimization
problem along a search direction, and altering the search direction in each iteration.
These algorithms all inherit from
DirectionalOptimizer.
Although general one-dimensional optimization algorithms may be used, they usually don't work well. They are either
wasteful because they are too precise, or the results fail to meet certain criteria that are required to ensure convergence
of the algorithm.
Specialized line search algorithms are therefore in order. The Extreme Optimization Mathematics Library for .NET
provides three of them:
| Class | Description |
|
MoreThuenteLineSearch |
The classic algorithm of More and Thuente. |
|
ParabolicLineSearch |
A line search that uses quadratic interpolation. |
|
UnitLineSearch |
A line "search" that always returns a unit step. |
All optimization algorithms that use line searches automatically select the appropriate line search algorithm,
and set the search parameters to appropriate values. The unit line search is useful for experimental purposes,
or when the objective function is known to be well behaved so that a line search is not necessary.
You can use the
LineSearch property to access the line search object.
Computing the Extremum
You must specify an initial guess for the solution by setting the
InitialGuess property to a suitable
Vector value. Any vector type may be used.
The
FindExtremum method finds the extremum. It returns a
GeneralVector that contains the best approximation to the extremum. This value is also available
through the
Extremum
bfgs.InitialGuess = new GeneralVector(-1.2, 1.0);
bfgs.ExtremumType = ExtremumType.Minimum;
bfgs.FindExtremum();
bfgs.InitialGuess = New GeneralVector(-1.2, 1.0)
bfgs.ExtremumType = ExtremumType.Minimum
bfgs.FindExtremum()
|
| C# | VB.NET | |
Interpreting Results
After the FindExtremum method returns, a number of properties give more details about
the algorithm. The
Status property returns an
AlgorithmStatus
value that indicates how the algorithm terminated. The possible values are listed below:
| Value |
Description |
NoResult |
The algorithm has not been executed. |
Busy |
The algorithm is still executing. |
Converged |
The algorithm ended normally. The desired accuracy has been achieved. |
IterationLimitExceeded |
The number of iterations needed to achieve the desired accuracy is greater than
MaxIterations. |
EvaluationLimitExceeded |
The number of function evaluations needed to achieve the desired accuracy is greater than
MaxEvaluations. |
RoundOffError |
Round-off prevented the algorithm from achieving the desired accuracy. |
BadFunction |
Bad behavior of the target function prevented the algorithm from achieving the
desired accuracy. |
Divergent |
The algorithm diverges. The objective function appears to be unbounded. |
Table 1. AlgorithmStatus values.
Each of the ConvergeTest objects discussed earlier exposes an
Error property that returns the estimated error used in the test.
The
IterationsNeeded and
EvaluationsNeeded properties return the number of iterations and the number of evaluations
of the objective function, respectively. Algorithms that use gradients also have a
GradientEvaluationsNeeded.
The final example prints a summary of the results of running the BFGS algorithm on the Rosenbrock function:
Console.WriteLine("BFGS Method:");
Console.WriteLine(" Solution: {0}", bfgs.Extremum);
Console.WriteLine(" Estimated error: {0}", bfgs.SolutionTest.Error);
Console.WriteLine(" Gradient residual: {0}", bfgs.GradientTest.Error);
Console.WriteLine(" # iterations: {0}", bfgs.IterationsNeeded);
Console.WriteLine(" # function evaluations: {0}",
bfgs.EvaluationsNeeded);
Console.WriteLine(" # gradient evaluations: {0}",
bfgs.GradientEvaluationsNeeded);
Console.WriteLine("BFGS Method:")
Console.WriteLine(" Solution: {0}", bfgs.Extremum)
Console.WriteLine(" Estimated error: {0}", bfgs.SolutionTest.Error)
Console.WriteLine(" Gradient residual: {0}", bfgs.GradientTest.Error)
Console.WriteLine(" # iterations: {0}", bfgs.IterationsNeeded)
Console.WriteLine(" # function evaluations: {0}", _
bfgs.EvaluationsNeeded)
Console.WriteLine(" # gradient evaluations: {0}", _
bfgs.GradientEvaluationsNeeded)
|
| C# | VB.NET | |
References
Fletcher, R. Practical Methods of Optimization, John Wiley & Sons, New York, 1987.
Fletcher, R. and Reeves, C.M., Function Minimization by Conjugate Gradients,
Comp, J. 7, 149-154, 1964.
Moré and D. Thuente, Line Search Algorithms with
Guaranteed Sufficient Decrease, ACM Transactions on Mathematical Software , Vol. 20, No. 3, pp. 286--307, 1994.
Nelder, J. A. and Mead, R. "A Simplex Method for Function Minimization."
Comput. J. 7, 308-313, 1965.
Polak, E., and Ribière, G., Note sur la Convergence
de Methodes de Directions Conjugees, Revue Francaise d'Informatique et de
Recherche Operationnelle, 3, 35-43, 1969.
Powell, M.J.D., An Efficient Method for Finding the Minimum of a Function of
Several Variables Without Calculating Derivatives, Comp. J. 7, 155-162, 1964.
Up: Optimization Next: Linear Programming Previous: One-Dimensional Optimization Contents
Copyright 2004-2008,
Extreme Optimization. All rights reserved.
Extreme Optimization, Complexity made simple, M#, and M
Sharp are trademarks of ExoAnalytics Inc.
Microsoft, Visual C#, Visual Basic, Visual Studio, Visual
Studio.NET, and the Visual Studio Logo are registered trademarks of Microsoft Corporation