Curve Fitting Sample
The Extreme Optimization Numerica Libraries for .NET makes it very
easy to fit data to arbitrary curves.
The Mathematics of Curve Fitting
Linear Least Squares
How to use the program
The sample code
Representation of Data Points
Computing the Least Squares Fit
The mathematics of Curve Fitting
Linear least squares
Curve fitting is the process of finding the curve that best approximates a set
of points from within a set of curves. The least squares method does this by
minimizing the sum of the squares of the differences between the actual and
predicted values. The linear least squares method, which is used here,
restricts the set of curves to linear combinations of a set of basis functions.
Linear least squares problems can be solved using standard methods of linear
algebra. In mathematical terms, the linear least squares problem corresponds to
minimizing the length of a vector
Ax - b
where A is a matrix containing the values of the basis functions at the
x-coordinates of the data points. b is a vector containing the
y-values of the data points. x is a vector containing the unknown
coefficients of the basis function in the best fit linear combination.
This problem can be solved in a variety of ways. The simplest is by solving the
ATAx = ATb
This is a system of simultaneous linear equations. It can be shown that the
solution to this equation is also the solution to the least squares problem.
Any set of functions can be used as basis functions. In practice, however,
certain common functions are used. Often, the basis functions are chosen to
reflect properties of the relationship that is being investigated. The unknown
parameters of the model describing that relationship correspond to the
coefficients of the least squares fit.
The simplest of these is a constant function. The resulting 'fit' is simply the
mean of the y-values of the data points.
To fit a straight line through a set of points, the basis functions are a
constant function and the function f(x) = x. The result is a linear
model of the form y = ax + b.
To fit a polynomial, the basis functions are the 'monomials' 1, x, x2,
x3, and so on, up to a certain degree. Polynomials are
often used because they have such a simple form.
Instead of using monomials, Chebyshev polynomials can also be used as basis
functions for polynomial fitting. Chebyshev polynomials are a special kind of
polynomial with some very desirable properties.
They are mutually orthogonal, which means that calculations should be more
accurate as round-off error is reduced.
They also oscillate very evenly, which usually results in decreasing
coefficients as the degree of the polynomial increases. With ordinary
polynomial fits, the coefficients often show wild oscillations, further
decreasing their accuracy.
Mathematically, curve fitting with ordinary polynomials and with Chebyshev
polynomials produce exactly the same result. In practice, however, the
Chebyshev method is clearly superior.
How To Use The Program
On startup, the program window shows a blank graph on the left and a tabbed
input/output panel on the right.
Clicking anywhere within the graph area selects a new data point, marked by a
black dot. You can select up to 20 data points. The Reset button clears all
The input panel lets you select which type of curve you want to fit to the data
points. The options are:
||A horizontal line.
||A line with arbitrary slope.
||A quadratic curve of the form y = ax2 + bx + c
||A polynomial of the specified degree.
||A combination of Chebyshev polynomials up to the specified degree.
||A set of functions.
When Polynomial or ChebyshevSeries is chosen, you must specify the degree of the
approximation. The default is 4. Note that both these options yield essentially
the same curve, but in a different representation.
When Custom is selected, a list of combo boxes appears that lets you choose up
to five functions from a list:
The functions available are: constant, identity (x), sin x,
cos x, exp x (ex), erfc x (the
complimentary error function), and J0(x) (the
ordinary Bessel function of the first kind of order 0).
Pressing the 'Fit' button calculates the fit. The best fit curve is drawn on the
graph area as a purple line. The coefficients are available from the Output
We pointed out earlier that polynomials and Chebyshev expansions produce the
same mathematical curve but Chebyshev expansions are more desirable
numerically. To illustrate this point, the table below lists the coefficients
of the 8th degree polynomial and Chebyshev expansions for a set of data points:
As you can see, the polynomial coefficients take on large values, up to 100
times the actual size of the result. Inevitably, two or more digits of
precision will be lost in the calculation. By contrast, the coefficients of the
Chebyshev series are all of the same order. In addition, it turns out that it
is impossible to calculate higher degree polynomials due to round-off error.
Trying to fit the same data set to a 12th degree polynomial results in the
following error message:
The Chebyshev expansion poses no such problem up to and including the maximum
degree of 19.
The Sample Code
A complete code walkthrough would be beyond the purpose of this document. We
will cover the numerical aspects only.
Representation of data points
The data points are kept in two
y When the user clicks in the chart area, the screen coordinates are
transformed to values in the range [0, 5]. The
keeps track of the total number of data points.
In computer science, the term function usually refers to a subroutine
that may take some arguments and returns a value. The concept was derived from
the mathematical function, which has the same meaning at its core. The
major difference between the two concepts is that in computer science, what
matters is to produce the result in the fastest and most economical way
possible, whereas in mathematics, the function's properties and relationships
to other functions are more important.
Centuries of study of the properties of mathematical functions have yielded many
derived and related properties that have no parallel in computer science.
Derivatives and integrals are perhaps the most common examples. Functions or
methods in the Common Language Runtime aren't suited to represent mathematical
functions together with their properties. Mathematical functions are more like
objects than they are methods of an object or class.
In the Extreme Mathematics Library for .NET, the
Curve class and its descendants represent mathematical
functions. The name Curve was chosen over the more obvious Function
because the latter is a reserved word in Visual Basic .NET and other languages.
Curve classes come with methods to
evaluate the function, its
integral, and to
find the roots or zeroes of a curve.
Curves also have parameters that determine the exact shape of the curve. For
instance, a line
has both a slope and a y intercept (the point where it crosses the y-axis). All
operations on lines can be expressed in terms of these two parameters, and any
set of parameters yields a different line.
The algorithmic aspect of evaluating mathematical functions is encapsulated in a
series of delegates with various signatures. For example, a
RealFunction delegate essentially contains a reference to a
method that takes one real (Double) argument and returns a real number. When
you only need to evaluate a mathematical function, and don't need the rich
functionality that comes with the Curve class, use a
Computing the Least Squares fit
Curve fitting is the process of finding the curve that best approximates a set
of points from within a set of curves which are linear combinations of a set of
basis functions. The
FunctionBasis class represents such a set of basis
Computing the least squares fit is done in two steps:
of the appropriate type is created to respresent the set of basis functions.
LeastSquaresFit method is called on the
object. This creates a
LinearCombination object, which represents a curve that is a
linear combination of the basis functions.
Specialized classes that inherit from
FunctionBasis exist for polynomials, Chebyshev expansions,
and collections of arbitrary functions.
class represents a set of monomials, which form a basis for polynomials up to a
specified degree. Constants, lines, and quadratic curves are special cases of
polynomials of degree 0, 1, and 2, respectively. The global variable
is set to this value in the
SelectedIndexChanged event handler of
cboCurveType combo box. For general polynomials, the degree is
fetched from the textbox and validated. This value is then passed on to the
constructor for the
PolynomialBasis class as follows:
_basis = new PolynomialBasis(_degree);
class represents a set of Chebyshev polynomials up to a specified degree.
Chebyshev polynomials only have their excellent properties over the interval
[-1, +1]. The domain of our fit runs from 0 to 5. Fortunately, the
ChebyshevBasis class can rescale the Chebyshev polynomials
so that they retain these properties over any chosen interval. We must pass the
interval to the constructor. The degree is once again fetched from the textbox
_basis = new ChebyshevBasis(0, 5, _degree);
For custom basis functions, we first create an array of
RealFunction delegates containing the basis functions. The
elements of the
_customFunctions array are set by the
handler of the combo boxes. This handler also keeps track of the
variable. To get the array of basis functions, we simply iterate through the
array and fill our basis with the elements that are not null. We then pass this
array on to the
RealFunction basisFunctions =
Int32 current = 0;
for(Int32 index = 0; index < 5; index++)
if (_customFunctions[index] != null)
basisFunctions[current++] = _customFunctions[index];
_basis = new GeneralFunctionBasis(basisFunctions);
The actual curve fitting is performed by the
LeastSquaresFit method of the
FunctionBasis object. The coefficients are retrieved through
method of the resulting
_fit = _basis.LeastSquaresFit(vx, vy);
listParameters.DataSource = _fit.GetParameters();
Copyright © 2003-2015, Extreme Optimization. All rights reserved.
Extreme Optimization, Complexity made simple, M#, and M Sharp are trademarks of ExoAnalytics Inc.
Microsoft, Visual C#, Visual Basic, Visual Studio, Visual Studio.NET, and the Optimized for Visual Studio logo
are registered trademarks of Microsoft Corporation.