Sample Applications
Curve Fitting Sample
The Extreme Optimization Numerica Libraries for .NET makes it very
easy to fit data to arbitrary curves.
The Mathematics of Curve Fitting
Linear Least Squares
Basis functions
How to use the program
The sample code
Representation of Data Points
Encapsulating Functions
Computing the Least Squares Fit
The mathematics of Curve Fitting
Linear least squares
Curve fitting is the process of finding the curve that best approximates a set
of points from within a set of curves. The least squares method does this by
minimizing the sum of the squares of the differences between the actual and
predicted values. The linear least squares method, which is used here,
restricts the set of curves to linear combinations of a set of basis functions.
Linear least squares problems can be solved using standard methods of linear
algebra. In mathematical terms, the linear least squares problem corresponds to
minimizing the length of a vector
Ax - b
where A is a matrix containing the values of the basis functions at the
x-coordinates of the data points. b is a vector containing the
y-values of the data points. x is a vector containing the unknown
coefficients of the basis function in the best fit linear combination.
This problem can be solved in a variety of ways. The simplest is by solving the
normal equations:
ATAx = ATb
This is a system of simultaneous linear equations. It can be shown that the
solution to this equation is also the solution to the least squares problem.
Basis functions
Any set of functions can be used as basis functions. In practice, however,
certain common functions are used. Often, the basis functions are chosen to
reflect properties of the relationship that is being investigated. The unknown
parameters of the model describing that relationship correspond to the
coefficients of the least squares fit.
The simplest of these is a constant function. The resulting 'fit' is simply the
mean of the y-values of the data points.
To fit a straight line through a set of points, the basis functions are a
constant function and the function f(x) = x. The result is a linear
model of the form y = ax + b.
To fit a polynomial, the basis functions are the 'monomials' 1, x, x2,
x3, and so on, up to a certain degree. Polynomials are
often used because they have such a simple form.
Instead of using monomials, Chebyshev polynomials can also be used as basis
functions for polynomial fitting. Chebyshev polynomials are a special kind of
polynomial with some very desirable properties.
-
They are mutually orthogonal, which means that calculations should be more
accurate as round-off error is reduced.
-
They also oscillate very evenly, which usually results in decreasing
coefficients as the degree of the polynomial increases. With ordinary
polynomial fits, the coefficients often show wild oscillations, further
decreasing their accuracy.
Mathematically, curve fitting with ordinary polynomials and with Chebyshev
polynomials produce exactly the same result. In practice, however, the
Chebyshev method is clearly superior.
How To Use The Program
On startup, the program window shows a blank graph on the left and a tabbed
input/output panel on the right.

Clicking anywhere within the graph area selects a new data point, marked by a
black dot. You can select up to 20 data points. The Reset button clears all
data points.
The input panel lets you select which type of curve you want to fit to the data
points. The options are:
Value |
Description |
Constant |
A horizontal line. |
Line |
A line with arbitrary slope. |
Quadratic |
A quadratic curve of the form y = ax2 + bx + c |
Polynomial |
A polynomial of the specified degree. |
ChebyshevSeries
|
A combination of Chebyshev polynomials up to the specified degree. |
Custom |
A set of functions. |
When Polynomial or ChebyshevSeries is chosen, you must specify the degree of the
approximation. The default is 4. Note that both these options yield essentially
the same curve, but in a different representation.
When Custom is selected, a list of combo boxes appears that lets you choose up
to five functions from a list:

The functions available are: constant, identity (x), sin x,
cos x, exp x (ex), erfc x (the
complimentary error function), and J0(x) (the
ordinary Bessel function of the first kind of order 0).
Pressing the 'Fit' button calculates the fit. The best fit curve is drawn on the
graph area as a purple line. The coefficients are available from the Output
tab.
We pointed out earlier that polynomials and Chebyshev expansions produce the
same mathematical curve but Chebyshev expansions are more desirable
numerically. To illustrate this point, the table below lists the coefficients
of the 8th degree polynomial and Chebyshev expansions for a set of data points:
Degree
|
Polynomial coefficient
|
Chebyshev coefficient
|
0 |
+9.77626290245595 |
+3.71069905169146 |
1 |
+95.7239111394749 |
+1.92780480809127 |
2 |
-221.123267897382 |
-1.36134951909525 |
3 |
+233.06753396019 |
+1.9297471097274 |
4 |
-133.066620592025 |
-1.1546134638987 |
5 |
+44.1694446535232 |
+1.75613756606959 |
6 |
-8.5637324626004 |
-2.04416637823348 |
7 |
+0.902201184865039 |
+0.982129798914638 |
8 |
-0.0399611216231435 |
-0.476413261410861 |
As you can see, the polynomial coefficients take on large values, up to 100
times the actual size of the result. Inevitably, two or more digits of
precision will be lost in the calculation. By contrast, the coefficients of the
Chebyshev series are all of the same order. In addition, it turns out that it
is impossible to calculate higher degree polynomials due to round-off error.
Trying to fit the same data set to a 12th degree polynomial results in the
following error message:

The Chebyshev expansion poses no such problem up to and including the maximum
degree of 19.
The Sample Code
A complete code walkthrough would be beyond the purpose of this document. We
will cover the numerical aspects only.
Representation of data points
The data points are kept in two Double
arrays, x
and
y
When the user clicks in the chart area, the screen coordinates are
transformed to values in the range [0, 5]. The _numberOfDataPoints
keeps track of the total number of data points.
Encapsulating Functions
In computer science, the term function usually refers to a subroutine
that may take some arguments and returns a value. The concept was derived from
the mathematical function, which has the same meaning at its core. The
major difference between the two concepts is that in computer science, what
matters is to produce the result in the fastest and most economical way
possible, whereas in mathematics, the function's properties and relationships
to other functions are more important.
Centuries of study of the properties of mathematical functions have yielded many
derived and related properties that have no parallel in computer science.
Derivatives and integrals are perhaps the most common examples. Functions or
methods in the Common Language Runtime aren't suited to represent mathematical
functions together with their properties. Mathematical functions are more like
objects than they are methods of an object or class.
In the Extreme Mathematics Library for .NET, the
Curve
class and its descendants represent mathematical
functions. The name Curve was chosen over the more obvious Function
because the latter is a reserved word in Visual Basic .NET and other languages.
Curve classes come with methods to
evaluate the function, its
derivative and
integral, and to
find the roots or zeroes of a curve.
Curves also have parameters that determine the exact shape of the curve. For
instance, a line
has both a slope and a y intercept (the point where it crosses the y-axis). All
operations on lines can be expressed in terms of these two parameters, and any
set of parameters yields a different line.
The algorithmic aspect of evaluating mathematical functions is encapsulated in a
series of delegates with various signatures. For example, a
RealFunction
delegate essentially contains a reference to a
method that takes one real (Double) argument and returns a real number. When
you only need to evaluate a mathematical function, and don't need the rich
functionality that comes with the Curve class, use a
RealFunction
delegate.
Computing the Least Squares fit
Curve fitting is the process of finding the curve that best approximates a set
of points from within a set of curves which are linear combinations of a set of
basis functions. The
FunctionBasis
class represents such a set of basis
functions.
Computing the least squares fit is done in two steps:
-
A
FunctionBasis
of the appropriate type is created to respresent the set of basis functions.
-
The
LeastSquaresFit
method is called on the FunctionBasis
object. This creates a
LinearCombination
object, which represents a curve that is a
linear combination of the basis functions.
Specialized classes that inherit from
FunctionBasis
exist for polynomials, Chebyshev expansions,
and collections of arbitrary functions.
The PolynomialBasis
class represents a set of monomials, which form a basis for polynomials up to a
specified degree. Constants, lines, and quadratic curves are special cases of
polynomials of degree 0, 1, and 2, respectively. The global variable _degree
is set to this value in the SelectedIndexChanged
event handler of
the cboCurveType
combo box. For general polynomials, the degree is
fetched from the textbox and validated. This value is then passed on to the
constructor for the PolynomialBasis
class as follows:
_basis = new PolynomialBasis(_degree);
The ChebyshevBasis
class represents a set of Chebyshev polynomials up to a specified degree.
Chebyshev polynomials only have their excellent properties over the interval
[-1, +1]. The domain of our fit runs from 0 to 5. Fortunately, the
ChebyshevBasis
class can rescale the Chebyshev polynomials
so that they retain these properties over any chosen interval. We must pass the
interval to the constructor. The degree is once again fetched from the textbox
control.
_basis = new ChebyshevBasis(0, 5, _degree);
For custom basis functions, we first create an array of
RealFunction
delegates containing the basis functions. The
elements of the _customFunctions
array are set by the SelectedIndexChanged
handler of the combo boxes. This handler also keeps track of the _numberOfCustomFunctions
variable. To get the array of basis functions, we simply iterate through the _customFunctions
array and fill our basis with the elements that are not null. We then pass this
array on to the
GeneralFunctionBasis
constructor.
RealFunction[] basisFunctions =
new RealFunction[_numberOfCustomFunctions];
Int32 current = 0;
for(Int32 index = 0; index < 5; index++)
{
if (_customFunctions[index] != null)
basisFunctions[current++] = _customFunctions[index];
}
_basis = new GeneralFunctionBasis(basisFunctions);
The actual curve fitting is performed by the
LeastSquaresFit
method of the
FunctionBasis
object. The coefficients are retrieved through
the GetParameters
method of the resulting Curve
.
_fit = _basis.LeastSquaresFit(vx, vy);
listParameters.DataSource = _fit.GetParameters();
Copyright © 2003-2023, Extreme Optimization. All rights reserved.
Extreme Optimization, Complexity made simple, M#, and M Sharp are trademarks of ExoAnalytics Inc.
Microsoft, Visual C#, Visual Basic, Visual Studio, Visual Studio.NET, and the Optimized for Visual Studio logo
are registered trademarks of Microsoft Corporation.