Transforming Numerical Variables
In many situations, it is useful to apply some kind of
transformation to a numerical variable. To avoid cluttering the
members of the NumericalVariable
class with these methods, they are made available as methods of the
numerical variable's Transforms
property.
Transformations can be subdivided into the following
categories:
- Arithmetic operations.
- Elementary functions.
- Simple transformations.
- Indicators of change.
- Extrapolated indicators of change.
- Moving averages.
- Other moving summary statistics.
- Period-to-date values and differences.
- Miscellaneous transformations.
Arithmetic operations have been discussed in the previous
section. They are available as overloaded operators or static
(Shared in Visual Basic) operator methods on the
NumericalVariable class. The remaining transformations
are available through the Transforms property. Each
category will now be described in greater detail.
Elementary functions
This category includes transformations that involve applying an
elementary function to each observation of a variable. The table
below lists the methods in this category:
| Member Name | Description |
| Abs | Each observation is the absolute value of the original observation. |
| Exp | Each observation is the exponential of the original observation. |
| Log | Each observation is the natural logarithm of the original observation. |
| Sqrt | Each observation is the square root of the original observation. |
Table 1. Methods for elementary functions.
Simple transformations
This category includes transformations that involve arithmetic
operations and translations.
The GetLag method is overloaded. Without
parameters, it returns a variable whose observations are moved
ahead by one interval. Each new observation is the observation
before the current observation. The first observation is set to
NaN.
The second overload takes one parameter: the lag, or
number of observations to shift the series by. A positive value
indicates that the observations are shifted forward. If the lag is
equal to 1, then each new observation is the observation before the
current observation. If the lag is equal to -1, then each new
observation is the observation after the current observation. Any
observations that do not exist in the original variable are set to
NaN.
The third overload takes two parameters. The first parameter is
once again the lag. The second parameter specifies the value of the
observations that do not exist in the original variable.
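The three overloads can be pictured with a small stand-alone sketch. It works on a plain double array rather than a NumericalVariable, and the class and method names (LagSketch, Lag) are hypothetical; it only mirrors the shifting rule described above and is not the library's implementation.

```csharp
using System;

static class LagSketch
{
    // Shifts the series by `lag` positions.
    // A positive lag moves observations forward: result[i] = x[i - lag].
    // A negative lag moves them backward: result[i] = x[i + |lag|].
    // Positions with no source observation receive `fill` (NaN by default).
    public static double[] Lag(double[] x, int lag = 1, double fill = double.NaN)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            int source = i - lag;
            result[i] = (source >= 0 && source < x.Length) ? x[source] : fill;
        }
        return result;
    }
}
```

For the input { 10, 20, 30, 40 }, a lag of 1 gives { NaN, 10, 20, 30 }, a lag of -1 gives { 20, 30, 40, NaN }, and a lag of 2 with a fill value of 0 gives { 0, 0, 10, 20 }.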
The
GetCumulativeSum method returns a variable whose
observations are the cumulative sum of all observations up to the
current observation. The
GetCumulativeProduct method returns a variable whose
observations are the cumulative product of all observations up to
the current observation.
The following example creates a variable that contains the
observations of the previous period. It then creates a variable
that contains the cumulative sum of the variable.
C#:
    // Observations of the previous period.
    NumericalVariable previous = current.Transforms.GetLag(1);
    // Running total of all observations so far.
    NumericalVariable cumsum = current.Transforms.GetCumulativeSum();

Visual Basic:
    ' Observations of the previous period.
    Dim previous As NumericalVariable = current.Transforms.GetLag(1)
    ' Running total of all observations so far.
    Dim cumsum As NumericalVariable = current.Transforms.GetCumulativeSum()
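To make the arithmetic behind the cumulative sum and cumulative product concrete, here is a minimal sketch on plain double arrays. The names CumulativeSketch, CumulativeSum and CumulativeProduct are hypothetical stand-ins and do not call the library.

```csharp
static class CumulativeSketch
{
    // result[i] = x[0] + x[1] + ... + x[i]
    public static double[] CumulativeSum(double[] x)
    {
        var result = new double[x.Length];
        double total = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            total += x[i];
            result[i] = total;
        }
        return result;
    }

    // result[i] = x[0] * x[1] * ... * x[i]
    public static double[] CumulativeProduct(double[] x)
    {
        var result = new double[x.Length];
        double product = 1.0;
        for (int i = 0; i < x.Length; i++)
        {
            product *= x[i];
            result[i] = product;
        }
        return result;
    }
}
```

For the input { 1, 2, 3, 4 }, the cumulative sum is { 1, 3, 6, 10 } and the cumulative product is { 1, 2, 6, 24 }.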
Indicators of change
This set of transformations compares each current observation to
a past observation. The distance between the current observation
and its reference observation is called the lag. It is
passed to each of the methods as their only parameter. Its value
must be greater than zero.
The
GetChange method returns a variable where each
observation is the difference between the current observation and
the reference observation.
The
GetPercentChange method is similar. Each observation
is the percentage change of the current observation relative to the
reference observation. If the reference observation is zero, or if
the current observation and the reference observation have a
different sign, the new observation is NaN.
The
GetGrowthRate method returns a variable containing the
exponential growth rate. Each observation is the percentage change
of the current observation relative to the reference observation,
assuming the growth compounds continuously over time. If the
reference observation is zero, or if the current observation and
the reference observation have a different sign, the new
observation is NaN.
The first lag observations of the new variable have no reference
observation and are set to NaN.
The example below calculates the different indicators of change
for a 10-period lag:

C#:
    NumericalVariable change = current.Transforms.GetChange(10);
    NumericalVariable pctChange = current.Transforms.GetPercentChange(10);
    NumericalVariable growthRate = current.Transforms.GetGrowthRate(10);

Visual Basic:
    Dim change As NumericalVariable = current.Transforms.GetChange(10)
    Dim pctChange As NumericalVariable = current.Transforms.GetPercentChange(10)
    Dim growthRate As NumericalVariable = current.Transforms.GetGrowthRate(10)
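The following sketch spells out one plausible reading of the three indicators on a plain double array. All names are hypothetical. Two points are assumptions rather than documented behavior: percentages are scaled by 100, and the growth rate is taken as 100 * ln(current / reference) over the whole lag rather than per period. Positions without a reference observation are left as NaN.

```csharp
using System;

static class ChangeSketch
{
    // Applies f(current, reference) with reference = x[i - lag].
    // Positions with no reference observation are NaN.
    static double[] Compare(double[] x, int lag, Func<double, double, double> f)
    {
        if (lag <= 0) throw new ArgumentOutOfRangeException(nameof(lag));
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            result[i] = i < lag ? double.NaN : f(x[i], x[i - lag]);
        return result;
    }

    // Undefined cases named in the text: a zero reference,
    // or current and reference observations of opposite sign.
    static bool Undefined(double current, double reference) =>
        reference == 0.0 || current * reference < 0.0;

    public static double[] Change(double[] x, int lag) =>
        Compare(x, lag, (c, r) => c - r);

    // Assumption: expressed in percent (scaled by 100).
    public static double[] PercentChange(double[] x, int lag) =>
        Compare(x, lag, (c, r) => Undefined(c, r) ? double.NaN : 100.0 * (c - r) / r);

    // Assumption: continuously compounded change over the lag, in percent.
    public static double[] GrowthRate(double[] x, int lag) =>
        Compare(x, lag, (c, r) => Undefined(c, r) ? double.NaN : 100.0 * Math.Log(c / r));
}
```

With the input { 100, 110, 121 } and a lag of 1, the change is { NaN, 10, 11 }, the percent change is 10 at both defined positions, and the growth rate is roughly 9.53 at both.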
Extrapolated indicators of change
This set of transformations is similar to the previous one.
However, the observed change is extrapolated to a larger interval.
Once again, the lag is passed as the first parameter. A second
parameter, numberOfPeriods, indicates the relative
size of the extrapolation interval.
For example, if the current variable represents the price of a
certain commodity at the end of each month, then a value of 12 for
numberOfPeriods produces a variable that represents
the annualized change in price over each month.
The
GetExtrapolatedChange method returns a variable where
each observation is the extrapolated difference between the current
observation and the reference observation.
The
GetExtrapolatedPercentChange method is similar. Each
observation is the extrapolated percentage change of the current
observation relative to the reference observation. If the reference
observation is zero, or if the current observation and the
reference observation have a different sign, the new observation is
NaN.
The
GetExtrapolatedGrowthRate method returns a variable
containing the extrapolated exponential growth rate. Each
observation is the extrapolated percentage change of the current
observation relative to the reference observation, assuming the
growth compounds continuously over time. If the reference
observation is zero, or if the current observation and the
reference observation have a different sign, the new observation is
NaN.
Once again, the first lag observations of the new variable have no
reference observation and are set to NaN.
The example below calculates the different indicators of change
for a 10-period lag and extrapolates them to 360 periods:

C#:
    NumericalVariable change360 = current.Transforms.GetExtrapolatedChange(10, 360);
    NumericalVariable pctChange360 = current.Transforms.GetExtrapolatedPercentChange(10, 360);
    NumericalVariable growthRate360 = current.Transforms.GetExtrapolatedGrowthRate(10, 360);

Visual Basic:
    Dim change360 As NumericalVariable = _
        current.Transforms.GetExtrapolatedChange(10, 360)
    Dim pctChange360 As NumericalVariable = _
        current.Transforms.GetExtrapolatedPercentChange(10, 360)
    Dim growthRate360 As NumericalVariable = _
        current.Transforms.GetExtrapolatedGrowthRate(10, 360)
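The monthly-price example given earlier in this section can be sketched for the simplest case, a lag of 1. The sketch assumes the extrapolation of a plain difference is linear, so a one-month change is multiplied by numberOfPeriods to annualize it. The guide does not state how the lag and numberOfPeriods combine for other lags, so the sketch deliberately fixes the lag at 1. The names are hypothetical.

```csharp
static class ExtrapolationSketch
{
    // One-period change scaled linearly to `numberOfPeriods` periods.
    // Assumption: lag is fixed at 1 and the scaling is a plain multiplication.
    public static double[] ExtrapolatedOnePeriodChange(double[] x, int numberOfPeriods)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            result[i] = i < 1 ? double.NaN : (x[i] - x[i - 1]) * numberOfPeriods;
        return result;
    }
}
```

For month-end prices { 50, 51, 53 } and numberOfPeriods equal to 12, the sketch returns { NaN, 12, 24 }: a rise of 1 in one month corresponds to 12 per year, and a rise of 2 to 24 per year.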
Moving averages
Moving averages are commonly used to smooth data, and to find
trends in time series.
The
GetMovingAverage method returns the simple moving
average. It takes one parameter: the number of observations to
average, n. Each new observation is the average of the n
observations up to and including the current observation.
The
GetExponentialMovingAverage method returns the
exponential moving average. Each new observation is a weighted
combination of the current observation and the previous
average.
The exponential moving average can be specified in two ways. You
can specify the period as an integer. Alternatively, you can
specify the smoothing constant. This is a real number between 0 and
1 that specifies the contribution of the current observation to the
current moving average.
The code below calculates three moving averages: a simple 20-day
moving average, a 20-day exponential moving average, and a 3-day
exponential moving average specified using the
smoothing constant:

C#:
    NumericalVariable MA20 = current.Transforms.GetMovingAverage(20);
    NumericalVariable EMA20 = current.Transforms.GetExponentialMovingAverage(20);
    // Smoothing constant for a 3-day period: 2 / (1 + 3) = 0.5.
    NumericalVariable EMA3 = current.Transforms.GetExponentialMovingAverage(2.0 / (1 + 3));

Visual Basic:
    Dim MA20 As NumericalVariable = current.Transforms.GetMovingAverage(20)
    Dim EMA20 As NumericalVariable = current.Transforms.GetExponentialMovingAverage(20)
    ' Smoothing constant for a 3-day period: 2 / (1 + 3) = 0.5.
    Dim EMA3 As NumericalVariable = current.Transforms.GetExponentialMovingAverage(0.5)
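The recurrences behind both averages are short enough to write out. The sketch below uses plain double arrays and hypothetical names. It takes the relation between period and smoothing constant from the example above (2 / (1 + period)). Two details are assumptions: the first n - 1 simple averages are NaN, and the exponential average is seeded with the first observation.

```csharp
static class MovingAverageSketch
{
    // Average of the n observations up to and including the current one.
    // Assumption: positions with fewer than n observations are NaN.
    public static double[] Simple(double[] x, int n)
    {
        var result = new double[x.Length];
        double sum = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            sum += x[i];
            if (i >= n) sum -= x[i - n];   // drop the observation leaving the window
            result[i] = i >= n - 1 ? sum / n : double.NaN;
        }
        return result;
    }

    // ema[i] = alpha * x[i] + (1 - alpha) * ema[i - 1]
    // Assumption: the average is seeded with the first observation.
    public static double[] Exponential(double[] x, double alpha)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            result[i] = i == 0 ? x[0] : alpha * x[i] + (1.0 - alpha) * result[i - 1];
        return result;
    }

    // Period form: converts the period to a smoothing constant first.
    public static double[] Exponential(double[] x, int period) =>
        Exponential(x, 2.0 / (1 + period));
}
```

For { 1, 2, 3, 4, 5 } a simple average over 3 observations gives { NaN, NaN, 2, 3, 4 }. For { 2, 4, 8 } a smoothing constant of 0.5 gives { 2, 3, 5.5 }, and a period of 3 gives the same result.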
The
GetWeightedMovingAverage method returns a weighted
moving average. Each new observation is the weighted sum of the
observations.
The weights for the weighted moving average can be supplied as a
Double array, or as a Vector.
The weights are used in reverse order. The weight with index zero
is the weight for the current observation. The weight with index
one is the weight for the previous observation.
An optional integer parameter specifies the index in the weight
vector that corresponds to the current observation. This allows you
to create centrally weighted averages. The default is zero.
The following example creates a weighted moving average of five
observations centered around the current observation:

C#:
    double[] weights = {1.0, 2.0, 3.0, 2.0, 1.0};
    // Index 2 (weight 3.0) corresponds to the current observation.
    NumericalVariable WMA5 = current.Transforms.GetWeightedMovingAverage(weights, 2);

Visual Basic:
    Dim weights As Double() = {1.0, 2.0, 3.0, 2.0, 1.0}
    ' Index 2 (weight 3.0) corresponds to the current observation.
    Dim WMA5 As NumericalVariable = _
        current.Transforms.GetWeightedMovingAverage(weights, 2)
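How the weights line up with the observations can be shown with a small sketch on plain arrays. The names are hypothetical. Following the description above, the result is a plain weighted sum; whether the library additionally divides by the sum of the weights is not stated here, so no normalization is applied. Positions where the window runs past either end of the data are set to NaN, which is also an assumption.

```csharp
static class WeightedMovingAverageSketch
{
    // weights[currentIndex] applies to the current observation x[i],
    // weights[currentIndex + 1] to the previous one, and
    // weights[currentIndex - 1] to the next one.
    public static double[] WeightedSum(double[] x, double[] weights, int currentIndex = 0)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            double sum = 0.0;
            for (int j = 0; j < weights.Length; j++)
            {
                int k = i + currentIndex - j;   // observation paired with weights[j]
                if (k < 0 || k >= x.Length)
                {
                    sum = double.NaN;           // window extends past the data
                    break;
                }
                sum += weights[j] * x[k];
            }
            result[i] = sum;
        }
        return result;
    }
}
```

With the observations { 1, 2, 3, 4, 5 }, the weights { 1, 2, 3, 2, 1 } and a current index of 2, only the middle position has a complete window, and its value is 1*5 + 2*4 + 3*3 + 2*2 + 1*1 = 27. Dividing by the sum of the weights (9) would give 3, the centrally weighted average.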
Other moving summary statistics
The methods in this group calculate some statistic of a moving
window of observations.
The
GetMovingMaximum method returns a variable whose
observations are the largest of the n observations up to and
including the current observation. The
GetMovingMinimum method returns a variable whose
observations are the smallest of the n observations up to
and including the current observation.
The
GetMovingStandardDeviation method calculates a moving
standard deviation. Each new observation is the standard deviation
of the n observations up to and including the current
observation. It takes two parameters. The first is an integer that
specifies the number of observations. The second parameter is a
NumericalVariable
that contains the simple moving average of the variable over the
same number of observations. The
GetMovingSum method calculates a moving sum of the
n observations up to and including the current
observation.
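The moving maximum, minimum and sum all follow the same pattern: take the n observations ending at the current one and reduce them to a single number. The sketch below captures that pattern on plain arrays. The names are hypothetical, and treating positions with fewer than n observations as NaN is an assumption.

```csharp
using System;
using System.Linq;

static class MovingWindowSketch
{
    // Applies `statistic` to the n observations up to and including index i.
    static double[] Moving(double[] x, int n, Func<double[], double> statistic)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        var result = new double[x.Length];
        var window = new double[n];
        for (int i = 0; i < x.Length; i++)
        {
            if (i < n - 1)
            {
                result[i] = double.NaN;   // assumption: incomplete windows are NaN
                continue;
            }
            Array.Copy(x, i - n + 1, window, 0, n);
            result[i] = statistic(window);
        }
        return result;
    }

    public static double[] MovingMaximum(double[] x, int n) => Moving(x, n, w => w.Max());
    public static double[] MovingMinimum(double[] x, int n) => Moving(x, n, w => w.Min());
    public static double[] MovingSum(double[] x, int n) => Moving(x, n, w => w.Sum());
}
```

For { 3, 1, 4, 1, 5 } and n equal to 2, the moving maximum is { NaN, 3, 4, 4, 5 }, the moving minimum is { NaN, 1, 1, 1, 1 }, and the moving sum is { NaN, 4, 5, 5, 6 }.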
The
GetMovingAverageAbsoluteDeviation method calculates
the average absolute deviation of the n observations up to
and including the current observation from the corresponding
current observation of another variable. The first parameter is the
number of observations. The second parameter is a NumericalVariable
that contains the means from which the deviation is to be
calculated.
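As a rough illustration of the average absolute deviation, the sketch below averages the distance of each observation in the window from the current value of the second series. It uses plain arrays and hypothetical names, and it assumes that positions with fewer than n observations are NaN.

```csharp
using System;

static class AbsoluteDeviationSketch
{
    // result[i] = average of |x[j] - means[i]| over the n observations
    // x[i - n + 1] .. x[i]. Note that every observation in the window is
    // compared with the same value, means[i].
    public static double[] MovingAverageAbsoluteDeviation(double[] x, int n, double[] means)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            if (i < n - 1)
            {
                result[i] = double.NaN;   // assumption: incomplete windows are NaN
                continue;
            }
            double total = 0.0;
            for (int j = i - n + 1; j <= i; j++)
                total += Math.Abs(x[j] - means[i]);
            result[i] = total / n;
        }
        return result;
    }
}
```

For { 1, 2, 3, 4 } with n equal to 2 and the two-observation moving averages { NaN, 1.5, 2.5, 3.5 } as the second series, every defined result is 0.5.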
Period-to-date values and differences
There are two transformations in this group. The first
calculates cumulative sums of the original observations within a
series of intervals. The second is the inverse transformation of
the first. It calculates the difference between each observation
and the previous one, except for the first observation in each
interval.
The
GetPeriodToDateValues method calculates period-to-date
sums. Each observation is the cumulative sum of the observations in
the current interval.
A common use for this method is to create a period-to-date sum of
a time series variable relative to a longer time frame. For
example, if the variable contains monthly earnings, you can use
this method to calculate the earnings to date per quarter.
This method has two overloads. The first takes an integer array
whose elements specify the boundaries of the intervals. The
remaining two parameters are BoundaryIntervalBehavior
values that indicate how the first and last interval should be
handled. If startBehavior has a value of
Exclude, then new observations with an index smaller than
the first index in indexes are set to
NaN.
The second overload is useful for variables that are part of a
TimeSeriesCollection.
The first parameter is a DateTimeVariable
that specifies the time corresponding to each observation. It must
have the same length as the numerical variable. The second
parameter is another DateTimeVariable that indicates
the start time of each interval. The remaining two parameters are
BoundaryIntervalBehavior values, as before.
The
GetPeriodToDateDifferences method performs the reverse
operation. Each observation is the difference between the current
and the previous observation in the current interval, except when
it is the first observation in the current interval. In that case,
the new observation is the same as the original observation.
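The two transformations, and the fact that they undo each other, can be illustrated on plain arrays. In this sketch the names are hypothetical, the integer array is assumed to hold the index of the first observation of each interval in ascending order, and observations before the first boundary are simply treated as one leading interval. The BoundaryIntervalBehavior options are not modeled.

```csharp
static class PeriodToDateSketch
{
    // Running sum that restarts at every interval start.
    public static double[] Values(double[] x, int[] starts)
    {
        var result = new double[x.Length];
        double total = 0.0;
        int next = 0;                       // next boundary to look for
        for (int i = 0; i < x.Length; i++)
        {
            if (next < starts.Length && i == starts[next])
            {
                total = 0.0;                // a new interval begins here
                next++;
            }
            total += x[i];
            result[i] = total;
        }
        return result;
    }

    // Inverse operation: difference with the previous observation,
    // except for the first observation of each interval, which is kept.
    public static double[] Differences(double[] x, int[] starts)
    {
        var result = new double[x.Length];
        int next = 0;
        for (int i = 0; i < x.Length; i++)
        {
            bool first = i == 0;
            if (next < starts.Length && i == starts[next])
            {
                first = true;
                next++;
            }
            result[i] = first ? x[i] : x[i] - x[i - 1];
        }
        return result;
    }
}
```

For six monthly values { 1, 2, 3, 4, 5, 6 } and quarters starting at indexes 0 and 3, the period-to-date values are { 1, 3, 6, 4, 9, 15 }. Applying Differences to that result with the same boundaries returns the original values.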
Miscellaneous transformations
The
GetReferenceIndex method scales the observations to
make them comparable to a standard index value. The method has two
overloads. The first overload takes two parameters. The first is the
index of the observation that serves as a reference. The second
parameter is the base value of the index. The observations are
scaled so that the index value of the reference observation equals
the base value of the index.
The second overload calculates the reference index based on the sum
of a range of observations. It takes three parameters. The first is
the index of
the first observation in the reference interval. The second
parameter is the index of the last observation in the reference
interval. The third parameter is the base value of the index. The
observations are scaled so that the sum of the index values in the
reference interval equals the base value of the index.
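Both overloads amount to multiplying every observation by a single scale factor. The sketch below shows that factor for each case on plain arrays; the class and method names are hypothetical.

```csharp
static class ReferenceIndexSketch
{
    // Scales x so that the observation at `referenceIndex` equals `baseValue`.
    public static double[] ReferenceIndex(double[] x, int referenceIndex, double baseValue) =>
        Scale(x, baseValue / x[referenceIndex]);

    // Scales x so that the observations first..last (inclusive) sum to `baseValue`.
    public static double[] ReferenceIndex(double[] x, int first, int last, double baseValue)
    {
        double sum = 0.0;
        for (int i = first; i <= last; i++)
            sum += x[i];
        return Scale(x, baseValue / sum);
    }

    static double[] Scale(double[] x, double factor)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            result[i] = x[i] * factor;
        return result;
    }
}
```

For { 20, 40, 60 }, using index 0 as the reference with a base value of 100 gives { 100, 200, 300 }. Using the range 0 to 1 with a base value of 120 gives { 40, 80, 120 }, whose first two values sum to 120.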
The
GetPositiveToNegativeRatio method calculates the ratio
of the positive values to the negative values over an interval. The
first parameter is the length of the interval. The second parameter
is a NumericalVariable
that serves as the reference variable. The method calculates the
ratio of the sum of observations within the specified period where
the corresponding observation of the reference variable is
positive, and the sum of observations where the corresponding
reference observation is negative. Observations where the
corresponding reference observation is zero are not included.
The
GetPositiveToNegativeIndex method performs a similar
calculation. However, the result is not returned as a ratio, but as
an index value between 0 and 100. It has the same parameters as the
GetPositiveToNegativeRatio method.
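A sketch of both calculations on plain arrays follows. The names are hypothetical. The ratio follows the description above directly. The mapping of the ratio to a value between 0 and 100 is not given in this guide, so the sketch assumes the common form 100 * positive / (positive + negative); treat that formula as an assumption. Positions with fewer than n observations are left as NaN, which is also an assumption.

```csharp
static class PositiveNegativeSketch
{
    // Sums x over the n observations ending at `end`, split by the sign of
    // the reference observation. Zero reference observations are skipped.
    static (double positive, double negative) Sums(double[] x, double[] reference, int end, int n)
    {
        double positive = 0.0, negative = 0.0;
        for (int j = end - n + 1; j <= end; j++)
        {
            if (reference[j] > 0) positive += x[j];
            else if (reference[j] < 0) negative += x[j];
        }
        return (positive, negative);
    }

    public static double[] Ratio(double[] x, int n, double[] reference)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            if (i < n - 1) { result[i] = double.NaN; continue; }
            var (positive, negative) = Sums(x, reference, i, n);
            result[i] = positive / negative;
        }
        return result;
    }

    // Assumption: index = 100 * positive / (positive + negative).
    public static double[] Index(double[] x, int n, double[] reference)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            if (i < n - 1) { result[i] = double.NaN; continue; }
            var (positive, negative) = Sums(x, reference, i, n);
            result[i] = 100.0 * positive / (positive + negative);
        }
        return result;
    }
}
```

For the observations { 10, 20, 30, 40 }, the reference values { 1, -1, 0, 2 } and an interval of 4, the positive sum is 50 and the negative sum is 20, so the ratio at the last position is 2.5 and the assumed index is about 71.4.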
Finally, the
GetBoxCoxTransform method returns the Box-Cox transform of
the variable for the specified parameter lambda, which must be
between 0 and 1. This transformation is often used to reduce the
effects of non-normality.
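For reference, the standard one-parameter Box-Cox transform is (x^lambda - 1) / lambda for a nonzero lambda and ln(x) for lambda equal to zero. The sketch below applies that textbook definition to a plain array of positive values. The names are hypothetical, and the library's method may differ in details such as argument validation.

```csharp
using System;

static class BoxCoxSketch
{
    // Standard Box-Cox transform for positive observations:
    //   lambda != 0: (x^lambda - 1) / lambda
    //   lambda == 0: ln(x)
    public static double[] BoxCox(double[] x, double lambda)
    {
        var result = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            result[i] = lambda == 0.0
                ? Math.Log(x[i])
                : (Math.Pow(x[i], lambda) - 1.0) / lambda;
        return result;
    }
}
```

With lambda equal to 1 the transform only shifts the data, mapping { 1, 4, 9 } to { 0, 3, 8 }. With lambda equal to 0.5 it compresses larger values, since each result is 2 * (sqrt(x) - 1).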
Copyright 2004-2008,
Extreme Optimization. All rights reserved.
Extreme Optimization, Complexity made simple, M#, and M
Sharp are trademarks of ExoAnalytics Inc.
Microsoft, Visual C#, Visual Basic, Visual Studio, Visual
Studio.NET, and the Visual Studio Logo are registered trademarks of Microsoft Corporation