A variable is a collection of observations of a characteristic
of an object that can take on two or more values. This chapter
provides an overview of how variables are implemented in the
Extreme Optimization Numerical Libraries for .NET.
Variables occur in two situations: on their own or as part
of a collection. On their own, you can use them to calculate descriptive
statistics, like the mean and the standard deviation, or
to perform statistical tests, such as the one-sample z test
or the Kolmogorov-Smirnov goodness-of-fit test.
Most often, however, variables will come in groups and represent
different properties or measurements in a data set.
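As a rough illustration of what working with a single variable involves, the following language-neutral sketch uses only the Python standard library. It is not the library's API: the sample values, the function name one_sample_z_test, and the assumed known population standard deviation of 0.25 are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample: one continuous variable observed on its own.
observations = [4.9, 5.1, 5.3, 4.8, 5.2, 5.0, 5.4, 4.7]

# Descriptive statistics.
sample_mean = mean(observations)
sample_sd = stdev(observations)  # sample standard deviation (n - 1 denominator)


def one_sample_z_test(data, mu0, sigma):
    """Two-sided one-sample z test, assuming the population sigma is known."""
    z = (mean(data) - mu0) / (sigma / sqrt(len(data)))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Test the hypothesis that the population mean is 5.0 (sigma is an assumption).
z, p = one_sample_z_test(observations, mu0=5.0, sigma=0.25)
```

The z test is only appropriate when the population standard deviation is known; when it has to be estimated from the sample, a t test is the usual choice.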
In the Extreme Optimization Numerical Libraries for .NET,
variables are implemented as Vector<T> objects.
Variables can be either continuous or categorical.
Any variable that can take on a value from a continuous range is called
a continuous variable. Any numerical or date type can be used
as the element type.
Nothing needs to be done to mark a vector as a continuous variable.
Variables whose observations can take on only one of a finite set
of values are called categorical variables or discrete variables.
They are implemented by the CategoricalVector<T> class.
Fundamental to the implementation of categorical variables is
the category index. The category index represents the possible values
that a variable can have. Every categorical variable has an
associated category index. The index is used to map an object
to its category, or to the index of its category in a list of categories.
The category index is an Index<T> object.
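The idea behind a category index can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the library's implementation: the names categories, position_of, and codes are hypothetical, and ordering categories by first appearance is an assumption made for the example.

```python
# Hypothetical observations of a categorical variable.
observations = ["red", "green", "red", "blue", "green", "red"]

# The category index: the distinct values, here in order of first appearance.
categories = list(dict.fromkeys(observations))

# Lookup from a value to the position of its category in the list.
position_of = {value: i for i, value in enumerate(categories)}

# The variable can then be stored as one integer code per observation.
codes = [position_of[value] for value in observations]

# Decoding maps each code back to its category.
decoded = [categories[code] for code in codes]
```

Storing integer codes alongside a single list of categories keeps the variable compact and makes grouping or counting by category a matter of comparing integers.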
Most statistical data sets are made up of several variables.
Such a collection of variables is encapsulated in a
DataFrame
object. Data frames can be created by adding individual variables
or by importing them from files or databases.
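Both ways of building a data set can be pictured with a plain dictionary of named columns. The sketch below is a standard-library Python illustration under stated assumptions, not the library's DataFrame API: the column names, the inline CSV text, and the weight values are all hypothetical.

```python
import csv
import io

# Hypothetical file contents; in practice these would come from a file or database.
raw = io.StringIO("height,group\n1.62,a\n1.80,b\n1.75,a\n")

# Import: build the frame column by column, one key per variable.
frame = {}
for row in csv.DictReader(raw):
    for name, value in row.items():
        frame.setdefault(name, []).append(value)

# Convert the continuous variable to numbers; keep the categorical one as labels.
frame["height"] = [float(v) for v in frame["height"]]

# Adding an individual variable: it needs one value per existing row.
weight = [55.0, 82.5, 70.1]
if len(weight) != len(frame["height"]):
    raise ValueError("new variable must match the number of rows")
frame["weight"] = weight
```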