Categorical Vectors

Categorical vectors are vectors whose elements are taken from a limited set of values or levels. The set of possible values is called the category index. The elements are stored as integer indexes (level indexes) into the set of possible values. This makes it possible to have missing values where the element type does not have a representation of a missing value.
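The encoding described above can be sketched in a few lines of C#. This is a standalone illustration, not the library's implementation: the `Encode` helper, the use of -1 as the level index of a missing value, and the first-appearance ordering of the categories are all assumptions made for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the library): infers a category index,
// in order of first appearance, and the level index of each element.
// A null element is treated as missing and encoded as -1 (an assumption).
static (List<string> Categories, int[] Levels) Encode(IReadOnlyList<string?> items)
{
    var categories = new List<string>();
    var lookup = new Dictionary<string, int>();
    var levels = new int[items.Count];
    for (int i = 0; i < items.Count; i++)
    {
        var item = items[i];
        if (item is null)
        {
            levels[i] = -1; // missing value
            continue;
        }
        if (!lookup.TryGetValue(item, out int level))
        {
            // First time this value is seen: add it as a new category.
            level = categories.Count;
            categories.Add(item);
            lookup.Add(item, level);
        }
        levels[i] = level;
    }
    return (categories, levels);
}

var encoded = Encode(new[] { "a", "b", null, "b", "d" });
Console.WriteLine(string.Join(", ", encoded.Categories)); // a, b, d
Console.WriteLine(string.Join(", ", encoded.Levels));     // 0, 1, -1, 1, 2
```

Because the missing element is recorded in the level indexes rather than in the values, the element type itself (here `string`) does not need a special "missing" representation.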

Categorical vectors are implemented by the CategoricalVector<T> class.

Categorical vectors also implement the IGrouping interface, which means they can be used directly as grouping objects in aggregation operations.

There are four overloads for creating a CategoricalVector<T>.

The first overload takes one argument: the length of the vector. This creates a categorical vector where all values are missing. The element type must be specified as the generic type argument.

The second overload takes a list of values. The category index and level indexes are inferred from these values. An optional second argument specifies the mutability of the new vector.

The third overload takes two or three arguments. The first is again a list of values. The second is the category index. The optional third argument specifies the mutability. When a value is not found in the supplied category index, the corresponding element of the result is marked as missing. Two categorical vectors created from the same values have different level indexes when their category indexes differ.
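The effect of supplying a category index can be illustrated without the library. The sketch below is only a model of the behavior described above: `EncodeWith` is a hypothetical helper, -1 stands in for "missing", and the sample values are chosen for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not the library's API): looks up each item in the
// supplied category index. Items that are not found get -1 (missing).
static int[] EncodeWith(string[] categoryIndex, string[] items) =>
    items.Select(item => Array.IndexOf(categoryIndex, item)).ToArray();

var sample = new[] { "a", "b", "a", "b", "d" };

// Same values, different category indexes.
var levels1 = EncodeWith(new[] { "a", "b", "d" }, sample);
var levels2 = EncodeWith(new[] { "d", "b", "a" }, sample);
// "d" is not in this category index, so its element becomes missing.
var levels3 = EncodeWith(new[] { "a", "b" }, sample);

Console.WriteLine(string.Join(", ", levels1)); // 0, 1, 0, 1, 2
Console.WriteLine(string.Join(", ", levels2)); // 2, 1, 2, 1, 0
Console.WriteLine(string.Join(", ", levels3)); // 0, 1, 0, 1, -1
```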

The fourth overload also takes two or three arguments. The first argument is the category index. The second argument is a list of level indexes. The optional third argument specifies the mutability. This overload makes it possible to recreate a vector from its category index and level indexes.

In addition, any vector can be converted to a categorical vector by calling its AsCategorical method. If the vector is already categorical, then the same vector is returned.
Optionally, an Index<T> can be supplied to serve as the category index.

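The CategoricalVector<T> class has several properties and methods that are specific to categorical vectors.

The CategoryIndex property returns an Index<T> that contains the possible values. The LevelIndexes property returns the level index of each element, and the GetLevelIndex method returns the level index of the element at a given position.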

The GetIndexes method returns a sequence of indexes of the elements that have a specific value. You can supply the actual value to look up, or the level index. The code below illustrates all these properties and methods:

    var categories = c2.CategoryIndex; // { "a", "b", "d" }
    var levels = c2.LevelIndexes; // [ 0, 1, 0, 1, 2 ]
    var at3 = c2.GetLevelIndex(3); // 1
    var indexesB = c2.GetIndexes("b").ToArray(); // [ 1, 3 ]
    var indexesAt1 = c2.GetIndexes(1).ToArray(); // [ 1, 3 ]
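To make the two lookup styles concrete, here is a library-independent sketch of what such a lookup amounts to. `PositionsOfLevel` is a hypothetical helper, and the arrays simply restate the category index and level indexes shown in the comments above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var categoryIndex = new[] { "a", "b", "d" };
var levelIndexes = new[] { 0, 1, 0, 1, 2 };

// Hypothetical helper: positions whose level index equals the given level.
static IEnumerable<int> PositionsOfLevel(int[] codes, int level) =>
    Enumerable.Range(0, codes.Length).Where(i => codes[i] == level);

// Look up by level index directly...
var byLevel = PositionsOfLevel(levelIndexes, 1).ToArray();

// ...or translate the value to its level index first.
var levelOfB = Array.IndexOf(categoryIndex, "b");
var byValue = PositionsOfLevel(levelIndexes, levelOfB).ToArray();

Console.WriteLine(string.Join(", ", byLevel)); // 1, 3
Console.WriteLine(string.Join(", ", byValue)); // 1, 3
```

Both lookups scan only the integer level indexes; the value-based form adds a single search of the category index.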

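A categorical vector is essentially a mapping from integer indexes to values contained in the category index. The WithCategories method returns a categorical vector that keeps the same level indexes but uses a different category index: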

    var newIndex = Index.Create(new[] { 'A', 'B', 'D' });
    var C2 = c2.WithCategories(newIndex); // [ 'A', 'B', 'A', 'B', 'D' ]
    var counts = c2.GetCounts();
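Viewed as plain arrays, replacing the categories leaves the level indexes untouched and only changes what each level maps to. The sketch below models this without the library. The per-level tally at the end is an assumption about the kind of result a method named GetCounts would produce; the text does not describe its return value.

```csharp
using System;
using System.Linq;

var codes = new[] { 0, 1, 0, 1, 2 };         // level indexes, no missing values
var newCategories = new[] { 'A', 'B', 'D' }; // replacement category index

// Relabeling: the level indexes stay the same, each level maps to a new value.
var relabeled = codes.Select(code => newCategories[code]).ToArray();
Console.WriteLine(string.Join(", ", relabeled)); // A, B, A, B, D

// Tally of how many elements fall in each level (illustrative only).
var tally = new int[newCategories.Length];
foreach (var code in codes)
{
    tally[code]++;
}
Console.WriteLine(string.Join(", ", tally)); // 2, 2, 1
```

A version that supports missing values would need to skip or separately count any level index that marks a missing element before indexing into the arrays.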

Copyright © 2004-20116, Extreme Optimization. All rights reserved.
