Groupings | Extreme Optimization Numerical Libraries for .NET Professional

A grouping is a collection of labeled groups of elements. It consists of an index of group keys and, for each key in the index, a set of integer indices that specify membership of the group. Groupings can be used to aggregate and to summarize data, to reshape data, and to perform certain calculations like moving averages.

Groupings are not tied to the data they may have been derived from.
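The idea of a grouping as "keys plus sets of integer positions" can be sketched in plain C#. The class and method names below (`GroupingSketch`, `ByKey`, `MeanBy`) are hypothetical and do not belong to the library. They only illustrate that, once the positions are stored, the same grouping can be applied to any data of matching length.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a grouping: each group key maps to the
// integer positions of the members of that group.
public static class GroupingSketch
{
    // Groups positions 0..n-1 by the key found at each position.
    public static SortedDictionary<TKey, List<int>> ByKey<TKey>(IReadOnlyList<TKey> keys)
        where TKey : notnull
    {
        var groups = new SortedDictionary<TKey, List<int>>();
        for (int i = 0; i < keys.Count; i++)
        {
            if (!groups.TryGetValue(keys[i], out var members))
            {
                members = new List<int>();
                groups[keys[i]] = members;
            }
            members.Add(i);
        }
        return groups;
    }

    // Applies the stored positions to any data list of matching length
    // and returns the mean of each group.
    public static Dictionary<TKey, double> MeanBy<TKey>(
        SortedDictionary<TKey, List<int>> groups, IReadOnlyList<double> data)
        where TKey : notnull
    {
        return groups.ToDictionary(g => g.Key, g => g.Value.Average(i => data[i]));
    }
}
```

Because the grouping holds only positions, `MeanBy` can be called with a different data list than the one the keys came from, as long as the lengths agree.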

As with other collection classes, there are two types that represent
groupings. The generic
Grouping

A grouping object has an Index
property that returns the collection of group keys as a strongly typed index.
When accessed through the IGrouping
interface, the property returns an untyped index.
The Count
property returns the number of groups in the grouping.
The GetIndexes(Int32)
method returns the sequence of indexes for the group at the specified position.
The GetCounts

Groupings are created using

A partition is a grouping where each key is part of at most one group. There are several ways to create a partition:

The Partition method partitions a list into groups of equal size. This method has two to four arguments. The first argument is a list of key values. The second argument is the size of each partition. The third and fourth arguments are optional.

The third argument is a boolean value that specifies whether the partitions should be
aligned to the end of the list. When omitted or set to

The fourth argument is used in conjunction with the third, and
specifies whether incomplete partitions should be skipped.
When omitted or set to

The following example creates a partition of a list of dates. Each partition has 10 elements. Only full partitions are returned. The last partition ends on the last date in the list:

    var partition = Grouping.Partition(dates, 10, alignToEnd: true, skipIncomplete: true);
    var partitionAvg = x.AggregateBy(partition, Aggregators.Mean);
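To make the effect of the two flags concrete, here is a minimal sketch in plain C# that computes the position ranges such a partition would cover. `PartitionSketch` is a hypothetical helper, not the library's implementation, and it assumes that aligning to the end moves the short block to the front of the list. The flags are required parameters here so that the sketch does not imply anything about the library's defaults.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits positions 0..count-1 into blocks of `size`.
public static class PartitionSketch
{
    public static List<(int Start, int Length)> Partition(
        int count, int size, bool alignToEnd, bool skipIncomplete)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var result = new List<(int Start, int Length)>();
        int remainder = count % size;
        int start = 0;

        if (alignToEnd && remainder > 0)
        {
            // Assumption: the short block sits at the front, so that the
            // last block ends on the last element.
            if (!skipIncomplete) result.Add((0, remainder));
            start = remainder;
        }

        // Full blocks.
        for (; start + size <= count; start += size)
            result.Add((start, size));

        // Without end alignment, the short block (if any) comes last.
        if (!alignToEnd && remainder > 0 && !skipIncomplete)
            result.Add((start, remainder));

        return result;
    }
}
```

For 25 elements and a block size of 10, aligning to the end and skipping incomplete blocks leaves two blocks covering positions 5 to 14 and 15 to 24.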

The VariablePartition

The ByValue
method creates a grouping based on the value in a vector. This corresponds to
group by clauses in database queries.
The method is overloaded. The first argument is always a list that contains the values
to group on. An optional second argument specifies an
IEqualityComparer

Partitions may also be created by quantile. The ByQuantile method creates a partition based on the order of the values. The method has two overloads. The first argument is a list that contains the values to group on. The second argument is either an integer or a list of real numbers. If it is an integer, it specifies the number of partitions or groups. Each partition will contain roughly the same number of elements. For example, with two partitions, the first partition will contain all indexes whose value in the list is less than the median, while the second partition will contain all indexes whose value is greater than the median. If the second argument is a list of real numbers, they specify the quantiles to include. The number of elements in each partition is proportional to the fraction specified by successive quantiles.
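The integer overload can be pictured as ranking the values and cutting the ranking into equal parts. The sketch below does this in plain C#. `QuantileSketch` is a hypothetical name, and assigning tied values by rank is an assumption that may differ from the library's handling of ties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: splits positions into groups of roughly equal size
// according to the rank of the value at each position.
public static class QuantileSketch
{
    public static List<int>[] ByQuantile(IReadOnlyList<double> values, int groupCount)
    {
        if (groupCount <= 0) throw new ArgumentOutOfRangeException(nameof(groupCount));
        var groups = Enumerable.Range(0, groupCount)
                               .Select(_ => new List<int>())
                               .ToArray();

        // Positions ordered by value. OrderBy is stable, so ties keep
        // their original relative order.
        int[] order = Enumerable.Range(0, values.Count)
                                .OrderBy(i => values[i])
                                .ToArray();

        for (int rank = 0; rank < order.Length; rank++)
        {
            // Map rank 0..n-1 onto group 0..groupCount-1.
            int g = (int)((long)rank * groupCount / order.Length);
            groups[g].Add(order[rank]);
        }
        return groups;
    }
}
```

With `groupCount` set to 2, the first group holds the positions of the lower half of the values and the second group holds the upper half.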

In the example below, we again use the Titanic dataset to group passengers into 5 groups of roughly equal size based on age. The 20% youngest passengers will be in the first group, the next 20% in the second group, and so on:

A window grouping consists of overlapping segments of a list. There are several ways to define window groupings.

The Window(Int32, Int32, Int32, Boolean, Int32) method creates moving windows of fixed length. It takes 2 to 5 arguments. The first argument is a list of keys. The second argument is the size of the window.

The remaining arguments are optional. The third argument specifies
the offset of the key in the window. A negative value means the offset is counted
from the end of the window. The default value is -1,
which means the last key in each window is used as the group key.
The fourth argument specifies whether partial windows should be included in the grouping.
When

The example below creates a moving window of length 20 and computes the corresponding moving average:

    var window = Grouping.Window(dates, 20);
    var ma20 = x.AggregateBy(window, Aggregators.Mean);
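For comparison, the same kind of moving average can be computed by hand. The sketch below is plain C# with a hypothetical `WindowSketch` class. It assumes that only full windows are produced and that each result belongs to the last position in its window, which matches the default key offset of -1 described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: mean of every full window of `size` consecutive values.
public static class WindowSketch
{
    // result[k] is the mean of x[k] .. x[k + size - 1], that is, of the
    // window ending at position k + size - 1.
    public static double[] MovingAverage(IReadOnlyList<double> x, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        int n = x.Count - size + 1;
        if (n <= 0) return Array.Empty<double>();

        var result = new double[n];
        double sum = 0;
        for (int i = 0; i < x.Count; i++)
        {
            sum += x[i];                       // element entering the window
            if (i >= size) sum -= x[i - size]; // element leaving the window
            if (i >= size - 1) result[i - size + 1] = sum / size;
        }
        return result;
    }
}
```

The running sum keeps the cost linear in the length of the data, regardless of the window size.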

Fixed size moving windows can also be created without an index. In this case, the first argument is the length of the source data. The remaining arguments have the same meaning.

The RangeWindow method creates moving windows whose range (the difference between the largest and smallest value) is not greater than the specified value. It takes three arguments. The first argument is a list of keys. The second argument is the maximum width of the window. The last argument is the direction in which the window moves. In the forward direction, each element from first to last is taken in turn as the first element of a group, and the window is extended for as long as its range stays within the specified width.
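The forward case can be sketched with two moving positions over sorted keys. `RangeWindowSketch` is a hypothetical name. The sketch assumes the keys are numeric and sorted in ascending order, and that a key exactly `width` away from the start still belongs to the window.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: forward-moving windows bounded by key range.
public static class RangeWindowSketch
{
    // For each start position, returns the exclusive end position of the
    // widest window whose key range does not exceed `width`.
    // Assumption: `keys` is sorted in ascending order.
    public static List<(int Start, int End)> Forward(IReadOnlyList<double> keys, double width)
    {
        var windows = new List<(int Start, int End)>();
        int end = 0;
        for (int start = 0; start < keys.Count; start++)
        {
            if (end < start) end = start;
            // Extend while the next key is still within `width` of the first key.
            while (end < keys.Count && keys[end] - keys[start] <= width)
                end++;
            windows.Add((start, end));
        }
        return windows;
    }
}
```

Because the keys are sorted, the end position never moves backward, so all windows are found in a single pass.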

A moving range window is a special case of a variable size window, created by the
VariableWindow

Finally, the ExpandingWindow

    var expanding = Grouping.ExpandingWindow(dates);
    var expAvg = x.AggregateBy(expanding, Aggregators.Mean);
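An expanding average of this kind is a running mean. The plain C# sketch below shows the calculation, under the assumption that the window at each position covers every element from the start of the list up to and including that position. `ExpandingSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: mean of all values from the start up to each position.
public static class ExpandingSketch
{
    public static double[] ExpandingMean(IReadOnlyList<double> x)
    {
        var result = new double[x.Count];
        double sum = 0;
        for (int i = 0; i < x.Count; i++)
        {
            sum += x[i];
            result[i] = sum / (i + 1); // window covers positions 0..i
        }
        return result;
    }
}
```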

When one index is entirely contained in another, the keys in the larger index can be grouped according to the key in the smaller index that follows or precedes them. Such a grouping is commonly used to convert data sets based on different time frequencies. The Resample method creates such a grouping. It takes three arguments. The first argument is the original (larger) index. The second argument is the new (smaller) index. The third argument is a Direction value that indicates whether the entries in the new index should be taken as the start (Forward) or end (Backward) of a sampling interval. The new index is also the index of the grouping.

In the example below, we create an index of dates on the 10th of each month. We then compute a resampling of the original dates using this index:

    var months = Index.CreateDateRange(new DateTime(2015, 1, 10), 10, Recurrence.Monthly);
    var resampling1 = Grouping.Resample(dates, months, Direction.Backward);
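The backward case can be sketched as assigning each original date to the first date in the new index that is not earlier than it. `ResampleSketch` is a hypothetical helper. It assumes both lists are sorted in ascending order, that a date equal to an interval end belongs to that interval, and that dates later than the last interval end are left unassigned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: backward resampling of sorted dates.
public static class ResampleSketch
{
    // Each entry of `ends` closes an interval. Returns, for each interval,
    // the positions of the original dates that fall into it.
    public static List<int>[] Backward(
        IReadOnlyList<DateTime> original, IReadOnlyList<DateTime> ends)
    {
        var groups = ends.Select(_ => new List<int>()).ToArray();
        int g = 0;
        for (int i = 0; i < original.Count; i++)
        {
            // Advance to the first interval end that is not before this date.
            while (g < ends.Count && ends[g] < original[i]) g++;
            if (g == ends.Count) break; // past the last interval end
            groups[g].Add(i);
        }
        return groups;
    }
}
```

The result has one group per entry in the new index, which mirrors the statement above that the new index is also the index of the grouping.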

A pivot is a two-dimensional grouping.
The Pivot

Pivots are created using the
Pivot

    var survived = Grouping.Pivot(
        titanic["PClass"].As<int>(),
        titanic["Survived"].As<bool>()).CountsMatrix();
    survived.UnscaleRowsInPlace(survived.GetRowSums());
    Console.WriteLine(survived);
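The calculation in this example appears to be a cross-tabulation followed by dividing each row by its sum, which would give the proportion of each outcome per class. That reading of `UnscaleRowsInPlace` is an assumption. The plain C# sketch below performs that calculation directly, using a hypothetical `PivotSketch` class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: two-way counts converted to row-wise proportions.
public static class PivotSketch
{
    public static double[,] RowProportions<TRow, TCol>(
        IReadOnlyList<TRow> rows, IReadOnlyList<TCol> cols,
        out TRow[] rowLabels, out TCol[] colLabels)
    {
        if (rows.Count != cols.Count)
            throw new ArgumentException("Both lists must have the same length.");

        // Sorted distinct keys label the rows and columns of the table.
        rowLabels = rows.Distinct().OrderBy(r => r).ToArray();
        colLabels = cols.Distinct().OrderBy(c => c).ToArray();

        // Count how often each (row key, column key) pair occurs.
        var table = new double[rowLabels.Length, colLabels.Length];
        for (int i = 0; i < rows.Count; i++)
            table[Array.IndexOf(rowLabels, rows[i]), Array.IndexOf(colLabels, cols[i])] += 1;

        // Divide each row by its sum. Every row label occurs at least once,
        // so the sum is never zero.
        for (int r = 0; r < rowLabels.Length; r++)
        {
            double sum = 0;
            for (int c = 0; c < colLabels.Length; c++) sum += table[r, c];
            for (int c = 0; c < colLabels.Length; c++) table[r, c] /= sum;
        }
        return table;
    }
}
```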

Copyright © 2004-20116, Extreme Optimization. All rights reserved.

*Extreme Optimization*, *Complexity made simple*, *M#*, and *M Sharp* are trademarks of ExoAnalytics Inc.

*Microsoft*, *Visual C#*, *Visual Basic*, *Visual Studio*, *Visual Studio.NET*, and the *Optimized for Visual Studio* logo are registered trademarks of Microsoft Corporation.