Aggregation operations on data frames can be performed
on the data frame as a whole, or on grouped data.
Aggregating full data frames
The Aggregate<T>(AggregatorGroup<T>)
method and its overloads compute aggregates of all the data in a data frame.
This method has five overloads, some of which are defined as extension methods.
The first overload takes a single argument: an
AggregatorGroup<T>
that specifies the aggregator to apply to each column.
It returns a Vector<T>
that contains the result of applying the aggregator to each column.
If the aggregator does not support the element type of a column,
a missing value is returned for that column. The following example computes the mean of all numerical
columns in the Titanic dataset:
// C#
var titanic = DataFrame.ReadCsv(titanicFilename);
var means = titanic.Aggregate(Aggregators.Mean);
' Visual Basic
Dim titanic = DataFrame.ReadCsv(titanicFilename)
Dim means = titanic.Aggregate(Aggregators.Mean)
// F#
let titanic = DataFrame.ReadCsv(titanicFilename)
let means = titanic.Aggregate(Aggregators.Mean)
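To make the per-column behavior concrete, here is a minimal language-neutral sketch in plain Python. It is not the library's API: the `aggregate` helper, the toy `table`, and the use of `None` to stand for a missing value are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy table (column key -> values); not the real Titanic data.
table = {
    "Age": [22.0, 38.0, 30.0],
    "Fare": [7.0, 71.0, 9.0],
    "Name": ["Braund", "Cumings", "Heikkinen"],
}


def is_numeric(values):
    """True if every value is an int or float (bool is deliberately excluded)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def aggregate(table, aggregator):
    """Apply one aggregator to every column.

    Columns whose element type the aggregator cannot handle yield None,
    which plays the role of a missing value in this sketch.
    """
    return {
        key: aggregator(values) if is_numeric(values) else None
        for key, values in table.items()
    }


means = aggregate(table, mean)
```

In this sketch the result has one entry per column: the two numeric columns get their mean, and the text column gets the missing-value marker.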
The second overload takes a parameter array of aggregator groups.
It returns a data frame with a row for each aggregator in the array.
All aggregator groups must return the same type. As an example,
we add the count and the standard deviation to the previous aggregation:
// C#
var descriptives = titanic.Aggregate(
    Aggregators.Count,
    Aggregators.Mean,
    Aggregators.StandardDeviation);
' Visual Basic
Dim descriptives = titanic.Aggregate(
    Aggregators.Count,
    Aggregators.Mean,
    Aggregators.StandardDeviation)
// F#
let descriptives = titanic.Aggregate(
    Aggregators.Count,
    Aggregators.Mean,
    Aggregators.StandardDeviation)
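The shape of the multi-aggregator result can be sketched in plain Python as follows. This is an illustration only, not the library's implementation: the toy `columns`, the `aggregators` mapping, and the choice of the sample standard deviation are assumptions. Converting every result to `float` mirrors the requirement that all aggregators return the same type.

```python
from statistics import mean, stdev

# Hypothetical numeric columns; not the real Titanic data.
columns = {
    "Age": [22.0, 38.0, 30.0],
    "Fare": [7.0, 71.0, 9.0],
}

# Aggregator name -> function. stdev is the sample standard deviation,
# which is an assumption about what the library computes.
aggregators = {
    "Count": len,
    "Mean": mean,
    "StandardDeviation": stdev,
}

# One row per aggregator, one entry per column. float() gives every
# result the same type, including the integer count.
descriptives = {
    name: {key: float(func(values)) for key, values in columns.items()}
    for name, func in aggregators.items()
}
```

Each outer key corresponds to one row of the resulting data frame, in the order in which the aggregators were supplied.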
Aggregating grouped data frames
The AggregateBy<R1>(IGrouping, AggregatorGroup)
method and its overloads compute aggregates of the data in a data frame
grouped according to some criteria. The method has many overloads.
The first argument always specifies the grouping.
It can be a Grouping<TKey>
object, a vector, or the key of the column to use for the grouping.
When using the column key, the element type of the column must be
specified as the generic type argument.
The remaining arguments follow the same pattern as the
Aggregate<T>(AggregatorGroup<T>)
method.
In the example below, we compute the mean of each column in the Titanic
dataset grouped by passenger class. The result is a data frame
with one row for each class, indexed by the class number.
We show all three ways of specifying the grouping.
// C#
var key = "Pclass";
var vector = titanic[key].As<int>();
var grouping = Grouping.ByValue(vector);
var meanByClass1 = titanic.AggregateBy(grouping, Aggregators.Mean);
var meanByClass2 = titanic.AggregateBy(vector, Aggregators.Mean);
var meanByClass3 = titanic.AggregateBy<int>(key, Aggregators.Mean);
' Visual Basic
Dim key = "Pclass"
Dim vector = titanic(key).As(Of Integer)()
Dim groups = Grouping.ByValue(vector)
Dim meanByClass1 = titanic.AggregateBy(groups, Aggregators.Mean)
Dim meanByClass2 = titanic.AggregateBy(vector, Aggregators.Mean)
Dim meanByClass3 = titanic.AggregateBy(Of Integer)(key, Aggregators.Mean)
// F#
let key = "Pclass"
let vector = titanic.[key].As<int>()
let grouping = Grouping.ByValue(vector)
let meanByClass1 = titanic.AggregateBy(grouping, Aggregators.Mean)
let meanByClass2 = titanic.AggregateBy(vector, Aggregators.Mean)
let meanByClass3 = titanic.AggregateBy<int>(key, Aggregators.Mean)
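As a rough picture of what grouped aggregation produces, the following plain Python sketch groups rows by a key column and averages the remaining columns within each group. It is not the library's implementation: the `aggregate_by` helper and the toy `rows` are hypothetical, and leaving the grouping column out of the result is an assumption of the sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; not the real Titanic data.
rows = [
    {"Pclass": 3, "Age": 22.0, "Fare": 7.0},
    {"Pclass": 1, "Age": 38.0, "Fare": 71.0},
    {"Pclass": 3, "Age": 26.0, "Fare": 9.0},
    {"Pclass": 1, "Age": 42.0, "Fare": 53.0},
]


def aggregate_by(rows, key, aggregator):
    """Group rows by the value of `key`, then aggregate every other column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    result = {}
    for group_key in sorted(groups):  # one result row per distinct key value
        members = groups[group_key]
        result[group_key] = {
            column: aggregator([m[column] for m in members])
            for column in members[0]
            if column != key  # assumption: the grouping column is left out
        }
    return result


mean_by_class = aggregate_by(rows, "Pclass", mean)
```

The outer keys (1 and 3 here) play the role of the row index of the resulting data frame.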
When multiple aggregators are supplied, the resulting data frame has a hierarchical
column index. The first level contains the original column keys.
The second level contains the name of the aggregator.
In the next example, we compute the count, the mean, and the median of each column:
// C#
var manyByClass1 = titanic.AggregateBy(grouping,
    Aggregators.Count, Aggregators.Mean, Aggregators.Median);
' Visual Basic
Dim manyByClass1 = titanic.AggregateBy(groups,
    Aggregators.Count, Aggregators.Mean, Aggregators.Median)
// F#
let manyByClass1 = titanic.AggregateBy(grouping,
    Aggregators.Count, Aggregators.Mean, Aggregators.Median)
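The two-level column index can be pictured with tuple keys of the form (column key, aggregator name). The plain Python sketch below illustrates that layout only; it is not the library's API. The `aggregate_by_many` helper and the toy `rows` are hypothetical, and leaving out the grouping column is an assumption.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical rows; not the real Titanic data.
rows = [
    {"Pclass": 1, "Age": 38.0, "Fare": 71.0},
    {"Pclass": 1, "Age": 42.0, "Fare": 53.0},
    {"Pclass": 1, "Age": 58.0, "Fare": 26.0},
    {"Pclass": 3, "Age": 22.0, "Fare": 7.0},
    {"Pclass": 3, "Age": 26.0, "Fare": 9.0},
    {"Pclass": 3, "Age": 21.0, "Fare": 8.0},
]

aggregators = {"Count": len, "Mean": mean, "Median": median}


def aggregate_by_many(rows, key, aggregators):
    """Group rows by `key` and apply every aggregator to every other column.

    Each result row is keyed by (column key, aggregator name), mimicking a
    hierarchical column index with two levels.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    result = {}
    for group_key in sorted(groups):
        members = groups[group_key]
        result[group_key] = {
            (column, name): float(func([m[column] for m in members]))
            for column in members[0]
            if column != key  # assumption: the grouping column is left out
            for name, func in aggregators.items()
        }
    return result


many_by_class = aggregate_by_many(rows, "Pclass", aggregators)
```

A single value is then addressed by the group key plus both column levels, for example `many_by_class[1][("Fare", "Median")]`.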