Aggregating data frames

Extreme Optimization Numerical Libraries for .NET Professional

Aggregation operations on data frames can be performed on the data frame as a whole, or on grouped data.

Aggregating full data frames

The Aggregate<T>(AggregatorGroup<T>) method and its overloads compute aggregates of all the data in a data frame. This method has five overloads, some of which are defined as extension methods. The first overload takes a single argument: an AggregatorGroup<T> that specifies the aggregator to apply to each column. It returns a Vector<T> that contains the result of applying the aggregator to each column. If the aggregator does not support the element type of a column, a missing value is returned for that column. The following example computes the mean of all numerical columns in the Titanic dataset:

C#
var titanic = DataFrame.ReadCsv(titanicFilename);
var means = titanic.Aggregate(Aggregators.Mean);

VB
Dim titanic = DataFrame.ReadCsv(titanicFilename)
Dim means = titanic.Aggregate(Aggregators.Mean)

F#
let titanic = DataFrame.ReadCsv(titanicFilename)
let means = titanic.Aggregate(Aggregators.Mean)
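
To make the "missing value for unsupported columns" behavior concrete, here is a minimal sketch in plain C# and LINQ. It does not use the library: the dictionary of columns, the MeanOrMissing helper, and the use of NaN as the missing value are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a data frame: column name -> column values.
var columns = new Dictionary<string, object[]>
{
    ["Age"]  = new object[] { 22.0, 38.0, 26.0 },
    ["Fare"] = new object[] { 7.25, 71.28, 7.92 },
    ["Name"] = new object[] { "Braund", "Cumings", "Heikkinen" },
};

// Mean of a column, or NaN (standing in for "missing") when the
// column's element type is not numeric.
double MeanOrMissing(object[] values) =>
    values.All(v => v is double)
        ? values.Cast<double>().Average()
        : double.NaN;

// One aggregate per column, keyed by the column name.
var means = columns.ToDictionary(c => c.Key, c => MeanOrMissing(c.Value));

foreach (var (name, mean) in means)
    Console.WriteLine($"{name}: {mean}");
```

The result has one entry per column, which mirrors the shape of the vector described above: numeric columns get a mean, and the text column gets the missing marker.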

The second overload takes a parameter array of aggregator groups. It returns a data frame with a row for each aggregator in the array. All aggregator groups must return the same type. As an example, we add the count and the standard deviation to the previous aggregation:

C#
var descriptives = titanic.Aggregate(
    Aggregators.Count,
    Aggregators.Mean,
    Aggregators.StandardDeviation);

VB
Dim descriptives = titanic.Aggregate(
    Aggregators.Count,
    Aggregators.Mean,
    Aggregators.StandardDeviation)

F#
let descriptives = titanic.Aggregate(
                    Aggregators.Count,
                    Aggregators.Mean,
                    Aggregators.StandardDeviation)
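
The shape of the result (one row per aggregator, one entry per column) can be sketched without the library. The column data, the tuple-based aggregator list, and the StdDev helper below are hypothetical; in particular, the sample (n - 1) form of the standard deviation is an assumption, not a statement about what the library computes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical numeric columns standing in for a data frame.
var columns = new Dictionary<string, double[]>
{
    ["Age"]  = new[] { 22.0, 38.0, 26.0 },
    ["Fare"] = new[] { 7.25, 71.28, 7.92 },
};

// Sample standard deviation (divides by n - 1); assumed for illustration.
double StdDev(double[] x)
{
    double mean = x.Average();
    return Math.Sqrt(x.Sum(v => (v - mean) * (v - mean)) / (x.Length - 1));
}

// Every aggregator returns a double so that all result rows share one type.
var aggregators = new (string Name, Func<double[], double> Apply)[]
{
    ("Count", x => x.Length),
    ("Mean", x => x.Average()),
    ("StandardDeviation", StdDev),
};

// One result row per aggregator, one entry per column.
var rows = aggregators.ToDictionary(
    a => a.Name,
    a => columns.ToDictionary(c => c.Key, c => a.Apply(c.Value)));

foreach (var (aggregator, row) in rows)
    Console.WriteLine($"{aggregator}: " +
        string.Join(", ", row.Select(e => $"{e.Key}={e.Value}")));
```

Returning the count as a double in this sketch reflects the requirement that all aggregator groups return the same type.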

Aggregating grouped data frames

The AggregateBy<R1>(IGrouping, AggregatorGroup) method and its overloads compute aggregates of the data in a data frame grouped according to some criteria. The method has many overloads. The first argument always specifies the grouping. It can be a Grouping<TKey> object, a vector, or the key of the column to use for the grouping. When using the column key, the element type of the column must be specified as the generic type argument. The remaining arguments follow the same pattern as the Aggregate<T>(AggregatorGroup<T>) method.

In the example below, we compute the mean of each column in the Titanic dataset grouped by the passenger class. The result is a data frame with one row for each class indexed by the class number. We show all three methods of specifying the grouping.

C#
var key = "Pclass";
var vector = titanic[key].As<int>();
var grouping = Grouping.ByValue(vector);
var meanByClass1 = titanic.AggregateBy(grouping, Aggregators.Mean);
var meanByClass2 = titanic.AggregateBy(vector, Aggregators.Mean);
var meanByClass3 = titanic.AggregateBy<int>(key, Aggregators.Mean);

VB
Dim key = "Pclass"
Dim vector = titanic(key).As(Of Integer)()
Dim groups = Grouping.ByValue(vector)
Dim meanByClass1 = titanic.AggregateBy(groups, Aggregators.Mean)
Dim meanByClass2 = titanic.AggregateBy(vector, Aggregators.Mean)
Dim meanByClass3 = titanic.AggregateBy(Of Integer)(key, Aggregators.Mean)

F#
let key = "Pclass"
let vector = titanic.[key].As<int>()
let grouping = Grouping.ByValue(vector)
let meanByClass1 = titanic.AggregateBy(grouping, Aggregators.Mean)
let meanByClass2 = titanic.AggregateBy(vector, Aggregators.Mean)
let meanByClass3 = titanic.AggregateBy<int>(key, Aggregators.Mean)
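
For readers who want to see what a grouped mean amounts to, the following sketch reproduces the idea with LINQ's GroupBy on a few hypothetical passenger records. It is not the library's implementation; the tuple layout and the sample values are invented for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical rows: (passenger class, age, fare).
var passengers = new[]
{
    (Pclass: 3, Age: 22.0, Fare: 7.25),
    (Pclass: 1, Age: 38.0, Fare: 71.28),
    (Pclass: 3, Age: 26.0, Fare: 7.92),
    (Pclass: 1, Age: 35.0, Fare: 53.10),
};

// One result row per class, keyed by the class number,
// holding the mean of each numeric column within that class.
var meanByClass = passengers
    .GroupBy(p => p.Pclass)
    .ToDictionary(
        g => g.Key,
        g => (Age: g.Average(p => p.Age), Fare: g.Average(p => p.Fare)));

foreach (var (pclass, mean) in meanByClass)
    Console.WriteLine($"class {pclass}: age {mean.Age}, fare {mean.Fare}");
```

The dictionary key plays the role of the row index described above: one entry for each distinct class value.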

When multiple aggregators are supplied, the resulting data frame has a hierarchical column index. The first level contains the original column keys. The second level contains the name of the aggregator. In the next example, we compute the count, the mean, and the median of each column:

C#
var manyByClass1 = titanic.AggregateBy(grouping,
    Aggregators.Count, Aggregators.Mean, Aggregators.Median);

VB
Dim manyByClass1 = titanic.AggregateBy(groups,
    Aggregators.Count, Aggregators.Mean, Aggregators.Median)

F#
let manyByClass1 = titanic.AggregateBy(grouping,
                    Aggregators.Count, Aggregators.Mean, Aggregators.Median)
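
A two-level column index can be pictured as a key made of two parts: the original column key and the aggregator name. The sketch below models that with a tuple-keyed dictionary in plain C#. The data, the Median helper, and the dictionary layout are assumptions for illustration and say nothing about how the library stores its index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows: (passenger class, age).
var passengers = new[]
{
    (Pclass: 3, Age: 22.0),
    (Pclass: 1, Age: 38.0),
    (Pclass: 3, Age: 26.0),
    (Pclass: 1, Age: 35.0),
    (Pclass: 3, Age: 30.0),
};

// Median of a sequence: middle value, or the mean of the two middle values.
double Median(IEnumerable<double> values)
{
    var sorted = values.OrderBy(v => v).ToArray();
    int mid = sorted.Length / 2;
    return sorted.Length % 2 == 1
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Row key: class number. Column key: (original column, aggregator name).
var result = passengers
    .GroupBy(p => p.Pclass)
    .ToDictionary(
        g => g.Key,
        g => new Dictionary<(string Column, string Aggregator), double>
        {
            [("Age", "Count")]  = g.Count(),
            [("Age", "Mean")]   = g.Average(p => p.Age),
            [("Age", "Median")] = Median(g.Select(p => p.Age)),
        });

Console.WriteLine(result[3][("Age", "Median")]);
```

Looking up a value takes a row key and both levels of the column key, which is the same access pattern a hierarchical column index implies.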

Copyright (c) 2004-2023 ExoAnalytics Inc.