Binning and Discretization

Extreme Optimization Numerical Libraries for .NET Professional

It is often necessary to group numerical data into categories. The range of the data is divided into a number of intervals, where each interval becomes a category in a numerical scale. This type of numerical scale is implemented by the IntervalIndex<T> class. This class inherits from Index<T> and provides some additional functionality.

Interval Scales

Constructing Interval Scales

The IntervalIndex<T> class takes one generic type argument: the type of the bounds of the intervals. This must be a type that implements the IComparable<T> interface. The class has four constructors. They come in two pairs, each pair offering one way of defining the intervals that make up the scale. Each constructor also corresponds to an overload of the static CreateBins method of the Index class.

The first constructor takes one argument: a Double array that contains the boundaries of the intervals. The values in this array must be in ascending order, or an ArgumentException will be thrown.

C#
// Interval boundaries, in ascending order.
double[] bounds = new double[] { 50, 62, 74, 88, 100 };
var scale1 = new IntervalIndex<double>(bounds);
var scale1a = Index.CreateBins(bounds);

VB
' Interval boundaries, in ascending order.
Dim bounds = New Double() {50, 62, 74, 88, 100}
Dim scale1 = New IntervalIndex(Of Double)(bounds)
Dim scale1a = Index.CreateBins(bounds)

F#
// Interval boundaries, in ascending order.
let bounds = [| 50.0; 62.0; 74.0; 88.0; 100.0 |]
let scale1 = IntervalIndex<float>(bounds)
let scale1a = Index.CreateBins(bounds)
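
The relationship between a boundary array and the intervals it defines can be illustrated without the library. The following is a minimal standalone sketch, not the library's implementation: the BoundsSketch class and its CheckAscending and ToIntervals helpers are hypothetical names, and the check assumes that "ascending" means strictly ascending.

```csharp
using System;

static class BoundsSketch
{
    // Hypothetical helper: rejects boundaries that are not strictly ascending.
    public static void CheckAscending(double[] bounds)
    {
        for (int i = 1; i < bounds.Length; i++)
        {
            if (bounds[i] <= bounds[i - 1])
                throw new ArgumentException(
                    "Boundaries must be in ascending order.", nameof(bounds));
        }
    }

    // n + 1 boundaries describe n adjacent intervals.
    public static (double Lower, double Upper)[] ToIntervals(double[] bounds)
    {
        CheckAscending(bounds);
        var intervals = new (double Lower, double Upper)[Math.Max(bounds.Length - 1, 0)];
        for (int i = 0; i < intervals.Length; i++)
            intervals[i] = (bounds[i], bounds[i + 1]);
        return intervals;
    }

    static void Main()
    {
        var intervals = ToIntervals(new double[] { 50, 62, 74, 88, 100 });
        foreach (var (lower, upper) in intervals)
            Console.WriteLine($"{lower} to {upper}");
        // Prints four intervals: 50 to 62, 62 to 74, 74 to 88, 88 to 100.
    }
}
```

With this reading, the five boundaries in the example above produce four intervals.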

The second constructor takes one additional argument: a SpecialBins value that specifies which special intervals to include in the scale, if any. The possible values are as follows:

Values of the SpecialBins enumeration

Name          Description
None          No special intervals are included.
BelowMinimum  There is a special interval for values below the minimum value.
AboveMaximum  There is a special interval for values above the maximum value.
OutOfRange    There is a special interval for values that are outside the scale's range.
Missing       There is a special interval for missing values.

If BelowMinimum is included, an interval whose lower bound is the smallest possible value for the element type is inserted before all other intervals. If AboveMaximum is included, an interval whose upper bound is the largest possible value is added at the end. The following creates an interval index with the same boundaries as above, but with an extra interval to hold values less than 50:

C#
var scale2 = new IntervalIndex<double>(bounds, SpecialBins.BelowMinimum);
var scale2a = Index.CreateBins(bounds, SpecialBins.BelowMinimum);

VB
Dim scale2 = New IntervalIndex(Of Double)(bounds, SpecialBins.BelowMinimum)
Dim scale2a = Index.CreateBins(bounds, SpecialBins.BelowMinimum)

F#
let scale2 = new IntervalIndex<float>(bounds, SpecialBins.BelowMinimum)
let scale2a = Index.CreateBins(bounds, SpecialBins.BelowMinimum)
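
To make the effect of the special intervals concrete, here is a small standalone sketch that extends a boundary array in the way described above. It does not use the library. The Extra enumeration and the Extend helper are hypothetical stand-ins, and the use of double.MinValue and double.MaxValue as the "smallest" and "largest" possible values is an assumption.

```csharp
using System;
using System.Collections.Generic;

static class SpecialBinsSketch
{
    // Hypothetical stand-in for two of the flags described above.
    [Flags]
    public enum Extra { None = 0, BelowMinimum = 1, AboveMaximum = 2 }

    // Returns a copy of the boundaries with the requested extra bounds added.
    public static double[] Extend(double[] bounds, Extra extra)
    {
        var result = new List<double>(bounds);
        if ((extra & Extra.BelowMinimum) != 0)
            result.Insert(0, double.MinValue);   // assumed "smallest possible value"
        if ((extra & Extra.AboveMaximum) != 0)
            result.Add(double.MaxValue);         // assumed "largest possible value"
        return result.ToArray();
    }

    static void Main()
    {
        double[] bounds = { 50, 62, 74, 88, 100 };
        double[] extended = Extend(bounds, Extra.BelowMinimum);
        Console.WriteLine(extended.Length - 1);  // 5 intervals instead of 4
    }
}
```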

The third constructor takes three arguments. The first two are the lower bound of the first interval and the upper bound of the last interval. The third argument is the total number of intervals. This creates a scale with the specified number of intervals that are all equal in width. The fourth constructor has one additional argument: a SpecialBins value that indicates which special intervals to include in addition to those within the specified range. The code below creates a scale with five intervals, each of width 10, for values between 50 and 100:

C#
var scale3 = new IntervalIndex<double>(50.0, 100.0, 5);
var scale3a = Index.CreateBins(50.0, 100.0, 5);

VB
Dim scale3 = New IntervalIndex(Of Double)(50.0, 100.0, 5)
Dim scale3a = Index.CreateBins(50.0, 100.0, 5)

F#
let scale3 = new IntervalIndex<float>(50.0, 100.0, 5)
let scale3a = Index.CreateBins(50.0, 100.0, 5)
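
The boundaries implied by a lower bound, an upper bound and an interval count can be computed by hand. The sketch below is a standalone illustration that does not call the library. The EqualWidthSketch class and its Boundaries method are hypothetical names.

```csharp
using System;

static class EqualWidthSketch
{
    // Returns the count + 1 boundaries of count equal-width intervals on [lower, upper].
    public static double[] Boundaries(double lower, double upper, int count)
    {
        if (count < 1) throw new ArgumentOutOfRangeException(nameof(count));
        var bounds = new double[count + 1];
        double width = (upper - lower) / count;
        for (int i = 0; i < count; i++)
            bounds[i] = lower + i * width;
        bounds[count] = upper;  // set exactly, so rounding cannot move the last bound
        return bounds;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Boundaries(50.0, 100.0, 5)));
        // 50, 60, 70, 80, 90, 100
    }
}
```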

Mapping Values to Intervals

In addition to the overloads defined for standard Index<T> objects, the Lookup method of IntervalIndex<T> has two overloads that map a value to the index of the interval that contains it. The first takes a single value and returns the integer index, or -1 if no interval contains the value. The second takes a list of values and returns an array of indexes:

C#
Console.WriteLine(scale3.Lookup(63.5)); // 1
double[] values = { 71.3, 39.5, 66.7, 90.4, 62.1 };
// Join the indexes so that the contents are printed rather than the array's type name.
Console.WriteLine(string.Join(", ", scale3.Lookup(values))); // 2, -1, 1, 4, 1

VB
Console.WriteLine(scale3.Lookup(63.5)) ' 1
Dim values = {71.3, 39.5, 66.7, 90.4, 62.1}
' Join the indexes so that the contents are printed rather than the array's type name.
Console.WriteLine(String.Join(", ", scale3.Lookup(values))) ' 2, -1, 1, 4, 1

F#
Console.WriteLine(scale3.Lookup(63.5)) // 1
let values = [| 71.3; 39.5; 66.7; 90.4; 62.1 |]
// %A prints the contents of the array rather than its type name.
printfn "%A" (scale3.Lookup(values)) // [|2; -1; 1; 4; 1|]
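
The mapping from a value to an interval index can be reproduced with a binary search over the boundaries. The following standalone sketch does not use the library and is not its implementation. The LookupSketch class is a hypothetical name, and the treatment of boundary values (each interval includes its lower bound, and the last interval also includes its upper bound) is an assumption.

```csharp
using System;

static class LookupSketch
{
    // Returns the index of the interval that contains value, or -1 if none does.
    // Assumes bounds is strictly ascending.
    public static int Lookup(double[] bounds, double value)
    {
        int last = bounds.Length - 1;
        if (last < 1 || double.IsNaN(value) || value < bounds[0] || value > bounds[last])
            return -1;
        if (value == bounds[last])
            return last - 1;                     // upper bound belongs to the last interval
        int pos = Array.BinarySearch(bounds, value);
        // pos >= 0: value equals bounds[pos], the start of interval pos.
        // pos < 0: ~pos is the index of the first boundary greater than value.
        return pos >= 0 ? pos : ~pos - 1;
    }

    // Maps each value to its interval index.
    public static int[] Lookup(double[] bounds, double[] values)
    {
        var result = new int[values.Length];
        for (int i = 0; i < values.Length; i++)
            result[i] = Lookup(bounds, values[i]);
        return result;
    }

    static void Main()
    {
        double[] bounds = { 50, 60, 70, 80, 90, 100 };
        double[] values = { 71.3, 39.5, 66.7, 90.4, 62.1 };
        Console.WriteLine(string.Join(", ", Lookup(bounds, values)));
        // 2, -1, 1, 4, 1
    }
}
```

For the sample values, which all lie strictly inside an interval or outside the range, the sketch produces the same indexes as the example above.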

Binning Vectors

Once an interval index has been defined, it can be used to map a vector of values to a vector of categories. The Bin method performs this operation. It is defined as an extension method, so it can be called on the vector directly. It has multiple overloads that work on both typed and untyped vectors. In its simplest form, it takes two arguments: a vector and an interval index. The result is a categorical vector of intervals.

C#
var v = Vector.CreateRandom(100);
// Bin using an existing interval index.
var bins = Index.CreateBins(0.0, 1.0, 10);
var vBinned1 = v.Bin(bins);
// Bin using a vector of boundaries and special intervals.
var bounds = Vector.Create(9, i => (i+1) / 10.0);
var vBinned2 = v.Bin(bounds, SpecialBins.BelowMinimum | SpecialBins.AboveMaximum);
var vBinned3 = v.Bin(10);

VB
Dim v = Vector.CreateRandom(100)
' Bin using an existing interval index.
Dim bins = Index.CreateBins(0.0, 1.0, 10)
Dim vBinned1 = v.Bin(bins)
' Bin using a vector of boundaries and special intervals.
Dim bounds = Vector.Create(9, Function(i) (i + 1) / 10.0)
Dim vBinned2 = v.Bin(bounds, SpecialBins.BelowMinimum Or SpecialBins.AboveMaximum)
Dim vBinned3 = v.Bin(10)

F#
let v = Vector.CreateRandom(100)
// Bin using an existing interval index.
let bins = Index.CreateBins(0.0, 1.0, 10)
let vBinned1 = v.Bin(bins)
// Bin using a vector of boundaries and special intervals.
let bounds = Vector.Create(9, fun i -> (float i + 1.0) / 10.0)
let vBinned2 = v.Bin(bounds, SpecialBins.BelowMinimum ||| SpecialBins.AboveMaximum)
let vBinned3 = v.Bin(10)
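
Conceptually, binning replaces each element of the vector with the category of the interval it falls in. The standalone sketch below illustrates this with plain arrays and integer bin codes. It does not use the library, and its result is an array of codes rather than the categorical vector the library returns. The BinVectorSketch class and its Bin method are hypothetical names, and out-of-range values are assumed to map to -1.

```csharp
using System;

static class BinVectorSketch
{
    // Maps each value to the index of its equal-width bin on [lower, upper].
    // Values outside the range, and NaN, get the code -1.
    public static int[] Bin(double[] values, double lower, double upper, int count)
    {
        var codes = new int[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            double x = values[i];
            if (double.IsNaN(x) || x < lower || x > upper)
            {
                codes[i] = -1;
                continue;
            }
            int k = (int)((x - lower) * count / (upper - lower));
            codes[i] = Math.Min(k, count - 1);  // x == upper falls in the last bin
        }
        return codes;
    }

    static void Main()
    {
        // 100 pseudo-random values in [0, 1), standing in for a random vector.
        var rng = new Random(17);
        var v = new double[100];
        for (int i = 0; i < v.Length; i++)
            v[i] = rng.NextDouble();

        int[] codes = Bin(v, 0.0, 1.0, 10);

        // Tabulate how many values landed in each bin.
        var counts = new int[10];
        foreach (int c in codes)
            if (c >= 0) counts[c]++;
        Console.WriteLine(string.Join(" ", counts));  // ten counts that sum to 100
    }
}
```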

Copyright (c) 2004-2023 ExoAnalytics Inc.
