Extreme Optimization User's Guide

Histograms

A histogram is a table used to tally the frequency of data. Each data value is mapped to a bin. In the Extreme Optimization Numerical Libraries for .NET, one-dimensional histograms are implemented by the Histogram class.

Constructing Histograms

The Histogram class has six constructors. The first constructor takes a single Double array as its only argument. This array contains the boundaries of the bins, so the five boundaries in the example below define four bins.

C#
double[] bounds = new double[] {50, 62, 74, 88, 100};
Histogram histogram1 = new Histogram(bounds);
Visual Basic
Dim bounds As Double() = New Double() {50, 62, 74, 88, 100}
Dim histogram1 As Histogram = New Histogram(bounds)

The second constructor takes one additional argument: a SpecialBins value that indicates which special values should be tabulated in addition to those defined by the bin boundaries. The possible values are as follows:

Name          Description
None          No special bins are included.
BelowMinimum  There is a special bin for values below the scale's minimum value.
AboveMaximum  There is a special bin for values above the scale's maximum value.
OutOfRange    There is a special bin for values that are outside the scale's range.
Missing       There is a special bin for missing values.
Table 1. Values of the SpecialBins enumeration.

If the BelowMinimum bin is included, this bin is the first bin in the collection. If the AboveMaximum bin is included, it is the last bin in the collection. The following creates a histogram with the same boundaries as above, but with an extra bin to hold values less than 50:

C#
double[] bounds = new double[] {50, 62, 74, 88, 100};
Histogram histogram2 = new Histogram(bounds, SpecialBins.BelowMinimum);
Visual Basic
Dim bounds As Double() = New Double() {50, 62, 74, 88, 100}
Dim histogram2 As Histogram = New Histogram(bounds, SpecialBins.BelowMinimum)

The third constructor takes three arguments. The first two are the lower bound of the lowest bin, and the upper bound of the highest bin. The third argument is the total number of bins. This creates a histogram with the specified number of bins that are all equal in width. The fourth constructor has one additional argument: a SpecialBins value that indicates which special values should be tabulated in addition to those within the specified interval.

The code below creates a histogram with five bins for values between 50 and 100:

C#
Histogram histogram3 = new Histogram(50, 100, 5);
Visual Basic
Dim histogram3 As Histogram = New Histogram(50, 100, 5)
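
To make the equal-width rule concrete, the following standalone sketch computes the boundaries that five equal-width bins between 50 and 100 would have. It does not use the Histogram class; the EqualWidthBins class and its Boundaries method are hypothetical names introduced only for this illustration, and the library may compute its boundaries differently.

```csharp
using System;

// Hypothetical helper, not part of the Extreme Optimization library.
public static class EqualWidthBins
{
    // Returns binCount + 1 boundaries that split [lower, upper]
    // into binCount bins of equal width.
    public static double[] Boundaries(double lower, double upper, int binCount)
    {
        if (binCount < 1)
            throw new ArgumentOutOfRangeException(nameof(binCount));
        if (!(upper > lower))
            throw new ArgumentException("upper must be greater than lower.");

        double width = (upper - lower) / binCount;
        double[] bounds = new double[binCount + 1];
        for (int i = 0; i < binCount; i++)
            bounds[i] = lower + i * width;
        // Set the last boundary directly so rounding cannot move it.
        bounds[binCount] = upper;
        return bounds;
    }

    public static void Main()
    {
        // Same arguments as the histogram3 example above.
        double[] bounds = Boundaries(50, 100, 5);
        Console.WriteLine(string.Join(", ", bounds)); // 50, 60, 70, 80, 90, 100
    }
}
```

Five bins require six boundaries, which is why the returned array has one more element than the bin count.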

The fifth constructor takes two integer arguments: the lower and upper bounds. This constructs a histogram whose bin boundaries are the integers from the lower bound up to the upper bound.

The sixth constructor takes one argument: a NumericalScale object that specifies how the values are to be tabulated. The following example creates the same histogram as above:

C#
NumericalScale scale = new NumericalScale(50, 100, 5);
Histogram histogram4 = new Histogram(scale);
Visual Basic
Dim scale As NumericalScale = New NumericalScale(50, 100, 5)
Dim histogram4 As Histogram = New Histogram(scale)

Tabulating Data

There are three ways to set the totals for the bins in a histogram.

The first way is to use the Increment method. This method takes one or two arguments. The first argument is the number to tabulate. The second, optional argument is a weight. If no weight is specified, it is assumed to be 1. The method adds the weight to the total of the bin that contains the first argument.

C#
histogram1.Increment(83);
histogram1.Increment(78, 2.5);
Visual Basic
histogram1.Increment(83)
histogram1.Increment(78, 2.5)

The second way is to use the Tabulate method. This method tabulates the data specified in its first argument. This can be either a Double array or a NumericalVariable. An optional second argument specifies the weight for each data value. This argument is of the same type as the first argument.

C#
// Tabulate an array:
double[] data = new double[]
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
histogram2.Tabulate(data);
// Tabulate a numerical variable:
NumericalVariable variable = new NumericalVariable("data", data);
histogram2.Tabulate(variable);
Visual Basic
' Tabulate an array:
Dim data As Double() = New Double() _
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84, _
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73}
histogram2.Tabulate(data)
' Tabulate a numerical variable:
Dim variable As NumericalVariable = New NumericalVariable("data", data)
histogram2.Tabulate(variable)
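
What incrementing and tabulating amount to can be sketched without the library: find the bin that contains each value and add the weight to that bin's total. The sketch below is only an illustration. The TallySketch class and its FindBin and Increment methods are hypothetical names, and two behaviours are assumptions rather than documented facts about the Histogram class: each bin is taken to include its lower bound and exclude its upper bound, and values outside the range are ignored.

```csharp
using System;

// Hypothetical illustration, not the library's implementation.
public static class TallySketch
{
    // Returns the index of the bin containing value, or -1 if out of range.
    // Assumption: a bin includes its lower bound and excludes its upper bound.
    public static int FindBin(double[] bounds, double value)
    {
        if (double.IsNaN(value)
            || value < bounds[0]
            || value >= bounds[bounds.Length - 1])
            return -1;

        int index = Array.BinarySearch(bounds, value);
        // An exact match is the lower bound of bin 'index'. Otherwise
        // ~index is the position of the first boundary above value.
        return index >= 0 ? index : ~index - 1;
    }

    // Adds weight (1 by default) to the total of the bin containing value.
    // Assumption: values outside the range are silently ignored.
    public static void Increment(double[] bounds, double[] totals,
                                 double value, double weight = 1.0)
    {
        int bin = FindBin(bounds, value);
        if (bin >= 0)
            totals[bin] += weight;
    }

    public static void Main()
    {
        double[] bounds = { 50, 62, 74, 88, 100 };
        double[] data =
            { 62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
              68, 82, 72, 71, 85, 66, 61, 79, 81, 73 };

        double[] totals = new double[bounds.Length - 1];
        foreach (double value in data)
            Increment(bounds, totals, value);

        Console.WriteLine(string.Join(", ", totals)); // 2, 7, 10, 1
    }
}
```

The value 62 lies exactly on a boundary, so its bin depends on the convention chosen. Under the opposite convention (upper bound included) the first two totals would be 3 and 6 instead of 2 and 7.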

Finally, you can set the value of all bins directly using the SetTotals method. This method takes a Double array as its only argument. The length of this array must be equal to the number of bins. It sets the total of each bin to the corresponding value in the array.

The AddTotals method is similar, but adds the totals specified by the argument to the bin totals.

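C#
// Five totals, one per bin: histogram3 has five bins, and histogram2
// has four regular bins plus the BelowMinimum bin.
double[] totals = new double[] {2, 7, 9, 8, 1};
histogram3.SetTotals(totals);
histogram2.AddTotals(totals);
Visual Basic
' Five totals, one per bin: histogram3 has five bins, and histogram2
' has four regular bins plus the BelowMinimum bin.
Dim totals As Double() = New Double() {2, 7, 9, 8, 1}
histogram3.SetTotals(totals)
histogram2.AddTotals(totals)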

To set all totals to zero, use the Clear method.

Histogram Bins

Individual bins are represented by HistogramBin objects. Histogram bins have a LowerBound and an UpperBound property. Together, these define the interval that is covered by the bin. The Width property returns the total width of the bin. Note that this may be infinite. The Value property returns the total for the bin. All these properties are read-only.

Histogram bins can't be created independently. They are maintained by the Histogram object. They can be accessed through the histogram's Bins property. This property returns a HistogramBinCollection object that can be used to access individual bins. You can access a bin using the indexed Item property. In C#, this property is the indexer property for the bin collection.

You can use a foreach loop (For Each in Visual Basic) to iterate through a histogram's bins:

C#
foreach (HistogramBin bin2 in histogram1.Bins)
    Console.WriteLine("{0}-{1}: total = {2}",
        bin2.LowerBound, bin2.UpperBound, bin2.Value);
Visual Basic
For Each bin2 As HistogramBin In histogram1.Bins
    Console.WriteLine("{0}-{1}: total = {2}", _
        bin2.LowerBound, bin2.UpperBound, bin2.Value)
Next

You can find the bin corresponding to a specific value through the Find method. This returns the HistogramBin object corresponding to its argument.

Other Properties and Methods

The TotalValue property returns the sum of all totals in all bins. The GetTotals method returns a Double array containing the totals for each bin.

The GetGoodnessOfFitTest method returns a ChiSquareGoodnessOfFitTest object that can be used to verify the hypothesis that the data in the histogram follows a certain distribution. The method takes two parameters. The first is a ContinuousDistribution object that specifies the distribution to be tested against. The second is an integer that specifies the number of parameters of the distribution that were estimated. Any estimated parameter reduces the degrees of freedom by one.
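
The effect of estimated parameters on the degrees of freedom can be illustrated with the standard Pearson chi-square calculation. The sketch below does not call GetGoodnessOfFitTest or any other library member. The ChiSquareSketch class and its methods are hypothetical names, and the uniform expected counts are an assumption chosen only to keep the example short.

```csharp
using System;

// Hypothetical illustration of the textbook calculation,
// not the library's ChiSquareGoodnessOfFitTest class.
public static class ChiSquareSketch
{
    // Pearson statistic: sum over bins of (observed - expected)^2 / expected.
    public static double Statistic(double[] observed, double[] expected)
    {
        if (observed.Length != expected.Length)
            throw new ArgumentException("Arrays must have the same length.");

        double sum = 0.0;
        for (int i = 0; i < observed.Length; i++)
        {
            double diff = observed[i] - expected[i];
            sum += diff * diff / expected[i];
        }
        return sum;
    }

    // Number of bins minus one, minus one more per estimated parameter.
    public static int DegreesOfFreedom(int binCount, int estimatedParameters)
    {
        return binCount - 1 - estimatedParameters;
    }

    public static void Main()
    {
        // Bin totals from the SetTotals example.
        double[] observed = { 2, 7, 9, 8, 1 };

        // Assumed null hypothesis: all five bins are equally likely.
        double total = 0.0;
        foreach (double count in observed)
            total += count;
        double[] expected = new double[observed.Length];
        for (int i = 0; i < expected.Length; i++)
            expected[i] = total / observed.Length;

        Console.WriteLine("Chi-square statistic: {0:F3}",
            Statistic(observed, expected));
        // No parameters were estimated from the data, so 5 - 1 - 0 = 4.
        Console.WriteLine("Degrees of freedom: {0}",
            DegreesOfFreedom(observed.Length, 0));
    }
}
```

If the distribution had been, say, a normal distribution whose mean and standard deviation were both estimated from the same data, the second argument would be 2 and the same five bins would leave only two degrees of freedom.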
