Extreme Optimization User's Guide

Testing Homogeneity of Variances

One of the assumptions underlying analysis of variance (ANOVA) is that the variances across groups are identical. This property is called homogeneity of variances. It is often desirable to verify this assumption using an appropriate hypothesis test. The Extreme Optimization Numerical Libraries for .NET provide two such tests: Bartlett's test and Levene's test.

Bartlett's Test

Bartlett's test is a relatively fast test for homogeneity of variances. The test is based on the assumption that the samples are normally distributed. It is sensitive to violations of this assumption. In practical terms, this means that Bartlett's test cannot adequately distinguish between violation of homogeneity of variances and violation of the normality assumption.

The null hypothesis is always that the variances of all groups are equal. The alternative hypothesis is that at least one of the variances is different. Bartlett's test is always one-tailed, and uses a chi-square statistic.
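For reference, the textbook form of Bartlett's statistic is sketched below. The notation is introduced here and is not part of the library: k is the number of groups, n_i and s_i^2 are the size and sample variance of group i, N is the total number of observations, and s_p^2 is the pooled variance. It is an assumption that the library computes the statistic in exactly this form.

```latex
\chi^2 = \frac{(N-k)\,\ln s_p^2 \;-\; \sum_{i=1}^{k} (n_i-1)\,\ln s_i^2}
              {1 + \dfrac{1}{3(k-1)}\left(\sum_{i=1}^{k}\dfrac{1}{n_i-1} - \dfrac{1}{N-k}\right)},
\qquad
s_p^2 = \frac{1}{N-k}\sum_{i=1}^{k}(n_i-1)\,s_i^2
```

Under the null hypothesis the statistic approximately follows a chi-square distribution with k - 1 degrees of freedom. Only large values count as evidence against equal variances, which is why the test is one-tailed.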

Bartlett's test is implemented by the BartlettTest class. It has two constructors. The first constructor takes no arguments. In this case, the data and conditions for the test must be specified by setting properties of the BartlettTest object and by calling the AddSample and AddSamples methods to supply the samples. The second constructor takes an array of NumericalVariable objects that contain the samples the test is to be applied to.
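The parameterless constructor might be used along the following lines. This is a sketch only: it assumes that AddSample accepts a single NumericalVariable, which the text above does not confirm, and that variables is a hypothetical array of NumericalVariable objects holding one sample per group.

```csharp
// Sketch under assumptions: AddSample is assumed to take one NumericalVariable.
// 'variables' is a hypothetical NumericalVariable[] with one sample per group.
BartlettTest bartlett = new BartlettTest();
foreach (NumericalVariable sample in variables)
{
    bartlett.AddSample(sample);
}
Console.WriteLine("P-value: {0:F4}", bartlett.PValue);
```

If the samples are already available as an array, passing the array to the second constructor is the shorter route.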

Example

We start with a collection of measurements of gear diameters from 10 batches. We want to verify that the variances of the diameters for the batches are equal. The data comes in two variables: a CategoricalVariable that contains the batch numbers, and a NumericalVariable that contains the corresponding measurements of the diameters. The first step is to create a CellArray. We can then use the collection's GetCellVariables method to return an array of variables with the measurements for each batch.

C#
CategoricalVariable batchVariable = 
    new CategoricalVariable("batch", new object[] {...});
NumericalVariable diameterVariable = 
    new NumericalVariable("diameter", new double[] {...});
CellArray cells = new CellArray(diameterVariable, batchVariable);
NumericalVariable[] variables = cells.GetCellVariables();
Visual Basic
Dim batchVariable As CategoricalVariable = _
    New CategoricalVariable("batch", New Object() {...})
Dim diameterVariable As NumericalVariable = _
    New NumericalVariable("diameter", New Double() {...})
Dim cells As CellArray = _
    New CellArray(diameterVariable, batchVariable)
Dim variables As NumericalVariable() = cells.GetCellVariables()

We can then create the BartlettTest object, and run the test:

C#
BartlettTest bartlett = new BartlettTest(variables);
Console.WriteLine("Test statistic: {0:F4}", bartlett.Statistic);
Console.WriteLine("P-value:        {0:F4}", bartlett.PValue);
Console.WriteLine("Reject null hypothesis? {0}", 
    bartlett.Reject() ? "yes" : "no");
Visual Basic
Dim bartlett As BartlettTest = New BartlettTest(variables)
Console.WriteLine("Test statistic: {0:F4}", bartlett.Statistic)
Console.WriteLine("P-value:        {0:F4}", bartlett.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(bartlett.Reject(), "yes", "no"))

The value of the chi-square statistic is 20.7859, giving a p-value of 0.0136. As a result, the hypothesis that the variances are equal is rejected at the 0.05 level.

Once a BartlettTest object has been created, you can access other properties and methods common to all hypothesis test classes. For instance, to obtain the critical values for significance levels of 0.05 and 0.01, the code would be:

C#
Console.WriteLine("Critical value: {0:F4} at 95%", 
    bartlett.GetUpperCriticalValue(0.05));
Console.WriteLine("Critical value: {0:F4} at 99%", 
    bartlett.GetUpperCriticalValue(0.01));
Visual Basic
Console.WriteLine("Critical value: {0:F4} at 95%", _
    bartlett.GetUpperCriticalValue(0.05))
Console.WriteLine("Critical value: {0:F4} at 99%", _
    bartlett.GetUpperCriticalValue(0.01))

The critical values (16.9190 at 0.05 and 21.6660 at 0.01) show that the null hypothesis is rejected at the 0.05 level but not at the 0.01 level, since the test statistic of 20.7859 lies between the two.
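Comparing the statistic with a critical value and comparing the p-value with the significance level are two views of the same decision. The fragment below illustrates this using only members already shown (Statistic, PValue and GetUpperCriticalValue). It assumes that bartlett is the object created earlier and that Statistic and PValue are of type double.

```csharp
// Two equivalent ways to decide an upper-tailed test at significance level alpha.
double alpha = 0.05;
bool rejectByCriticalValue =
    bartlett.Statistic > bartlett.GetUpperCriticalValue(alpha);
bool rejectByPValue = bartlett.PValue < alpha;
Console.WriteLine("Reject at {0}: {1} (critical value), {2} (p-value)",
    alpha, rejectByCriticalValue, rejectByPValue);
```

For the gear data, both checks should agree: the test rejects at 0.05 and does not reject at 0.01.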

Levene's Test

Levene's test is a slower but more robust test for homogeneity of variances. It is much less influenced by departures from normality than Bartlett's test. For this reason, it is often the test of choice.

As with Bartlett's test, the null hypothesis is always that the variances of all groups are equal. The alternative hypothesis is that at least one of the variances is different. Levene's test is always one-tailed, and uses an F statistic.

Levene's test comes in three flavors, depending on the measure of location used in the calculation of the statistic. The options are enumerated by the LeveneTestLocationMeasure enumeration:

Value        Description
Median       The median of the data is used as the location measure. This gives better results when the data is skewed.
Mean         The mean of the data is used as the location measure. This works best for symmetric data with moderate tails, such as normal data.
TrimmedMean  The 10% trimmed mean is used as the location measure. This gives better results when the data is heavy-tailed.

If no value is specified, the median is used.
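The role of the location measure is easiest to see in the textbook form of the statistic, sketched below. The notation is introduced here and is not part of the library: Y_ij is observation j in group i, \tilde{Y}_i is the chosen location measure (median, mean, or trimmed mean) of group i, k is the number of groups, n_i is the size of group i, and N is the total number of observations. It is an assumption that the library follows this form exactly.

```latex
Z_{ij} = \left| Y_{ij} - \tilde{Y}_i \right|,
\qquad
W = \frac{N-k}{k-1}\cdot
    \frac{\sum_{i=1}^{k} n_i\,\left(\bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot}\right)^2}
         {\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(Z_{ij} - \bar{Z}_{i\cdot}\right)^2}
```

Here \bar{Z}_{i\cdot} is the mean of the absolute deviations within group i and \bar{Z}_{\cdot\cdot} is their overall mean. In effect, W is a one-way ANOVA F statistic computed on the absolute deviations, and it is compared with an F distribution with k - 1 and N - k degrees of freedom.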

Levene's test is implemented by the LeveneTest class. It has three constructors. The first constructor takes no arguments. In this case, the data and conditions for the test must be specified by setting properties of the LeveneTest object and by calling the AddSample and AddSamples methods to supply the samples. The second constructor takes an array of NumericalVariable objects that contain the samples the test is to be applied to. The last constructor takes one additional parameter: a LeveneTestLocationMeasure value that specifies which measure of location to use in the calculation of the test statistic. This value can also be accessed and set through the LocationMeasure property.
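Choosing a location measure might look as follows. This is a sketch: the parameter order of the three-argument form (samples first, then the location measure) is an assumption based on the description above, and variables is assumed to be an array of NumericalVariable objects with one sample per group.

```csharp
// Assumed parameter order: samples first, then the location measure.
LeveneTest leveneMean =
    new LeveneTest(variables, LeveneTestLocationMeasure.Mean);

// Alternatively, set the measure afterwards through the LocationMeasure property.
LeveneTest leveneTrimmed = new LeveneTest(variables);
leveneTrimmed.LocationMeasure = LeveneTestLocationMeasure.TrimmedMean;

Console.WriteLine("Mean-based p-value:         {0:F4}", leveneMean.PValue);
Console.WriteLine("Trimmed-mean-based p-value: {0:F4}", leveneTrimmed.PValue);
```

Running the test with more than one location measure is a simple way to check how much the conclusion depends on that choice.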

Example

We start from the same data as before: a collection of measurements of gear diameters from 10 batches. We want to verify that the variances of the diameters for the batches are equal. See the example with Bartlett's test for an illustration of how to prepare the data.

Here, we show how to create the LeveneTest object, and run the test:

C#
LeveneTest levene = new LeveneTest(variables);
Console.WriteLine("Test statistic: {0:F4}", levene.Statistic);
Console.WriteLine("P-value:        {0:F4}", levene.PValue);
Console.WriteLine("Reject null hypothesis? {0}", 
    levene.Reject() ? "yes" : "no");
Visual Basic
Dim levene As LeveneTest = New LeveneTest(variables)
Console.WriteLine("Test statistic: {0:F4}", levene.Statistic)
Console.WriteLine("P-value:        {0:F4}", levene.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(levene.Reject(), "yes", "no"))

The value of the F statistic is 1.7059, giving a p-value of 0.0991. As a result, the hypothesis that the variances are equal is not rejected at the 0.05 level.

The outcome of Levene's test is clearly different from that of Bartlett's test for the same data. The reason is most likely that the data are not distributed normally. Bartlett's test cannot distinguish non-homogeneity from departure from normality.

Once a LeveneTest object has been created, you can access other properties and methods common to all hypothesis test classes. For instance, to obtain the critical values for significance levels of 0.05 and 0.1, the code would be:

C#
Console.WriteLine("Critical value: {0:F4} at 95%", 
    levene.GetUpperCriticalValue(0.05));
Console.WriteLine("Critical value: {0:F4} at 90%", 
    levene.GetUpperCriticalValue(0.1));
Visual Basic
Console.WriteLine("Critical value: {0:F4} at 95%", _
    levene.GetUpperCriticalValue(0.05))
Console.WriteLine("Critical value: {0:F4} at 90%", _
    levene.GetUpperCriticalValue(0.1))

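The critical values (1.9856 at 0.05 and 1.7021 at 0.1) show that the null hypothesis is not rejected at the 0.05 level. The test statistic of 1.7059 only just exceeds the critical value at 0.1, so the null hypothesis would be rejected at that level.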
