Extreme Optimization User's Guide

Testing Variances

The one sample chi-square test is used to test the hypothesis that the population from which a sample is drawn has a specified variance or standard deviation. The F test compares the variances of two samples. Other tests, such as Levene's test and Bartlett's test, are used to compare the variances of multiple samples. They are covered in their own section on testing equality of variances.

The One Sample Chi-Square Test

The one sample chi-square test is used to test the hypothesis that a sample comes from a population with a specified variance or standard deviation. The test is based on the assumption that the sample is randomly selected from the population, and that the population itself follows a normal distribution. If either of these assumptions is violated, the reliability of the chi-square test may be compromised.

The null hypothesis is always that the population underlying the sample has a variance that is equal to the proposed variance. The alternative hypothesis depends on whether a one or two-tailed test is performed.

For the one-tailed test, the alternative hypothesis is that the population from which the sample was drawn has a variance that is either less than (lower tailed) or greater than (upper tailed) the proposed variance. For the two-tailed version, the alternative hypothesis is that the variance of the population does not equal the proposed variance.
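
The hypotheses above are judged through the chi-square statistic (n - 1)s²/σ², where s² is the sample variance and σ² is the proposed variance, with n - 1 degrees of freedom. The following is a minimal sketch of that calculation in plain C#, independent of the library. The class name ChiSquareSketch, its methods and the five data values are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Linq;

public static class ChiSquareSketch
{
    // Unbiased sample variance: sum of squared deviations divided by (n - 1).
    public static double SampleVariance(double[] sample)
    {
        double mean = sample.Average();
        double sumOfSquares = sample.Sum(x => (x - mean) * (x - mean));
        return sumOfSquares / (sample.Length - 1);
    }

    // Chi-square statistic (n - 1) * s^2 / sigma^2 for a proposed variance.
    public static double Statistic(double[] sample, double proposedVariance)
    {
        return (sample.Length - 1) * SampleVariance(sample) / proposedVariance;
    }

    public static void Main()
    {
        double[] sample = { 4, 7, 6, 5, 8 };   // hypothetical data, mean 6
        double statistic = Statistic(sample, 2.0);
        Console.WriteLine("Degrees of freedom:   {0}", sample.Length - 1);
        Console.WriteLine("Chi-square statistic: {0:F4}", statistic);  // 5.0000
    }
}
```

A large statistic points to a population variance above the proposed value, a small one to a variance below it. Turning the statistic into a p-value requires the chi-square distribution function, which is what the library's test class provides.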

This test should not be confused with the chi-square goodness-of-fit test, which is used to verify that a sample comes from a specified distribution.

The one sample chi-square test is implemented by the OneSampleChiSquareTest class. It has five constructors. The first constructor takes no arguments. The data and conditions for the test must be specified by setting properties of the OneSampleChiSquareTest object.

The remaining four constructors can be divided into two pairs.

The second and third constructors take 2 or 3 arguments. The first argument is a NumericalVariable that specifies the sample. The second argument is the proposed variance. The third, optional argument is a HypothesisType value that specifies whether the test is one or two-tailed. The default value is HypothesisType.TwoTailed.

The last two constructors take 3 or 4 arguments. The first argument is an integer that specifies the size of the sample. The second argument specifies the variance of the sample. The third argument specifies the variance to test against. The optional fourth argument is a HypothesisType value that specifies whether the test is one- or two-tailed. The default value is HypothesisType.TwoTailed.

Example

The test scores of a class on a national test are as follows:

62, 77, 61, 94, 75, 82, 86, 83, 64, 84, 68, 82, 72, 71, 85, 66, 61, 79, 81, 73.

We want to investigate if the variance of the scores is greater than 5. The following code performs the test:

C#
double[] group1Data = new double[]
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84, 
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
NumericalVariable group1Results = new NumericalVariable("Class 1", group1Data);
// Create the test with the parameterless constructor, then set its properties.
OneSampleChiSquareTest chiSquareTest = new OneSampleChiSquareTest();
chiSquareTest.Sample = group1Results;
chiSquareTest.Variance = 5;
Console.WriteLine("Test statistic: {0:F4}", chiSquareTest.Statistic);
Console.WriteLine("P-value:        {0:F4}", chiSquareTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}", 
    chiSquareTest.Reject() ? "yes" : "no");
Visual Basic
Dim group1Data As Double() = New Double() _
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84, _
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73}
Dim group1Results As NumericalVariable = _
    New NumericalVariable("Class 1", group1Data)
' Create the test with the parameterless constructor, then set its properties.
Dim chiSquareTest As OneSampleChiSquareTest = _
    New OneSampleChiSquareTest()
chiSquareTest.Sample = group1Results
chiSquareTest.Variance = 5
Console.WriteLine("Test statistic: {0:F4}", chiSquareTest.Statistic)
Console.WriteLine("P-value:        {0:F4}", chiSquareTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(chiSquareTest.Reject(), "yes", "no"))

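The chi-square statistic is (n - 1)s²/σ², where s² is the sample variance and σ² is the proposed variance. For the scores in the code above, the sum of squared deviations from the mean is 1720.2, so the statistic is 1720.2 / 5 = 344.04 with 19 degrees of freedom, and the p-value is effectively zero. As a result, the hypothesis that the variance of the scores equals 5 is rejected at the 0.05 level.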

Using pre-calculated values for the sample size and the sample variance, the above example would look like this:

C#
int sampleSize = 20;
// Sample variance of the 20 scores in the previous listing (1720.2 / 19).
double sampleVariance = 90.5368;
// Arguments: sample size, sample variance, variance to test against.
OneSampleChiSquareTest chiSquareTest = new OneSampleChiSquareTest(sampleSize, sampleVariance, 5);
Console.WriteLine("Test statistic: {0:F4}", chiSquareTest.Statistic);
Console.WriteLine("P-value:        {0:F4}", chiSquareTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}", 
    chiSquareTest.Reject() ? "yes" : "no");
Visual Basic
Dim sampleSize As Integer = 20
' Sample variance of the 20 scores in the previous listing (1720.2 / 19).
Dim sampleVariance As Double = 90.5368
' Arguments: sample size, sample variance, variance to test against.
Dim chiSquareTest As OneSampleChiSquareTest = _
    New OneSampleChiSquareTest(sampleSize, sampleVariance, 5)
Console.WriteLine("Test statistic: {0:F4}", chiSquareTest.Statistic)
Console.WriteLine("P-value:        {0:F4}", chiSquareTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(chiSquareTest.Reject(), "yes", "no"))

Once a OneSampleChiSquareTest object has been created, you can access other properties and methods common to all hypothesis test classes. For instance, to obtain a 95% confidence interval around the variance, the code would be:

C#
Interval varianceInterval = chiSquareTest.GetConfidenceInterval();
Console.WriteLine("95% Confidence interval for the variance: {0:F1} - {1:F1}", 
    varianceInterval.LowerBound, varianceInterval.UpperBound);
Visual Basic
Dim varianceInterval As Interval = chiSquareTest.GetConfidenceInterval()
Console.WriteLine("95% Confidence interval for the variance: {0:F1} - {1:F1}", _
    varianceInterval.LowerBound, varianceInterval.UpperBound)

For the scores above, the standard chi-square interval for the variance, (n - 1)s²/χ² evaluated at the 97.5% and 2.5% quantiles with 19 degrees of freedom, runs from about 52.4 to 193.1 at the 95% confidence level.
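
For a test about a variance, the natural confidence interval is one for the variance itself. As a rough cross-check that does not call the library, the sketch below computes the textbook interval from (n - 1)s²/χ²(0.975) to (n - 1)s²/χ²(0.025) for the 20 scores used in the listings above. The class VarianceIntervalSketch is hypothetical, and the two quantiles are hard-coded table values for 19 degrees of freedom, so the sketch only applies to samples of size 20.

```csharp
using System;
using System.Linq;

public static class VarianceIntervalSketch
{
    // Tabulated chi-square quantiles for 19 degrees of freedom (assumed table values).
    public const double Quantile975 = 32.852;
    public const double Quantile025 = 8.907;

    // Sum of squared deviations from the sample mean, i.e. (n - 1) * s^2.
    public static double SumOfSquaredDeviations(double[] sample)
    {
        double mean = sample.Average();
        return sample.Sum(x => (x - mean) * (x - mean));
    }

    // 95% confidence interval for the population variance of a sample of size 20.
    public static (double Lower, double Upper) Interval95(double[] sample)
    {
        if (sample.Length != 20)
            throw new ArgumentException("The hard-coded quantiles require 20 observations.");
        double ss = SumOfSquaredDeviations(sample);
        return (ss / Quantile975, ss / Quantile025);
    }

    public static void Main()
    {
        double[] group1Data =
            {62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
             68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
        var interval = Interval95(group1Data);
        Console.WriteLine("95% interval for the variance: {0:F1} - {1:F1}",
            interval.Lower, interval.Upper);
    }
}
```

The interval is not symmetric around the sample variance because the chi-square distribution is skewed. Whether the library's GetConfidenceInterval method reports exactly this interval is not confirmed here.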

The F-Test

The F-test is a two sample test that is used to test the hypothesis that the variances of two populations are equal. The test is based on the assumption that both samples are randomly selected from their populations, and that the populations themselves follow normal distributions. If either of these assumptions is violated, the reliability of the F test may be compromised.

The null hypothesis is always that the two populations underlying the samples have equal variances. The alternative hypothesis depends on whether a one- or two-tailed test is performed.

For the one-tailed test, the alternative hypothesis is that the population from which the first sample was drawn has a variance that is either less than (lower tailed) or greater than (upper tailed) the variance of the population from which the second sample was drawn. For the two-tailed version, the alternative hypothesis is that the variances of the two populations are not equal.

The F test was named in honor of Sir Ronald Fisher, who created the foundations for much of modern statistical analysis.

The F test is implemented by the FTest class. It has five constructors.

The first constructor takes no arguments. All test parameters must be provided by setting the properties of the FTest object.

The remaining four constructors can be divided into two pairs. The first pair has 2 or 3 arguments. The first two arguments are NumericalVariable objects that represent the samples the test is to be applied to. The first constructor of the pair takes only these two arguments and creates a two-tailed test for equality of variances. The second constructor of the pair takes a third argument: a HypothesisType value that specifies whether the test is one- or two-tailed. One-tailed F tests are very common.

The second pair of constructors take 4 or 5 arguments. The first four arguments are, in order, the degrees of freedom and variance of the numerator, and the degrees of freedom and variance of the denominator. The fifth parameter, if present, is once again a HypothesisType value that specifies whether the test is one or two-tailed.

Example

Once again, we use the same data as before. However, this time we compare the results of one group of students to the results of a second group of students, with these test scores:

61, 80, 98, 90, 94, 65, 79, 75, 74, 86, 76, 85, 78, 72, 76, 79, 65, 92, 76, 80

We want to test if the variances of the two populations are equal. The code below performs this test:

C#
double[] group2Data = new double[]
    {61, 80, 98, 90, 94, 65, 79, 75, 74, 86, 
     76, 85, 78, 72, 76, 79, 65, 92, 76, 80};
NumericalVariable group2Results = new NumericalVariable("Class 2", group2Data);
// group1Results is the NumericalVariable holding the first group's scores.
FTest fTest = new FTest(group1Results, group2Results);
Console.WriteLine("Test statistic: {0:F4}", fTest.Statistic);
Console.WriteLine("P-value:        {0:F4}", fTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    fTest.Reject() ? "yes" : "no");
Visual Basic
Dim group2Data As Double() = _
    {61, 80, 98, 90, 94, 65, 79, 75, 74, 86, _
     76, 85, 78, 72, 76, 79, 65, 92, 76, 80}
Dim group2Results As NumericalVariable = _
    New NumericalVariable("Class 2", group2Data)
' group1Results is the NumericalVariable holding the first group's scores.
Dim fTest As FTest = New FTest(group1Results, group2Results)
Console.WriteLine("Test statistic: {0:F4}", fTest.Statistic)
Console.WriteLine("P-value:        {0:F4}", fTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(fTest.Reject(), "yes", "no"))

The value of the F-statistic is 0.9573 giving a p-value of 0.5374. As a result, the hypothesis that the variance of the scores of students from the first group is no different than that of the second group is not rejected at the 0.05 level.
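
The F statistic reported above is the ratio of the two sample variances, with the first group in the numerator. The sketch below reproduces it in plain C# from the two score arrays in the listings, without calling the library. The class name FStatisticSketch and its methods are hypothetical. The same quantities (19 degrees of freedom and the sample variance of each group) are what the constructors that take summary values expect.

```csharp
using System;
using System.Linq;

public static class FStatisticSketch
{
    // Unbiased sample variance: sum of squared deviations divided by (n - 1).
    public static double SampleVariance(double[] sample)
    {
        double mean = sample.Average();
        return sample.Sum(x => (x - mean) * (x - mean)) / (sample.Length - 1);
    }

    // F statistic: variance of the numerator sample over variance of the denominator sample.
    public static double Statistic(double[] numerator, double[] denominator)
    {
        return SampleVariance(numerator) / SampleVariance(denominator);
    }

    public static void Main()
    {
        double[] group1Data =
            {62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
             68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
        double[] group2Data =
            {61, 80, 98, 90, 94, 65, 79, 75, 74, 86,
             76, 85, 78, 72, 76, 79, 65, 92, 76, 80};

        Console.WriteLine("Numerator:   df = {0}, variance = {1:F4}",
            group1Data.Length - 1, SampleVariance(group1Data));
        Console.WriteLine("Denominator: df = {0}, variance = {1:F4}",
            group2Data.Length - 1, SampleVariance(group2Data));
        Console.WriteLine("F statistic: {0:F4}",
            Statistic(group1Data, group2Data));  // 0.9573
    }
}
```

A ratio close to 1, as here, means the two sample variances are nearly the same, which is consistent with the null hypothesis not being rejected.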
