It is often necessary to verify whether the distribution of a variable
fits a certain theoretical distribution.
Goodness-of-fit tests can be used to perform this verification.
Goodness-of-fit tests require the full set of sample values.
They cannot be performed using summary statistics alone.
The Chi-Square Test for Goodness-of-Fit
The chi-square goodness-of-fit test compares observed cell frequencies from a sample
with the cell frequencies expected from the proposed underlying distribution.
The test is based on the assumption that the variable is categorical in nature.
When the variable is continuous, the chi-square test cannot be used directly.
It is possible to group the data into cells and use the categorized data in the test.
The outcome of the test depends on how the continuous data is grouped,
so it may not be as reliable.
Two other assumptions are made, namely that the sample is randomly selected
from the population, and that the expected frequency of each cell is large enough.
A common lower limit is 5. If either of these assumptions is violated, the reliability of the
chi-square test may be compromised.
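The expected-frequency assumption can be checked before running the test. The following is a minimal sketch in plain C#, independent of the library: the `ExpectedCountCheck` class, the `SmallCells` helper and its `limit` parameter are hypothetical names, and the expected counts are illustrative values.

```csharp
using System;
using System.Collections.Generic;

static class ExpectedCountCheck
{
    // Returns the indices of cells whose expected frequency is below the limit.
    public static List<int> SmallCells(double[] expected, double limit = 5.0)
    {
        var small = new List<int>();
        for (int i = 0; i < expected.Length; i++)
        {
            if (expected[i] < limit) small.Add(i);
        }
        return small;
    }

    static void Main()
    {
        // Illustrative expected counts for four cells (assumed values).
        double[] expected = { 57.9, 34.7, 6.9, 0.5 };
        foreach (int i in SmallCells(expected))
        {
            Console.WriteLine($"Cell {i}: expected count {expected[i]} is below 5.");
        }
    }
}
```

A common remedy when a cell is flagged is to merge it with a neighboring cell before running the test.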
The test statistic is calculated from the differences between the observed
and expected cell frequencies. The distribution of the statistic is approximated
by the chi-square distribution. The approximation improves as
the expected cell frequencies increase.
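For reference, the standard (Pearson) form of the statistic is shown here. The symbols are notation introduced for this illustration and do not come from the library: $O_i$ is the observed frequency and $E_i$ the expected frequency of cell $i$, and $k$ is the number of cells.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

When the expected frequencies are fully specified in advance, the statistic is compared against a chi-square distribution with $k - 1$ degrees of freedom.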
The null hypothesis is that the observed cell frequencies are equal to
the expected frequencies. The alternative hypothesis is
that at least one cell frequency is different from its expected value.
This test should not be confused with the chi-square test
for the variance of a distribution.
The chi-square goodness-of-fit test is implemented by the
ChiSquareGoodnessOfFitTest class.
Example 1 - Fitting a Discrete Distribution
In a gambling game, the payout is directly proportional to
the number of sixes that are thrown. A very successful
customer has the following results:
| # sixes | # throws |
|---|---|
| 0 | 51 |
| 1 | 35 |
| 2 | 12 |
| 3 | 2 |
The casino management suspects that the customer may be using weighted dice.
The significance level for this test is 0.01.
The number of sixes thrown follows a binomial distribution
with n = 3 and p = 1/6. The expected frequencies can be
calculated using the GetExpectedHistogram method of
the BinomialDistribution class. We then compare
them with the observed frequencies:
var sixesDistribution = new BinomialDistribution(3, 1 / 6.0);
var expected = sixesDistribution.GetExpectedHistogram(100);
var actual = Vector.Create<double>(51, 35, 12, 2);
var chiSquare = new ChiSquareGoodnessOfFitTest(actual, expected);
chiSquare.SignificanceLevel = 0.01;
Console.WriteLine("Test statistic: {0:F4}", chiSquare.Statistic);
Console.WriteLine("P-value: {0:F4}", chiSquare.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
chiSquare.Reject() ? "yes" : "no");
Dim sixesDistribution = New BinomialDistribution(3, 1 / 6.0)
Dim expected = sixesDistribution.GetExpectedHistogram(100)
Dim actual = Vector.Create(Of Double)(51, 35, 12, 2)
Dim chiSquare = New ChiSquareGoodnessOfFitTest(actual, expected)
chiSquare.SignificanceLevel = 0.01
Console.WriteLine("Test statistic: {0:F4}", chiSquare.Statistic)
Console.WriteLine("P-value: {0:F4}", chiSquare.PValue)
Console.WriteLine("Reject null hypothesis? {0}",
IIf(chiSquare.Reject(), "yes", "no"))
let sixesDistribution = BinomialDistribution(3, 1.0 / 6.0)
let expected = sixesDistribution.GetExpectedHistogram(100.0)
let actual = Vector.Create(51., 35., 12., 2.)
let chiSquare = ChiSquareGoodnessOfFitTest(actual, expected)
chiSquare.SignificanceLevel <- 0.01
printfn "Test statistic: %.4f" chiSquare.Statistic
printfn "P-value: %.4f" chiSquare.PValue
printfn "Reject null hypothesis? %s" (if chiSquare.Reject() then "yes" else "no")
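The value of the chi-square statistic is 9.6013, giving a p-value of 0.0223. Because the p-value is greater than
the significance level of 0.01, the null hypothesis that the dice are fair cannot be rejected at the 0.01 level.
The results do not provide sufficient evidence that the dice are weighted.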
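The reported statistic can be reproduced by hand. The sketch below uses plain C# rather than the library: the `DiceChiSquare` class and its helper methods are hypothetical names. The expected counts come from the binomial probabilities with n = 3 and p = 1/6, and the value 11.345 is the upper 1% critical value of the chi-square distribution with 3 degrees of freedom.

```csharp
using System;
using System.Globalization;

static class DiceChiSquare
{
    // Binomial probability of k successes in n trials with success probability p.
    public static double BinomialPmf(int n, int k, double p)
    {
        double coeff = 1.0;
        for (int i = 1; i <= k; i++)
        {
            coeff *= (n - k + i) / (double)i;
        }
        return coeff * Math.Pow(p, k) * Math.Pow(1 - p, n - k);
    }

    // Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    public static double Statistic(double[] observed, double[] expected)
    {
        double sum = 0.0;
        for (int i = 0; i < observed.Length; i++)
        {
            double d = observed[i] - expected[i];
            sum += d * d / expected[i];
        }
        return sum;
    }

    static void Main()
    {
        double[] observed = { 51, 35, 12, 2 };
        double total = 100;
        var expected = new double[4];
        for (int k = 0; k <= 3; k++)
        {
            expected[k] = total * BinomialPmf(3, k, 1.0 / 6.0);
        }

        double statistic = Statistic(observed, expected);
        const double critical = 11.345; // chi-square, 3 degrees of freedom, alpha = 0.01

        Console.WriteLine(statistic.ToString("F4", CultureInfo.InvariantCulture)); // prints 9.6013
        Console.WriteLine(statistic > critical ? "reject" : "do not reject");      // prints do not reject
    }
}
```

Note that the expected count for three sixes is only about 0.46, well below the common lower limit of 5, so the chi-square approximation should be treated with some caution for this data.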
The One Sample Kolmogorov-Smirnov Test
The one sample Kolmogorov-Smirnov test (KS test) is a one sample test
that is used to test the hypothesis that a given sample was taken
from a proposed continuous distribution. The test statistic is based
on a comparison of the empirical distribution of the sample to the proposed distribution.
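In its standard form, the statistic is the largest vertical distance between the two distribution functions. The notation is introduced here for illustration: $F_n$ is the empirical distribution function of a sample of size $n$, and $F$ is the cumulative distribution function of the proposed distribution.

```latex
D_n = \sup_x \left| F_n(x) - F(x) \right|
```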
One of the advantages of the KS test is that it can be applied to
any continuous distribution. On the other hand, it can't be applied
to discrete distributions, and is more sensitive near the center
of the distribution than at the tails.
The biggest drawback is that the distribution must be completely specified.
If one or more of the distribution's parameters is estimated,
the distribution of the test statistic is different from the Kolmogorov-Smirnov
distribution.
The null hypothesis is always that the population underlying the sample
has the proposed distribution. The alternative hypothesis is
that the population does not have the proposed distribution.
There is also a two sample Kolmogorov-Smirnov test,
which is used to test whether two samples were taken from the same,
unknown distribution.
The one sample Kolmogorov-Smirnov test is implemented by the
OneSampleKolmogorovSmirnovTest
class. It has three constructors. The first constructor takes no arguments;
all test parameters must then be specified through properties of the test object.
The second constructor takes two arguments.
The first is a Vector<T>
object that specifies the sample. The second is a
Func<T, TResult> delegate
that specifies the cumulative distribution function of the distribution being tested.
The third constructor also takes two arguments. The first argument is once again
a vector. The second argument must be of a type derived from
ContinuousDistribution.
In this example, we draw a sample from a lognormal distribution and test whether it could have come from a
similar-looking Weibull distribution.
var weibull = new WeibullDistribution(2, 1);
var logNormal = new LognormalDistribution(0, 1);
var logNormalSample = logNormal.Sample(25);
var ksTest = new OneSampleKolmogorovSmirnovTest(logNormalSample, weibull);
Console.WriteLine("Test statistic: {0:F4}", ksTest.Statistic);
Console.WriteLine("P-value: {0:F4}", ksTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
ksTest.Reject() ? "yes" : "no");
Dim weibull = New WeibullDistribution(2, 1)
Dim logNormal = New LognormalDistribution(0, 1)
Dim logNormalSample = logNormal.Sample(25)
Dim ksTest = New OneSampleKolmogorovSmirnovTest(logNormalSample, weibull)
Console.WriteLine("Test statistic: {0:F4}", ksTest.Statistic)
Console.WriteLine("P-value: {0:F4}", ksTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}",
IIf(ksTest.Reject(), "yes", "no"))
let weibull = WeibullDistribution(2.0, 1.0)
let logNormal = LognormalDistribution(0.0, 1.0)
let logNormalSample = logNormal.Sample(25)
let ksTest = OneSampleKolmogorovSmirnovTest(logNormalSample, weibull)
printfn "Test statistic: %.4f" ksTest.Statistic
printfn "P-value: %.4f" ksTest.PValue
printfn "Reject null hypothesis? %s" (if ksTest.Reject() then "yes" else "no")
First we create a Weibull and a lognormal distribution.
We then draw 25 random values from the lognormal distribution using its
Sample(Int32) method.
Because the sample is random, the results of the test differ on each run.
The p-value typically falls anywhere between 0.03 and 0.3, so the null hypothesis
is often not rejected. This suggests that 25 sample points are generally not enough
to reliably distinguish a lognormal distribution from a Weibull distribution.
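To make the test statistic concrete, the following sketch computes the one-sample Kolmogorov-Smirnov statistic by hand in plain C#. It is an illustration, not the library's implementation: the `KsByHand` class is a hypothetical name, the sample values are made up, and the CDF assumes that WeibullDistribution(2, 1) denotes shape 2 and scale 1.

```csharp
using System;
using System.Linq;

static class KsByHand
{
    // Largest distance between the empirical CDF of the sample and the given CDF.
    public static double Statistic(double[] sample, Func<double, double> cdf)
    {
        double[] sorted = sample.OrderBy(x => x).ToArray();
        int n = sorted.Length;
        double d = 0.0;
        for (int i = 0; i < n; i++)
        {
            double f = cdf(sorted[i]);
            // The empirical CDF jumps from i/n to (i+1)/n at the i-th sorted value,
            // so the distance is checked on both sides of the jump.
            double above = (i + 1) / (double)n - f;
            double below = f - i / (double)n;
            d = Math.Max(d, Math.Max(above, below));
        }
        return d;
    }

    static void Main()
    {
        // Weibull CDF with shape 2 and scale 1 (assumed parameterization): F(x) = 1 - exp(-x^2).
        Func<double, double> weibullCdf = x => x <= 0 ? 0.0 : 1.0 - Math.Exp(-x * x);

        // Made-up sample values, for illustration only.
        double[] sample = { 0.3, 0.6, 0.9, 1.2, 2.5 };
        Console.WriteLine(Statistic(sample, weibullCdf));
    }
}
```

Passing the CDF as a delegate mirrors the two-argument form described above, where the proposed distribution is supplied as a function rather than as a distribution object.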
The Two Sample Kolmogorov-Smirnov Test
The two sample Kolmogorov-Smirnov test is used to test the hypothesis
that two samples come from a population with the same, unknown distribution.
The null hypothesis is always that the two samples come
from the same underlying distribution. The alternative
hypothesis is always that the two samples come from different distributions.
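The two-sample statistic is the largest distance between the empirical distribution functions of the two samples. The sketch below computes it by hand in plain C#; the `TwoSampleKs` class and its methods are hypothetical names and the sample values are made up, so it illustrates the idea rather than the library's implementation.

```csharp
using System;
using System.Linq;

static class TwoSampleKs
{
    // Fraction of values in the sorted array that are less than or equal to x.
    static double Ecdf(double[] sorted, double x)
    {
        int count = 0;
        while (count < sorted.Length && sorted[count] <= x) count++;
        return count / (double)sorted.Length;
    }

    // Largest absolute difference between the two empirical CDFs.
    // Both are step functions, so it suffices to check the observed values.
    public static double Statistic(double[] a, double[] b)
    {
        double[] sa = a.OrderBy(x => x).ToArray();
        double[] sb = b.OrderBy(x => x).ToArray();
        double d = 0.0;
        foreach (double x in sa.Concat(sb))
        {
            d = Math.Max(d, Math.Abs(Ecdf(sa, x) - Ecdf(sb, x)));
        }
        return d;
    }

    static void Main()
    {
        // Made-up samples, for illustration only.
        double[] first = { 1.0, 2.0, 3.0, 4.0 };
        double[] second = { 3.5, 4.5, 5.5, 6.5 };
        // The statistic is 0.75 for these samples (reached at x = 3 and x = 4).
        Console.WriteLine(Statistic(first, second));
    }
}
```

This version rescans the arrays for every point, which is fine for small samples; a merge-style pass over the two sorted arrays would scale better.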
The two sample Kolmogorov-Smirnov test is implemented by the
TwoSampleKolmogorovSmirnovTest
class. It has two constructors. The first constructor takes no arguments.
The second constructor takes two arguments. Both are vectors
that specify the two samples that are being compared.
We investigate whether we can distinguish a sample taken from
a lognormal distribution from a sample taken from a similar looking
Weibull distribution. We use the lognormal samples we created in the previous section.
var weibullSample = weibull.Sample(25);
var ksTest2 = new TwoSampleKolmogorovSmirnovTest(logNormalSample, weibullSample);
Console.WriteLine("Test statistic: {0:F4}", ksTest2.Statistic);
Console.WriteLine("P-value: {0:F4}", ksTest2.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
ksTest2.Reject() ? "yes" : "no");
Dim weibullSample = weibull.Sample(25)
Dim ksTest2 = New TwoSampleKolmogorovSmirnovTest(logNormalSample, weibullSample)
Console.WriteLine("Test statistic: {0:F4}", ksTest2.Statistic)
Console.WriteLine("P-value: {0:F4}", ksTest2.PValue)
Console.WriteLine("Reject null hypothesis? {0}",
IIf(ksTest2.Reject(), "yes", "no"))
let weibullSample = weibull.Sample(25)
let ksTest2 = TwoSampleKolmogorovSmirnovTest(logNormalSample, weibullSample)
printfn "Test statistic: %.4f" ksTest2.Statistic
printfn "P-value: %.4f" ksTest2.PValue
printfn "Reject null hypothesis? %s" (if ksTest2.Reject() then "yes" else "no")
The Anderson-Darling Test for Normality
The Anderson-Darling test is a one sample test of normality.
It is a variation of the Kolmogorov-Smirnov test that assigns more weight
to the tails of the distribution. Unlike the Kolmogorov-Smirnov test,
the distribution of the test statistic depends on the distribution being tested.
The parameters of the distribution are estimated from the sample.
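For reference, the usual textbook form of the Anderson-Darling statistic is given below. The notation is introduced here for illustration: $x_{(1)} \le \dots \le x_{(n)}$ are the sorted sample values and $F$ is the cumulative distribution function of the normal distribution being tested. The library may apply further adjustments, for example for sample size, that are not shown here.

```latex
A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)
      \left[ \ln F\!\left(x_{(i)}\right) + \ln\!\left(1 - F\!\left(x_{(n+1-i)}\right)\right) \right]
```

The logarithms grow large in magnitude where $F$ is close to 0 or 1, which is how the statistic gives extra weight to the tails.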
The null hypothesis is always that the population underlying the sample
follows a normal distribution. The alternative hypothesis is always
that the underlying population does not follow a normal distribution.
The Anderson-Darling test is implemented by the
AndersonDarlingTest
class. It has three constructors. The first constructor has no arguments.
The second constructor has one argument: a vector that specifies the sample to be tested.
The third constructor takes three arguments. The first is once again a vector
that specifies the sample. The second and third arguments are the mean and
standard deviation of the normal distribution being tested.
If no values are provided, the values are estimated from the sample.
We investigate the strength of polished airplane windows. We want to verify
that the measured strengths follow a normal distribution.
We have a total of 31 samples.
var strength = Vector.Create(
18.830, 20.800, 21.657, 23.030, 23.230, 24.050,
24.321, 25.500, 25.520, 25.800, 26.690, 26.770,
26.780, 27.050, 27.670, 29.900, 31.110, 33.200,
33.730, 33.760, 33.890, 34.760, 35.750, 35.910,
36.980, 37.080, 37.090, 39.580, 44.045, 45.290,
45.381);
var adTest = new AndersonDarlingTest(strength, 30.81, 7.38);
Console.WriteLine("Test statistic: {0:F4}", adTest.Statistic);
Console.WriteLine("P-value: {0:F4}", adTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
adTest.Reject() ? "yes" : "no");
Dim strength = Vector.Create(
18.83, 20.8, 21.657, 23.03, 23.23, 24.05,
24.321, 25.5, 25.52, 25.8, 26.69, 26.77,
26.78, 27.05, 27.67, 29.9, 31.11, 33.2,
33.73, 33.76, 33.89, 34.76, 35.75, 35.91,
36.98, 37.08, 37.09, 39.58, 44.045, 45.29,
45.381)
Dim adTest = New AndersonDarlingTest(strength, 30.81, 7.38)
Console.WriteLine("Test statistic: {0:F4}", adTest.Statistic)
Console.WriteLine("P-value: {0:F4}", adTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}",
IIf(adTest.Reject(), "yes", "no"))
let strength = Vector.Create(
    18.830, 20.800, 21.657, 23.030, 23.230, 24.050,
    24.321, 25.500, 25.520, 25.800, 26.690, 26.770,
    26.780, 27.050, 27.670, 29.900, 31.110, 33.200,
    33.730, 33.760, 33.890, 34.760, 35.750, 35.910,
    36.980, 37.080, 37.090, 39.580, 44.045, 45.290,
    45.381)
let adTest = AndersonDarlingTest(strength, 30.81, 7.38)
printfn "Test statistic: %.4f" adTest.Statistic
printfn "P-value: %.4f" adTest.PValue
printfn "Reject null hypothesis? %s" (if adTest.Reject() then "yes" else "no")
The value of the Anderson-Darling statistic is 0.5128, corresponding to a p-value of 0.1795. The p-value is well
above common significance levels such as 0.05, so we cannot reject the hypothesis that the
window strengths follow a normal distribution.
The Shapiro-Wilk Test for Normality
The Shapiro-Wilk test is a one sample test of normality.
The parameters of the distribution are estimated from the sample.
The Shapiro-Wilk test is generally considered more reliable than
the Anderson-Darling or Kolmogorov-Smirnov test.
It is valid for sample sizes between 3 and 5000.
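For reference, the Shapiro-Wilk statistic has the following standard form. The notation is introduced here for illustration: $x_{(i)}$ are the sorted sample values, $\bar{x}$ is the sample mean, and the coefficients $a_i$ are derived from the expected values and covariances of the order statistics of a standard normal sample.

```latex
W = \frac{\left( \sum_{i=1}^{n} a_i \, x_{(i)} \right)^2}
         {\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}
```

Values of $W$ close to 1 are consistent with normality, while small values indicate a departure from it.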
The null hypothesis is always that the population underlying the sample
follows a normal distribution. The alternative hypothesis is always
that the underlying population does not follow a normal distribution.
The Shapiro-Wilk test is implemented by the
ShapiroWilkTest
class. It has two constructors. The first constructor has no arguments.
The second constructor has one argument: a vector that specifies the sample to be tested.
As above, we investigate the strength of polished airplane windows.
We want to verify that the measured strengths follow a normal distribution.
We have a total of 31 samples.
var swTest = new ShapiroWilkTest(strength);
Console.WriteLine("Test statistic: {0:F4}", swTest.Statistic);
Console.WriteLine("P-value: {0:F4}", swTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
swTest.Reject() ? "yes" : "no");
Dim swTest = New ShapiroWilkTest(strength)
Console.WriteLine("Test statistic: {0:F4}", swTest.Statistic)
Console.WriteLine("P-value: {0:F4}", swTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}",
IIf(swTest.Reject(), "yes", "no"))
let swTest = ShapiroWilkTest(strength)
printfn "Test statistic: %.4f" swTest.Statistic
printfn "P-value: %.4f" swTest.PValue
printfn "Reject null hypothesis? %s" (if swTest.Reject() then "yes" else "no")
The value of the Shapiro-Wilk statistic is 0.9511, corresponding to
a p-value of 0.1675. The p-value is well above common significance levels
such as 0.05, so we cannot reject the hypothesis that the window strengths
follow a normal distribution.