Testing Means
There are two common tests of the hypothesis that a sample comes
from a population with a specified mean. One test, the one sample
z test, is used when the standard deviation or the variance of the
population is known. The other, the one sample t test, is used when
the variance of the population is not known. The t test also has a
two sample version, which tests whether the difference between the
means of two samples is equal to a given value.
The One Sample z Test
The one sample z test is used to test the hypothesis
that a sample comes from a population with a specified mean when
the variance or standard deviation is known. The test is based on
the assumption that the sample is randomly selected from the
population, and that the population itself follows a normal
distribution. If either of these assumptions is violated, the
reliability of the z test may be compromised.
The null hypothesis is always that the population underlying the
sample has a mean that is equal to the proposed mean. The
alternative hypothesis depends on whether a one or two-tailed test
is performed.
For the one-tailed test, the alternative hypothesis is that the
population from which the sample was drawn has a mean that is
either less than (lower tailed) or greater than (upper tailed) the
proposed mean. For the two-tailed version, the alternative
hypothesis is that the mean of the population does not equal the
proposed mean.
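For reference, the statistic and p-values behind this test can be written in their usual textbook form. The notation is introduced here and is not taken from the library: \bar{x} is the sample mean, \mu_0 the proposed mean, \sigma the known population standard deviation, n the sample size, and \Phi the standard normal distribution function.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad
p_{\text{lower}} = \Phi(z), \qquad
p_{\text{upper}} = 1 - \Phi(z), \qquad
p_{\text{two-tailed}} = 2\,\bigl(1 - \Phi(|z|)\bigr)
```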
The one sample z test is implemented by the
OneSampleZTest
class. It has four constructors in all, which can be grouped in two
pairs.
The first two constructors take 4 or 5 arguments. The first two
arguments are the sample mean and the sample size. The next two
arguments are the population mean and the population standard
deviation. If present, the fifth argument is a HypothesisType
value that specifies whether the test is one or two-tailed. The
default value is HypothesisType.TwoTailed.
The second pair of constructors take 3 or 4 arguments. The first
argument is a NumericalVariable
that contains the sample data. The next two arguments are once
again the population mean and standard deviation. The fourth
argument, if present, is a HypothesisType
value that specifies whether the test is one or two-tailed. The
default value is HypothesisType.TwoTailed.
Example
The test scores of a class on a national test are as
follows:
62, 77, 61, 94, 75, 82, 86, 83, 64, 84, 68, 82, 72, 71, 85, 66,
61, 79, 81, 73.
We want to investigate whether the mean of this class is
significantly different from the national average, 79.3. The
population standard deviation is 7.3. The following code performs
the test:
C#
double[] group1Data = new double[]
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
NumericalVariable group1Results = new NumericalVariable("Class 1", group1Data);
// Two-tailed test: population mean 79.3, population standard deviation 7.3.
OneSampleZTest zTest = new OneSampleZTest(group1Results, 79.3, 7.3);
Console.WriteLine("Test statistic: {0:F4}", zTest.Statistic);
Console.WriteLine("P-value: {0:F4}", zTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    zTest.Reject() ? "yes" : "no");
Visual Basic
Dim group1Data As Double() = New Double() _
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84, _
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73}
Dim group1Results As NumericalVariable = _
    New NumericalVariable("Class 1", group1Data)
' Two-tailed test: population mean 79.3, population standard deviation 7.3.
Dim zTest As OneSampleZTest = New OneSampleZTest(group1Results, 79.3, 7.3)
Console.WriteLine("Test statistic: {0:F4}", zTest.Statistic)
Console.WriteLine("P-value: {0:F4}", zTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(zTest.Reject(), "yes", "no"))
The value of the z-statistic turns out to be -2.4505, giving a
p-value of 0.0143. As a result, the hypothesis that, on average, the
students in this class score no differently from the national
average is rejected at the 0.05 level.
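As a check on the reported statistic, the textbook formula for the z statistic can be evaluated by hand. The 20 scores in the code above have a sample mean of 75.3. The symbols are notation introduced here: \bar{x} is the sample mean, \mu_0 the proposed mean, \sigma the population standard deviation, and n the sample size.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
  = \frac{75.3 - 79.3}{7.3 / \sqrt{20}}
  \approx \frac{-4}{1.6323}
  \approx -2.4505
```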
Using pre-calculated values for the mean and sample size, the
above example would look like this:
C#
double mean = 75.3;
int sampleSize = 20;
OneSampleZTest zTest = new OneSampleZTest(mean, sampleSize, 79.3, 7.3);
Console.WriteLine("Test statistic: {0:F4}", zTest.Statistic);
Console.WriteLine("P-value: {0:F4}", zTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    zTest.Reject() ? "yes" : "no");
Visual Basic
Dim mean As Double = 75.3
Dim sampleSize As Integer = 20
Dim zTest As OneSampleZTest = New OneSampleZTest(mean, sampleSize, 79.3, 7.3)
Console.WriteLine("Test statistic: {0:F4}", zTest.Statistic)
Console.WriteLine("P-value: {0:F4}", zTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(zTest.Reject(), "yes", "no"))
Once a OneSampleZTest object has been created, you
can access other properties and methods common to all hypothesis
test classes. For instance, to obtain a 95% confidence interval
around the mean, the code would be:
C#
Interval meanInterval = zTest.GetConfidenceInterval();
Console.WriteLine("95% Confidence interval for the mean: {0:F1} - {1:F1}",
    meanInterval.LowerBound, meanInterval.UpperBound);
Visual Basic
Dim meanInterval As Interval = zTest.GetConfidenceInterval()
Console.WriteLine("95% Confidence interval for the mean: {0:F1} - {1:F1}", _
    meanInterval.LowerBound, meanInterval.UpperBound)
The confidence interval for the mean runs from 72.1 to 78.5 at the
95% confidence level.
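These bounds agree with the usual textbook interval for a mean when the population standard deviation is known. The sketch below assumes that form and uses the 97.5% quantile of the standard normal distribution, approximately 1.960. The source does not state how the library computes the interval.

```latex
\bar{x} \pm z_{0.975}\,\frac{\sigma}{\sqrt{n}}
  = 75.3 \pm 1.960 \times \frac{7.3}{\sqrt{20}}
  \approx 75.3 \pm 3.199
  \quad\Longrightarrow\quad [\,72.1,\ 78.5\,]
```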
The One Sample t Test
The one sample t test is used to test the hypothesis
that a sample comes from a population with a specified mean when
the variance or standard deviation is not known. The test
is based on the assumption that the sample is randomly selected
from the population, and that the population itself follows a
normal distribution. If either of these assumptions is violated,
the reliability of the t test may be compromised.
The null hypothesis is always that the population underlying the
sample has a mean that is equal to the proposed mean. The
alternative hypothesis depends on whether a one or two-tailed test
is performed.
For the one-tailed test, the alternative hypothesis is that the
population from which the sample was drawn has a mean that is
either less than (lower tailed) or greater than (upper tailed) the
proposed mean. For the two-tailed version, the alternative
hypothesis is that the mean of the population does not equal the
proposed mean.
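In its usual textbook form, the t statistic replaces the unknown population standard deviation with the sample standard deviation s and is compared against a Student t distribution with n - 1 degrees of freedom. The notation is introduced here and is not taken from the library: \bar{x} is the sample mean, \mu_0 the proposed mean, and x_1, ..., x_n the observations.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
\text{df} = n - 1
```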
The one sample t test is implemented by the
OneSampleTTest
class. It has five constructors in all. The first constructor takes
no arguments. The source data must be specified by setting
properties of the object.
The remaining four can be grouped in two pairs. The first two
constructors take 3 or 4 arguments. The first two arguments are the
sample mean and the sample size. The next argument is the
population mean. If present, the fourth argument is a
HypothesisType
value that specifies whether the test is one or two-tailed. The
default value is HypothesisType.TwoTailed.
The second pair of constructors take 2 or 3 arguments. The first
argument is a NumericalVariable
that contains the sample data. The next argument is once again the
population mean. The third argument, if present, is a
HypothesisType
value that specifies whether the test is one or two-tailed. The
default value is HypothesisType.TwoTailed.
Example
We use the same data as in the earlier example for the one
sample z test, but this time we assume the standard
deviation of the population is not known.
C#
double[] group1Data = new double[]
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
NumericalVariable group1Results = new NumericalVariable("Class 1", group1Data);
// Two-tailed test against a proposed mean of 79.3.
OneSampleTTest tTest = new OneSampleTTest(group1Results, 79.3);
Console.WriteLine("Test statistic: {0:F4}", tTest.Statistic);
Console.WriteLine("P-value: {0:F4}", tTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    tTest.Reject() ? "yes" : "no");
Visual Basic
Dim group1Data As Double() = New Double() _
    {62, 77, 61, 94, 75, 82, 86, 83, 64, 84, _
     68, 82, 72, 71, 85, 66, 61, 79, 81, 73}
Dim group1Results As NumericalVariable = _
    New NumericalVariable("Class 1", group1Data)
' Two-tailed test against a proposed mean of 79.3.
Dim tTest As OneSampleTTest = New OneSampleTTest(group1Results, 79.3)
Console.WriteLine("Test statistic: {0:F4}", tTest.Statistic)
Console.WriteLine("P-value: {0:F4}", tTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(tTest.Reject(), "yes", "no"))
The value of the t-statistic is -1.8800, giving a p-value
of 0.0755. As a result, the hypothesis that, on average, the
students in this class score no differently from the national
average is not rejected at the 0.05 level.
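The reported statistic can be reproduced by hand from the 20 scores in the code above, using the textbook formula for the one sample t statistic. The sample mean is 75.3 and the squared deviations from it sum to 1720.2. The symbols \bar{x}, \mu_0, s and n are notation introduced here for the sample mean, proposed mean, sample standard deviation and sample size.

```latex
s^2 = \frac{1720.2}{19} \approx 90.537, \qquad s \approx 9.515, \qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{75.3 - 79.3}{9.515 / \sqrt{20}}
  \approx \frac{-4}{2.1276}
  \approx -1.880
```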
The one-sample t test can also be performed using only the mean
and the size of the sample. The corresponding code for the above
example would look like this:
C#
double mean = 75.3;
int sampleSize = 20;
OneSampleTTest tTest = new OneSampleTTest(mean, sampleSize, 79.3);
Console.WriteLine("Test statistic: {0:F4}", tTest.Statistic);
Console.WriteLine("P-value: {0:F4}", tTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    tTest.Reject() ? "yes" : "no");
Visual Basic
Dim mean As Double = 75.3
Dim sampleSize As Integer = 20
Dim tTest As OneSampleTTest = New OneSampleTTest(mean, sampleSize, 79.3)
Console.WriteLine("Test statistic: {0:F4}", tTest.Statistic)
Console.WriteLine("P-value: {0:F4}", tTest.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(tTest.Reject(), "yes", "no"))
Once a OneSampleTTest object has been created, you
can access other properties and methods common to all hypothesis
test classes. For instance, to obtain a 95% confidence interval
around the mean, the code would be:
C#
Interval meanInterval = tTest.GetConfidenceInterval();
Console.WriteLine("95% Confidence interval for the mean: {0:F1} - {1:F1}",
    meanInterval.LowerBound, meanInterval.UpperBound);
Visual Basic
Dim meanInterval As Interval = tTest.GetConfidenceInterval()
Console.WriteLine("95% Confidence interval for the mean: {0:F1} - {1:F1}", _
    meanInterval.LowerBound, meanInterval.UpperBound)
Note that this interval (70.8-79.8) is wider than for the
one-sample z test. The reason is that the uncertainty in the
standard deviation of the population causes an increase in the
uncertainty in the mean.
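The wider interval can be reproduced with the usual textbook formula, which is assumed here because the source does not state how the library computes it. It uses the sample standard deviation of the 20 scores (about 9.515) and the 97.5% quantile of the Student t distribution with 19 degrees of freedom (about 2.093). The half-width of about 4.45 compares with about 3.20 for the z test on the same data.

```latex
\bar{x} \pm t_{0.975,\,19}\,\frac{s}{\sqrt{n}}
  = 75.3 \pm 2.093 \times \frac{9.515}{\sqrt{20}}
  \approx 75.3 \pm 4.453
  \quad\Longrightarrow\quad [\,70.8,\ 79.8\,]
```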
The Two Sample t Test
The two sample t test is used to test the hypothesis
that two samples are drawn from populations with the same mean.
The test is based on the assumption that the samples are randomly
selected from the populations, and that the populations themselves
follow a normal distribution. A third assumption states that the
variances of the populations underlying each of the samples are
equal. If any of these three assumptions is violated, the
reliability of the t test may be compromised.
The null hypothesis is always that the difference between the
means of the populations from which the samples were taken is equal
to a specific value, which may be zero. The alternative hypothesis
depends on whether a one or two-tailed test is performed.
For the one-tailed test, the alternative hypothesis is that the
difference between the means is less than (lower tailed) or greater
than (upper tailed) the proposed value. For the two-tailed version,
the alternative hypothesis is that the difference between the means
of the two populations does not equal the proposed value.
There is a further distinction between a paired and an unpaired
test. In an unpaired test, the two samples are independent from
each other. The populations represent two entirely independent
properties. In the paired test, the two samples represent two
properties of each subject from a single population. For example,
test scores for two different groups would require an unpaired
test. Test scores for a single group on two different tests would
require a paired test.
For example, two samples of the heart rate of two independent
groups of subjects are independent. The mean heart rates of the two
groups should be compared using the unpaired test. Two sets of
heart rate measurements of the same subjects, before and after some
physical activity, are dependent. The mean heart rates should be
compared using the paired test.
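In textbook terms, a paired test is a one sample t test applied to the differences within each pair. The notation below is introduced here and is not taken from the library: x_{1i} and x_{2i} are the two measurements on subject i, d_0 is the proposed difference, and n is the number of pairs.

```latex
d_i = x_{1i} - x_{2i}, \qquad
\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i, \qquad
s_d^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (d_i - \bar{d})^2, \qquad
t = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}, \qquad
\text{df} = n - 1
```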
Another distinction is whether the variances of the two samples
are assumed to be equal or not. Equal variances lead to simpler
formulas.
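The difference between the two cases shows up in the standard textbook formulas for the unpaired test, sketched below. The source does not state which formulas the library uses, so treat these as an illustration. The notation is introduced here: \bar{x}_k, s_k^2 and n_k are the mean, variance and size of sample k, and d_0 is the proposed difference. With equal variances, the two sample variances are pooled:

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}, \qquad
t = \frac{\bar{x}_1 - \bar{x}_2 - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad
\text{df} = n_1 + n_2 - 2
```

Without that assumption (Welch's test), each sample keeps its own variance and the degrees of freedom are approximated by the Welch-Satterthwaite formula:

```latex
t = \frac{\bar{x}_1 - \bar{x}_2 - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad
\text{df} \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}}
```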
The two sample t test is implemented by the
TwoSampleTTest
class. There are five constructors in all, reflecting the different
variations of the test.
The first constructor takes no arguments. All test parameters
must be provided by setting the properties of the
TwoSampleTTest object.
The first two arguments of each of the remaining four constructors
are NumericalVariable
objects that represent the samples the test is to be applied to.
The second constructor has only these two arguments. This creates an
unpaired test for equality of the means. The variances are
estimated from the sample data. The third constructor takes a
third parameter that specifies the proposed difference between the
two means. This value is positive if the mean of the first sample
is greater than the mean of the second sample. If omitted, the
difference is taken to be zero.
The fourth and fifth constructors are similar to the second and
third, but take two additional parameters. The first additional parameter
is a SamplePairing
value that specifies whether the test is paired or unpaired. A
value of SamplePairing.Paired produces a paired test.
A value of SamplePairing.Unpaired produces an unpaired
test.
The second additional parameter is a VarianceAssumption
value. It is only meaningful for unpaired tests. A value of
VarianceAssumption.AssumeEqual indicates that the
variance of the two samples should be assumed to be equal, which
results in somewhat simpler calculations.
Example of an unpaired test
Once again, we use the same data as before. However, this time
we compare the results of one group of students to the results of a
second group of students, with these test scores:
61, 80, 98, 90, 94, 65, 79, 75, 74, 86, 76, 85, 78, 72, 76, 79,
65, 92, 76, 80
The code below performs the unpaired two-sample t-test:
C#
double[] group2Data = new double[]
    {61, 80, 98, 90, 94, 65, 79, 75, 74, 86,
     76, 85, 78, 72, 76, 79, 65, 92, 76, 80};
NumericalVariable group2Results = new NumericalVariable("Class 2", group2Data);
// Unpaired test with no assumption about the variances.
TwoSampleTTest tTest2 = new TwoSampleTTest(group1Results, group2Results,
    SamplePairing.Unpaired, VarianceAssumption.None);
Console.WriteLine("Test statistic: {0:F4}", tTest2.Statistic);
Console.WriteLine("P-value: {0:F4}", tTest2.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    tTest2.Reject() ? "yes" : "no");
Visual Basic
Dim group2Data As Double() = _
    {61, 80, 98, 90, 94, 65, 79, 75, 74, 86, _
     76, 85, 78, 72, 76, 79, 65, 92, 76, 80}
Dim group2Results As NumericalVariable = _
    New NumericalVariable("Class 2", group2Data)
' Unpaired test with no assumption about the variances.
Dim tTest2 As TwoSampleTTest = New TwoSampleTTest(group1Results, group2Results, _
    SamplePairing.Unpaired, VarianceAssumption.None)
Console.WriteLine("Test statistic: {0:F4}", tTest2.Statistic)
Console.WriteLine("P-value: {0:F4}", tTest2.PValue)
Console.WriteLine("Reject null hypothesis? {0}", _
    IIf(tTest2.Reject(), "yes", "no"))
The value of the t-statistic is -1.4337, giving a p-value of
0.1598. As a result, the hypothesis that, on average, the students
in the first group score no differently from the students in the
second group is not rejected at the 0.05 level.
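A paired test can be requested through the same four-argument constructor by passing SamplePairing.Paired. The sketch below is illustrative only. It reuses the two score arrays from this page and treats them, hypothetically, as two tests taken by the same 20 students. The using directives are an assumption about the library's namespaces and may differ in your installed version. The class name and variable names are made up for this example. No output is shown because the result has not been verified here.

```csharp
using System;
// Assumed namespaces for the Extreme Optimization statistics classes;
// adjust them to match the version of the library you have installed.
using Extreme.Statistics;
using Extreme.Statistics.Tests;

class PairedTTestExample
{
    static void Main()
    {
        // Hypothetical reading of the data: element i of each array is
        // the score of the same student on two different tests.
        double[] test1Data =
            {62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
             68, 82, 72, 71, 85, 66, 61, 79, 81, 73};
        double[] test2Data =
            {61, 80, 98, 90, 94, 65, 79, 75, 74, 86,
             76, 85, 78, 72, 76, 79, 65, 92, 76, 80};
        NumericalVariable test1Results = new NumericalVariable("Test 1", test1Data);
        NumericalVariable test2Results = new NumericalVariable("Test 2", test2Data);

        // The variance assumption is only meaningful for unpaired tests,
        // so VarianceAssumption.None is passed as a placeholder.
        TwoSampleTTest pairedTest = new TwoSampleTTest(test1Results, test2Results,
            SamplePairing.Paired, VarianceAssumption.None);

        Console.WriteLine("Test statistic: {0:F4}", pairedTest.Statistic);
        Console.WriteLine("P-value: {0:F4}", pairedTest.PValue);
        Console.WriteLine("Reject null hypothesis? {0}",
            pairedTest.Reject() ? "yes" : "no");
    }
}
```

Both samples must have the same length and the same subject order for the pairing to be meaningful.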
Copyright 2004-2008,
Extreme Optimization. All rights reserved.
Extreme Optimization, Complexity made simple, M#, and M
Sharp are trademarks of ExoAnalytics Inc.
Microsoft, Visual C#, Visual Basic, Visual Studio, Visual
Studio.NET, and the Visual Studio Logo are registered trademarks of Microsoft Corporation