An ANOVA design with two factors is called a two-way analysis of variance.
The two-way analysis of variance is implemented by the
TwoWayAnovaModel class.
Constructing Two-Way ANOVA Models
The TwoWayAnovaModel
class has two constructors. The first constructor takes three arguments:
a Vector<T>
that specifies the dependent variable, and two
CategoricalVector<T>
objects that specify the independent variables.
All three variables must have the same number of observations.
As an example, we construct an ANOVA model to measure the effect
of package color and shape on sales. Our data comes from 12 stores.
The two categorical variables are the package color and the package shape.
The dependent variable is the total sales of the product in the store.
var colors = Vector.CreateCategorical(new[] {
"Blue", "Blue", "Blue", "Blue",
"Red", "Red", "Red", "Red",
"Green", "Green", "Green", "Green" });
var shapes = Vector.CreateCategorical(new[] {
"Square", "Square", "Rectangle", "Rectangle",
"Square", "Square", "Rectangle", "Rectangle",
"Square", "Square", "Rectangle", "Rectangle" });
var sales = Vector.Create(new[] {
6.0, 14.0, 19.0, 17.0,
18.0, 11.0, 20.0, 23.0,
7.0, 11.0, 18.0, 10.0});
var anova1 = new TwoWayAnovaModel(sales, colors, shapes);
Dim colors = Vector.CreateCategorical({
"Blue", "Blue", "Blue", "Blue",
"Red", "Red", "Red", "Red",
"Green", "Green", "Green", "Green"})
Dim shapes = Vector.CreateCategorical({
"Square", "Square", "Rectangle", "Rectangle",
"Square", "Square", "Rectangle", "Rectangle",
"Square", "Square", "Rectangle", "Rectangle"})
Dim sales = Vector.Create(
6.0, 14.0, 19.0, 17.0,
18.0, 11.0, 20.0, 23.0,
7.0, 11.0, 18.0, 10.0)
Dim anova1 = New TwoWayAnovaModel(sales, colors, shapes)
let colors = Vector.CreateCategorical(
[| "Blue"; "Blue"; "Blue"; "Blue";
"Red"; "Red"; "Red"; "Red";
"Green"; "Green"; "Green"; "Green" |])
let shapes = Vector.CreateCategorical(
[| "Square"; "Square"; "Rectangle"; "Rectangle";
"Square"; "Square"; "Rectangle"; "Rectangle";
"Square"; "Square"; "Rectangle"; "Rectangle" |])
let sales = Vector.Create(
6., 14., 19., 17.,
18., 11., 20., 23.,
7., 11., 18., 10.0)
let anova1 = TwoWayAnovaModel(sales, colors, shapes)
The second constructor takes four arguments. The first argument is a
DataFrame<R, C>
that contains the variables you wish to use in the analysis.
The second argument is the name of the dependent variable in the
data frame. The third and fourth arguments are the names of the independent
variables in the data frame. Using the variables we created in the
previous example, we get:
var dataFrame = DataFrame.FromColumns(
new IVector[] { colors, shapes, sales },
Index.Create(new[] { "color", "shape", "sales" }));
var anova2 = new TwoWayAnovaModel(dataFrame, "sales", "color", "shape");
Dim frame = DataFrame.FromColumns(
New IVector() {colors, shapes, sales},
Index.Create({"color", "shape", "sales"}))
Dim anova2 = New TwoWayAnovaModel(frame, "sales", "color", "shape")
let dataFrame =
DataFrame.FromColumns(
"color" => colors, "shape" => shapes, "sales" => sales)
let anova2 = TwoWayAnovaModel(dataFrame, "sales", "color", "shape")
The Fit
method performs the actual calculation.
The results of the analysis can be obtained through the model's
AnovaTable
property. The ANOVA table for a two-way design has five rows.
There are three rows that describe the contribution of the model to the variation.
There is one row for each of the factors and one row for the interaction
between the two factors. These can be retrieved through the
GetModelRow
method. Index 0 gives the row for the first factor, index 1 gives the row for the
second factor, and index 2 gives the row for the interaction.
As always, there is also one row for the residuals and one row for the totals.
The CompleteModelRow
property is not part of the ANOVA table. It shows the contribution
of the complete model to the variation.
The AnovaModelRow
objects obtained in this way show the results of the test for significance
of the variation due to the factor compared to the variation not explained by
the model. The FStatistic
property gives the value of the F statistic for this ratio, while the
PValue
gives the significance of the F statistic.
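For reference, the ratio behind each model row can be written out as below. This is the textbook definition for a fixed-effects two-way design with replication, not something taken from the library's source, and the symbols (a and b for the number of levels of the two factors, N for the number of observations) are introduced here only for illustration.

```latex
F_{\text{effect}}
  = \frac{MS_{\text{effect}}}{MS_{\text{error}}}
  = \frac{SS_{\text{effect}} / df_{\text{effect}}}{SS_{\text{error}} / df_{\text{error}}},
\qquad
df_A = a - 1,\quad
df_B = b - 1,\quad
df_{AB} = (a - 1)(b - 1),\quad
df_{\text{error}} = N - ab
```

For the packaging data (a = 3 colors, b = 2 shapes, N = 12 observations) this gives 2, 1, 2, and 6 degrees of freedom respectively.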
The Within Groups row shows the variation of the data
around the group means. It corresponds to the error or residual of the variation
in the data after the model has been taken into account. The row is available through
the ANOVA table's
ErrorRow
property.
The Total row contains the summary data for the entire data set.
It can be retrieved through the
TotalRow
property of the ANOVA table.
The example below illustrates these properties:
anova1.Fit();
var anovaTable = anova1.AnovaTable;
Console.WriteLine("F statistic: {0}", anovaTable.CompleteModelRow.FStatistic);
Console.WriteLine("P-value : {0}", anovaTable.CompleteModelRow.PValue);
Console.WriteLine("Sum of sq. total: {0}",
anovaTable.TotalRow.SumOfSquares);
Console.WriteLine("Sum of sq. error: {0}",
anovaTable.ErrorRow.SumOfSquares);
Console.WriteLine("Sum of sq. color: {0}",
anovaTable.GetModelRow(0).SumOfSquares);
Console.WriteLine("Sum of sq. shape: {0}",
anovaTable.GetModelRow(1).SumOfSquares);
Console.WriteLine("Sum of sq. interaction: {0}",
anovaTable.GetModelRow(2).SumOfSquares);
Console.WriteLine(anovaTable.ToString());
anova1.Fit()
Dim anovaTable = anova1.AnovaTable
Console.WriteLine("F statistic: {0}", anovaTable.CompleteModelRow.FStatistic)
Console.WriteLine("P-value : {0}", anovaTable.CompleteModelRow.PValue)
Console.WriteLine("Sum of sq. total: {0}",
anovaTable.TotalRow.SumOfSquares)
Console.WriteLine("Sum of sq. error: {0}",
anovaTable.ErrorRow.SumOfSquares)
Console.WriteLine("Sum of sq. color: {0}",
anovaTable.GetModelRow(0).SumOfSquares)
Console.WriteLine("Sum of sq. shape: {0}",
anovaTable.GetModelRow(1).SumOfSquares)
Console.WriteLine("Sum of sq. interaction: {0}",
anovaTable.GetModelRow(2).SumOfSquares)
Console.WriteLine(anovaTable.ToString())
anova1.Fit()
let anovaTable = anova1.AnovaTable
printfn "F statistic: %f" anovaTable.CompleteModelRow.FStatistic
printfn "P-value : %f" anovaTable.CompleteModelRow.PValue
printfn "Sum of sq. total: %f" anovaTable.TotalRow.SumOfSquares
printfn "Sum of sq. error: %f" anovaTable.ErrorRow.SumOfSquares
printfn "Sum of sq. color: %f" (anovaTable.GetModelRow(0).SumOfSquares)
printfn "Sum of sq. shape: %f" (anovaTable.GetModelRow(1).SumOfSquares)
printfn "Sum of sq. interaction: %f" (anovaTable.GetModelRow(2).SumOfSquares)
printfn "%O" anovaTable
For the example using the packaging, we find that the F statistic
for the color is 2.5049, corresponding to a p-value of 0.1619.
For the shape, we find an F statistic of 7.7670 with a p-value of 0.0317,
and for the interaction, we find an F statistic of 0.1359
with a p-value of 0.8755. The conclusion is that the color of the packaging
does not contribute significantly to the sales of the product,
but the impact of the shape is significant at the usual 0.05 level.
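The F statistics quoted above can be cross-checked by hand. The following is a minimal sketch in plain C# (top-level statements, no library calls) that computes the sums of squares directly from the 12 observations. It relies on the design being balanced (two observations in every cell), so the interaction sum of squares can be obtained by subtraction; it is an independent check, not a description of how TwoWayAnovaModel computes its results. The helper name BetweenSS is made up for this illustration. P-values are omitted because they require the F distribution, which the base class library does not provide.

```csharp
using System;
using System.Globalization;
using System.Linq;

// The same 12 observations as in the packaging example.
string[] colors = { "Blue", "Blue", "Blue", "Blue", "Red", "Red", "Red", "Red",
                    "Green", "Green", "Green", "Green" };
string[] shapes = { "Square", "Square", "Rectangle", "Rectangle",
                    "Square", "Square", "Rectangle", "Rectangle",
                    "Square", "Square", "Rectangle", "Rectangle" };
double[] sales = { 6, 14, 19, 17, 18, 11, 20, 23, 7, 11, 18, 10 };

int n = sales.Length;
double grandMean = sales.Average();

// Hypothetical helper: sum over groups of (group size) * (group mean - grand mean)^2,
// where keys[i] names the group that observation i belongs to.
double BetweenSS(string[] keys) =>
    Enumerable.Range(0, n)
        .GroupBy(i => keys[i], i => sales[i])
        .Sum(g => g.Count() * Math.Pow(g.Average() - grandMean, 2));

// One key per color/shape combination (cell).
string[] cells = colors.Zip(shapes, (c, s) => c + "/" + s).ToArray();

double ssColor = BetweenSS(colors);
double ssShape = BetweenSS(shapes);
double ssCells = BetweenSS(cells);
// Subtraction is valid here only because every cell has the same number of observations.
double ssInteraction = ssCells - ssColor - ssShape;
double ssTotal = sales.Sum(y => Math.Pow(y - grandMean, 2));
double ssError = ssTotal - ssCells;

int a = colors.Distinct().Count();   // number of colors
int b = shapes.Distinct().Count();   // number of shapes
int dfColor = a - 1, dfShape = b - 1, dfInteraction = (a - 1) * (b - 1), dfError = n - a * b;
double msError = ssError / dfError;

double fColor = ssColor / dfColor / msError;
double fShape = ssShape / dfShape / msError;
double fInteraction = ssInteraction / dfInteraction / msError;

var inv = CultureInfo.InvariantCulture;
Console.WriteLine(string.Format(inv, "F color:       {0:F4}", fColor));
Console.WriteLine(string.Format(inv, "F shape:       {0:F4}", fShape));
Console.WriteLine(string.Format(inv, "F interaction: {0:F4}", fInteraction));
```

For an unbalanced design the subtraction shortcut no longer holds, and the choice between Type I, Type II, and Type III sums of squares starts to matter.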
Type I, Type II, and Type III Sums of Squares
By default, the ANOVA table contains values computed with Type I sums of squares
(also called sequential sums of squares).
ANOVA calculations using other types of sums of squares are also supported.
The type can be selected by setting the
SumsOfSquaresType
property before the model is calculated. This is a
SumsOfSquaresType
enumeration value.
Even if the type was not set beforehand, all three types are available after the calculation
through three properties:
TypeISumsOfSquares,
TypeIISumsOfSquares,
and TypeIIISumsOfSquares.
These properties return an AnovaTable
object with three rows, one each for the first factor, the second factor, and their interaction.
var typeIII = anova1.TypeIIISumsOfSquares;
Console.WriteLine("Type III sums of squares:");
Console.WriteLine(typeIII);
Dim typeIII = anova1.TypeIIISumsOfSquares
Console.WriteLine("Type III sums of squares:")
Console.WriteLine(typeIII)
let typeIII = anova1.TypeIIISumsOfSquares
printfn "Type III sums of squares:"
printfn "%O" typeIII
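As a reminder of what distinguishes the three types, the conventional definitions for a two-factor model with factors A and B and interaction AB are summarized below. The notation SS(X | Y) means the reduction in residual sum of squares from adding X to a model that already contains Y. These are the standard textbook conventions; whether the library follows them in every detail is an assumption.

```latex
\begin{align*}
\text{Type I (sequential):} &\quad SS(A),\; SS(B \mid A),\; SS(AB \mid A, B) \\
\text{Type II:}             &\quad SS(A \mid B),\; SS(B \mid A),\; SS(AB \mid A, B) \\
\text{Type III:}            &\quad SS(A \mid B, AB),\; SS(B \mid A, AB),\; SS(AB \mid A, B)
\end{align*}
```

In a balanced design such as the packaging example, where every cell contains the same number of observations, the three types give identical sums of squares. Differences appear only when the cell counts are unequal.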
The group means can be accessed through the model's
Cells
property, which is a matrix of Cell
objects. In the example below, we first obtain the
CategoryIndex
for the color variable. We then iterate through the levels
of the index, and print the group means for the square boxes:
var colorFactor = colors.CategoryIndex;
var squareColumn = anova1.Cells.GetColumn("Square");
foreach (var level in colorFactor)
Console.WriteLine("Mean for square boxes group '{0}': {1:F4}",
level, squareColumn.Get(level).Mean);
Dim colorFactor = colors.CategoryIndex
Dim squareColumn = anova1.Cells.GetColumn("Square")
For Each level In colorFactor
Console.WriteLine("Mean for square boxes group '{0}': {1:F4}",
level, squareColumn.Get(level).Mean)
Next
let colorFactor = colors.CategoryIndex
let squareColumn = anova1.Cells.GetColumn("Square")
for level in colorFactor do
printfn "Mean for square boxes group '%s': %.4f"
level (squareColumn.Get(level).Mean)
The RowTotals
property returns a vector of cells with totals for each row (colors in our example).
The ColumnTotals
property returns a vector of cells with totals for entire columns (shapes in our example).
Below, we print the total variance for all rectangular packages:
Console.WriteLine("Variance of rectangular packages: {0:F4}",
anova1.ColumnTotals.Get("Rectangle").Variance);
Console.WriteLine("Variance of rectangular packages: {0:F4}",
anova1.ColumnTotals.Get("Rectangle").Variance)
printfn "Variance of rectangular packages: %.4f"
(anova1.ColumnTotals.Get("Rectangle").Variance)
The TotalCell
property returns the cell with totals for the complete data.
The grand mean can be obtained from this cell:
Console.WriteLine("Grand mean: {0:F4}", anova1.TotalCell.Mean);
Console.WriteLine("Grand mean: {0:F4}", anova1.TotalCell.Mean)
printfn "Grand mean: %.4f" anova1.TotalCell.Mean
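The cell, column, and grand statistics discussed above are easy to verify without the library. The sketch below recomputes them with LINQ from the raw data. It assumes that the Variance reported by a cell is the sample variance (n - 1 in the denominator); that is an assumption about the library, not something stated in this section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The same 12 observations as in the packaging example.
string[] colors = { "Blue", "Blue", "Blue", "Blue", "Red", "Red", "Red", "Red",
                    "Green", "Green", "Green", "Green" };
string[] shapes = { "Square", "Square", "Rectangle", "Rectangle",
                    "Square", "Square", "Rectangle", "Rectangle",
                    "Square", "Square", "Rectangle", "Rectangle" };
double[] sales = { 6, 14, 19, 17, 18, 11, 20, 23, 7, 11, 18, 10 };

// Mean sales per color, restricted to the "Square" column.
Dictionary<string, double> squareMeans = Enumerable.Range(0, sales.Length)
    .Where(i => shapes[i] == "Square")
    .GroupBy(i => colors[i], i => sales[i])
    .ToDictionary(g => g.Key, g => g.Average());

foreach (var pair in squareMeans)
    Console.WriteLine($"Mean for square boxes group '{pair.Key}': {pair.Value:F4}");

// All observations in the "Rectangle" column, across colors.
double[] rectangles = sales.Where((y, i) => shapes[i] == "Rectangle").ToArray();
double rectangleMean = rectangles.Average();

// Sample variance (n - 1 denominator): assumed to match the library's Variance.
double rectangleVariance =
    rectangles.Sum(y => (y - rectangleMean) * (y - rectangleMean)) / (rectangles.Length - 1);

double grandMean = sales.Average();

Console.WriteLine($"Variance of rectangular packages: {rectangleVariance:F4}");
Console.WriteLine($"Grand mean: {grandMean:F4}");
```

If the library's output for the column variance differs from this value, the most likely explanation is a different denominator (n instead of n - 1).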