Extreme Optimization™: Complexity made simple.

Math and Statistics
Libraries for .NET

Two-Way ANOVA

Extreme Optimization Numerical Libraries for .NET Professional

An ANOVA design with two factors is called a two-way analysis of variance. The two-way analysis of variance is implemented by the TwoWayAnovaModel class.

Constructing Two-Way ANOVA Models

The TwoWayAnovaModel class has two constructors. The first constructor takes three arguments: a Vector<T> that specifies the dependent variable, and two CategoricalVector<T> objects that specify the independent variables. All three variables must have the same number of observations.

As an example, we construct an ANOVA model to measure the effect of package color and shape on sales. Our data comes from 12 stores. The two categorical variables are the package color and the package shape. The dependent variable is the total sales of the product in the store.

C#
var colors = Vector.CreateCategorical(new[] {
    "Blue", "Blue", "Blue", "Blue",
    "Red", "Red", "Red", "Red",
    "Green", "Green", "Green", "Green" });
var shapes = Vector.CreateCategorical(new[] {
    "Square", "Square", "Rectangle", "Rectangle",
    "Square", "Square", "Rectangle", "Rectangle",
    "Square", "Square", "Rectangle", "Rectangle" });
var sales = Vector.Create(new[] {
    6.0, 14.0, 19.0, 17.0,
    18.0, 11.0, 20.0, 23.0,
    7.0, 11.0, 18.0, 10.0});
var anova1 = new TwoWayAnovaModel(sales, colors, shapes);

VB
Dim colors = Vector.CreateCategorical({
    "Blue", "Blue", "Blue", "Blue",
    "Red", "Red", "Red", "Red",
    "Green", "Green", "Green", "Green"})
Dim shapes = Vector.CreateCategorical({
        "Square", "Square", "Rectangle", "Rectangle",
        "Square", "Square", "Rectangle", "Rectangle",
        "Square", "Square", "Rectangle", "Rectangle"})
Dim sales = Vector.Create(
        6.0, 14.0, 19.0, 17.0,
        18.0, 11.0, 20.0, 23.0,
        7.0, 11.0, 18.0, 10.0)
Dim anova1 = New TwoWayAnovaModel(sales, colors, shapes)

F#
let colors = Vector.CreateCategorical(
              [| "Blue"; "Blue"; "Blue"; "Blue";
                 "Red"; "Red"; "Red"; "Red";
                 "Green"; "Green"; "Green"; "Green" |])
let shapes = Vector.CreateCategorical(
              [| "Square"; "Square"; "Rectangle"; "Rectangle";
                 "Square"; "Square"; "Rectangle"; "Rectangle";
                 "Square"; "Square"; "Rectangle"; "Rectangle" |])
let sales = Vector.Create(
                6., 14., 19., 17., 
                18., 11., 20., 23., 
                7., 11., 18., 10.0)
let anova1 = TwoWayAnovaModel(sales, colors, shapes)

The second constructor takes four arguments. The first argument is a DataFrame<R, C> that contains the variables you wish to use in the analysis. The second argument is the name of the dependent variable in the data frame. The third and fourth arguments are the names of the independent variables in the data frame. Using the variables we created in the previous example, we get:

C#
var dataFrame = DataFrame.FromColumns(
    new IVector[] { colors, shapes, sales },
    Index.Create(new[] { "color", "shape", "sales" }));
var anova2 = new TwoWayAnovaModel(dataFrame, "sales", "color", "shape");

VB
Dim frame = DataFrame.FromColumns(
    New IVector() {colors, shapes, sales},
    Index.Create({"color", "shape", "sales"}))
Dim anova2 = New TwoWayAnovaModel(frame, "sales", "color", "shape")

F#
let dataFrame = 
    DataFrame.FromColumns(
        "color" => colors, "shape" => shapes, "sales" => sales)
let anova2 = TwoWayAnovaModel(dataFrame, "sales", "color", "shape")

Performing the analysis

The Fit method performs the actual calculation.

The ANOVA table

The results of the analysis can be obtained through the model's AnovaTable property. The ANOVA table for a two-way design has five rows. Three rows describe the contribution of the model to the variation: one row for each of the factors and one row for the interaction between the two factors. These can be retrieved through the GetModelRow method. Index 0 gives the row for the first factor, index 1 gives the row for the second factor, and index 2 gives the row for the interaction. As always, there is also one row for the residuals and one row for the total.

The CompleteModelRow property is not part of the ANOVA table. It shows the contribution of the complete model to the variation.

The AnovaModelRow objects obtained in this way show the results of the test for significance of the variation due to the factor compared to the variation not explained by the model. The FStatistic property gives the value of the F statistic for this ratio, while the PValue gives the significance of the F statistic.

The Within Groups row shows the variation of the data around the group means. It corresponds to the error or residual of the variation in the data after the model has been taken into account. The row is available through the ANOVA table's ErrorRow property.

The Total row contains the summary data for the entire data set. It can be retrieved through the TotalRow property of the ANOVA table.

The example below illustrates these properties:

C#
anova1.Fit();
var anovaTable = anova1.AnovaTable;
Console.WriteLine("F statistic: {0}", anovaTable.CompleteModelRow.FStatistic);
Console.WriteLine("P-value     : {0}", anovaTable.CompleteModelRow.PValue);
Console.WriteLine("Sum of sq. total: {0}",
    anovaTable.TotalRow.SumOfSquares);
Console.WriteLine("Sum of sq. error: {0}",
    anovaTable.ErrorRow.SumOfSquares);
Console.WriteLine("Sum of sq. color: {0}",
    anovaTable.GetModelRow(0).SumOfSquares);
Console.WriteLine("Sum of sq. shape: {0}",
    anovaTable.GetModelRow(1).SumOfSquares);
Console.WriteLine("Sum of sq. interaction: {0}",
    anovaTable.GetModelRow(2).SumOfSquares);
Console.WriteLine(anovaTable.ToString());

VB
anova1.Fit()
Dim anovaTable = anova1.AnovaTable
Console.WriteLine("F statistic: {0}", anovaTable.CompleteModelRow.FStatistic)
Console.WriteLine("P-value     : {0}", anovaTable.CompleteModelRow.PValue)
Console.WriteLine("Sum of sq. total: {0}",
    anovaTable.TotalRow.SumOfSquares)
Console.WriteLine("Sum of sq. error: {0}",
    anovaTable.ErrorRow.SumOfSquares)
Console.WriteLine("Sum of sq. color: {0}",
    anovaTable.GetModelRow(0).SumOfSquares)
Console.WriteLine("Sum of sq. shape: {0}",
    anovaTable.GetModelRow(1).SumOfSquares)
Console.WriteLine("Sum of sq. interaction: {0}",
    anovaTable.GetModelRow(2).SumOfSquares)
Console.WriteLine(anovaTable.ToString())

F#
anova1.Fit()
let anovaTable = anova1.AnovaTable
printfn "F statistic: %f" anovaTable.CompleteModelRow.FStatistic
printfn "P-value     : %f" anovaTable.CompleteModelRow.PValue
printfn "Sum of sq. total: %f" anovaTable.TotalRow.SumOfSquares
printfn "Sum of sq. error: %f" anovaTable.ErrorRow.SumOfSquares
printfn "Sum of sq. color: %f" (anovaTable.GetModelRow(0).SumOfSquares)
printfn "Sum of sq. shape: %f" (anovaTable.GetModelRow(1).SumOfSquares)
printfn "Sum of sq. interaction: %f" (anovaTable.GetModelRow(2).SumOfSquares)
printfn "%O" anovaTable

For the example using the packaging, we find that the F statistic for the color is 2.5049, corresponding to a p-value of 0.1619. For the shape, we find an F statistic of 7.7670 with a p-value of 0.0317, and for the interaction, we find an F statistic of 0.1359 with a p-value of 0.8755. The conclusion is that the color of the packaging does not contribute significantly to the sales of the product, but the impact of the shape is significant at the usual 0.05 level.
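
The F statistics quoted above can be checked without the library. The following is a minimal sketch in plain C#, assuming a balanced design (the same number of observations in every cell, as in this example). The class TwoWayAnovaByHand and its FStatistics method are hypothetical helpers written for illustration; they are not part of the Extreme Optimization API. The sketch builds each sum of squares from group means and divides each mean square by the error mean square. It does not compute p-values, which require the F distribution.

```csharp
using System;
using System.Linq;

public static class TwoWayAnovaByHand
{
    // Hypothetical helper, not a library API. Returns the F statistics for
    // factor A, factor B, and their interaction. Valid for balanced designs only.
    public static (double FA, double FB, double FAB) FStatistics(
        string[] a, string[] b, double[] y)
    {
        int n = y.Length;
        double grandMean = y.Average();
        int[] index = Enumerable.Range(0, n).ToArray();

        // Between-groups sum of squares for the grouping defined by 'key':
        // group size times the squared deviation of the group mean.
        double Between(Func<int, string> key) =>
            index.GroupBy(key).Sum(g =>
            {
                double d = g.Average(i => y[i]) - grandMean;
                return g.Count() * d * d;
            });

        double ssA = Between(i => a[i]);
        double ssB = Between(i => b[i]);
        // Cell key; assumes level names do not contain the '|' character.
        double ssCells = Between(i => a[i] + "|" + b[i]);
        double ssAB = ssCells - ssA - ssB;   // interaction
        double ssTotal = y.Sum(v => (v - grandMean) * (v - grandMean));
        double ssError = ssTotal - ssCells;  // within cells

        int levelsA = a.Distinct().Count();
        int levelsB = b.Distinct().Count();
        int dfA = levelsA - 1;
        int dfB = levelsB - 1;
        int dfError = n - levelsA * levelsB;
        double msError = ssError / dfError;

        return (ssA / dfA / msError,
                ssB / dfB / msError,
                ssAB / (dfA * dfB) / msError);
    }

    public static void Main()
    {
        string[] colors = {
            "Blue", "Blue", "Blue", "Blue",
            "Red", "Red", "Red", "Red",
            "Green", "Green", "Green", "Green" };
        string[] shapes = {
            "Square", "Square", "Rectangle", "Rectangle",
            "Square", "Square", "Rectangle", "Rectangle",
            "Square", "Square", "Rectangle", "Rectangle" };
        double[] sales = {
            6.0, 14.0, 19.0, 17.0,
            18.0, 11.0, 20.0, 23.0,
            7.0, 11.0, 18.0, 10.0 };

        var (fColor, fShape, fInteraction) = FStatistics(colors, shapes, sales);
        Console.WriteLine(FormattableString.Invariant($"F (color):       {fColor:F4}"));       // 2.5049
        Console.WriteLine(FormattableString.Invariant($"F (shape):       {fShape:F4}"));       // 7.7670
        Console.WriteLine(FormattableString.Invariant($"F (interaction): {fInteraction:F4}")); // 0.1359
    }
}
```

Because every cell holds two observations, the factor and interaction sums of squares add up exactly to the between-cells sum of squares, which is why the interaction can be obtained by subtraction. For unbalanced data this shortcut no longer holds.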

Type I, Type II, and Type III Sums of Squares

By default, the ANOVA table contains values computed with Type I sums of squares (also called sequential sums of squares). ANOVA calculations using other types of sums of squares are also available. The type can be selected by setting the SumsOfSquaresType property before the model is calculated. This is a SumsOfSquaresType enumeration value.

Even if the type was not set beforehand, all three types are available after the calculation through three properties: TypeISumsOfSquares, TypeIISumsOfSquares, and TypeIIISumsOfSquares. Each of these properties returns an AnovaTable object with three rows, one each for the first factor, the second factor, and their interaction.

C#
var typeIII = anova1.TypeIIISumsOfSquares;
Console.WriteLine("Type III sums of squares:");
Console.WriteLine(typeIII);

VB
Dim typeIII = anova1.TypeIIISumsOfSquares
Console.WriteLine("Type III sums of squares:")
Console.WriteLine(typeIII)

F#
let typeIII = anova1.TypeIIISumsOfSquares
printfn "Type III sums of squares:"
printfn "%O" typeIII

Other properties

The group means can be accessed through the model's Cells property, which is a matrix of Cell objects. In the example below, we first obtain the CategoryIndex for the color variable. We then iterate through the levels of the index, and print the group means for the square boxes:

C#
var colorFactor = colors.CategoryIndex;
var squareColumn = anova1.Cells.GetColumn("Square");
foreach (var level in colorFactor)
    Console.WriteLine("Mean for square boxes group '{0}': {1:F4}",
        level, squareColumn.Get(level).Mean);

VB
Dim colorFactor = colors.CategoryIndex
Dim squareColumn = anova1.Cells.GetColumn("Square")
For Each level In colorFactor
    Console.WriteLine("Mean for square boxes group '{0}': {1:F4}",
        level, squareColumn.Get(level).Mean)
Next

F#
let colorFactor = colors.CategoryIndex
let squareColumn = anova1.Cells.GetColumn("Square")
for level in colorFactor do
    printfn "Mean for square boxes group '%s': %.4f"
        level (squareColumn.Get(level).Mean)

The RowTotals property returns a vector of cells with totals for each row (colors in our example). The ColumnTotals property returns a vector of cells with totals for entire columns (shapes in our example). Below, we print the total variance for all rectangular packages:

C#
Console.WriteLine("Variance of rectangular packages: {0:F4}",
    anova1.ColumnTotals.Get("Rectangle").Variance);

VB
Console.WriteLine("Variance of rectangular packages: {0:F4}",
    anova1.ColumnTotals.Get("Rectangle").Variance)

F#
printfn "Variance of rectangular packages: %.4f"
    (anova1.ColumnTotals.Get("Rectangle").Variance)

The TotalCell property returns the cell with totals for the complete data. The grand mean can be obtained from this cell:

C#
Console.WriteLine("Grand mean: {0:F4}", anova1.TotalCell.Mean);

VB
Console.WriteLine("Grand mean: {0:F4}", anova1.TotalCell.Mean)

F#
printfn "Grand mean: %.4f" anova1.TotalCell.Mean
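
As a cross-check, the cell means and the grand mean can be reproduced directly from the raw data. The sketch below uses only LINQ and does not call the library; CellMeansByHand and CellMean are hypothetical names introduced for illustration. It assumes that a cell's Mean is the arithmetic mean of the observations that fall in that cell.

```csharp
using System;
using System.Linq;

public static class CellMeansByHand
{
    // Hypothetical helper, not a library API: mean of the sales values
    // observed for one (color, shape) combination.
    public static double CellMean(string[] colors, string[] shapes,
        double[] sales, string color, string shape) =>
        Enumerable.Range(0, sales.Length)
            .Where(i => colors[i] == color && shapes[i] == shape)
            .Average(i => sales[i]);

    public static void Main()
    {
        string[] colors = {
            "Blue", "Blue", "Blue", "Blue",
            "Red", "Red", "Red", "Red",
            "Green", "Green", "Green", "Green" };
        string[] shapes = {
            "Square", "Square", "Rectangle", "Rectangle",
            "Square", "Square", "Rectangle", "Rectangle",
            "Square", "Square", "Rectangle", "Rectangle" };
        double[] sales = {
            6.0, 14.0, 19.0, 17.0,
            18.0, 11.0, 20.0, 23.0,
            7.0, 11.0, 18.0, 10.0 };

        // Prints 10.0000 for Blue, 14.5000 for Red, and 9.0000 for Green.
        foreach (var color in colors.Distinct())
        {
            double mean = CellMean(colors, shapes, sales, color, "Square");
            Console.WriteLine(FormattableString.Invariant(
                $"Mean for square boxes group '{color}': {mean:F4}"));
        }

        // The grand mean is the mean of all 12 observations: 14.5000.
        Console.WriteLine(FormattableString.Invariant(
            $"Grand mean: {sales.Average():F4}"));
    }
}
```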

Copyright (c) 2004-2021 ExoAnalytics Inc.

Extreme Optimization, Complexity made simple, M#, and M Sharp are trademarks of ExoAnalytics Inc.
Microsoft, Visual C#, Visual Basic, Visual Studio, Visual Studio.NET, and the Optimized for Visual Studio logo
are registered trademarks of Microsoft Corporation.