Extreme Optimization User's Guide

Two-Way ANOVA

An ANOVA design with two factors is called a two-way analysis of variance. The two-way analysis of variance is implemented by the TwoWayAnovaModel class.

Constructing Two-Way ANOVA Models

The TwoWayAnovaModel class has three constructors.

The first constructor takes three parameters: a NumericalVariable that specifies the dependent variable, and two CategoricalVariable objects that specify the independent variables. All three variables must have the same number of observations.

As an example, we construct an ANOVA model to measure the effect of package color and shape on sales. Our data comes from 12 stores. The two categorical variables are the package color and the package shape. The dependent variable is the total sales of the product in the store.

C#
string[] shapeData = {"square", "circle", "rectangle"...};
CategoricalVariable shape = new CategoricalVariable("shape", shapeData);

string[] colorData = {"red", "green", "blue"...};
CategoricalVariable color = new CategoricalVariable("color", colorData);

double[] salesData = {1256, 3611...};
NumericalVariable sales = new NumericalVariable("sales", salesData);
TwoWayAnovaModel anova1 = new TwoWayAnovaModel(sales, color, shape);
Visual Basic
Dim shapeData As String() = {"square", "circle", "rectangle"...}
Dim shape As CategoricalVariable = _
    New CategoricalVariable("shape", shapeData)
Dim colorData As String() = {"red", "green", "blue"...}
Dim color As CategoricalVariable = _
    New CategoricalVariable("color", colorData)
Dim salesData As Double() = {1256, 3611...}
Dim sales As NumericalVariable = _
    New NumericalVariable("sales", salesData)
Dim anova1 As TwoWayAnovaModel = New TwoWayAnovaModel(sales, color, shape)

The second constructor takes four parameters. The first parameter is a VariableCollection that contains the variables you wish to use in the analysis. The second parameter is the name of the dependent variable in the collection. The third and fourth parameters are the names of the two independent variables in the collection. Using the variables we created in the previous example, we get:

C#
VariableCollection variables = new VariableCollection();
variables.Add(sales);
variables.Add(color);
variables.Add(shape);
TwoWayAnovaModel anova2 = new TwoWayAnovaModel(variables, "sales", "color", "shape");
Visual Basic
Dim variables As VariableCollection = New VariableCollection()
variables.Add(sales)
variables.Add(color)
variables.Add(shape)
Dim anova2 As TwoWayAnovaModel = _
    New TwoWayAnovaModel(variables, "sales", "color", "shape")

The third constructor also takes four parameters. The first parameter is a DataTable that contains the data for the analysis. The second parameter is the name of the column that contains the dependent variable data. The third and fourth parameters are the names of the columns that contain the independent variable data.

C#
DataTable table = new DataTable();
// Fill the DataTable from some data source...
TwoWayAnovaModel anova3 = new TwoWayAnovaModel(table, "sales", "color", "shape");
Visual Basic
Dim table As DataTable = New DataTable()
' Fill the DataTable from some data source...
Dim anova3 As TwoWayAnovaModel = New TwoWayAnovaModel(table, "sales", "color", "shape")

All three examples produce the same ANOVA model.
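The DataTable passed to the third constructor needs one column for the dependent variable and one for each factor. The sketch below shows one way such a table could be filled in memory using only the standard System.Data classes. The class name SalesTableSketch is hypothetical, and the two rows simply pair the first values of the arrays shown earlier; the remaining stores are left out because the original data is elided.

```csharp
using System;
using System.Data;

public static class SalesTableSketch
{
    // Builds a table whose column names match the names passed to the
    // constructor in the example: "sales", "color" and "shape".
    public static DataTable Build()
    {
        DataTable table = new DataTable("packaging");
        table.Columns.Add("sales", typeof(double));
        table.Columns.Add("color", typeof(string));
        table.Columns.Add("shape", typeof(string));

        // One row per store. Only the first two stores are filled in here;
        // a real analysis needs every observation.
        table.Rows.Add(1256.0, "red", "square");
        table.Rows.Add(3611.0, "green", "circle");
        return table;
    }

    public static void Main()
    {
        DataTable table = Build();
        Console.WriteLine("{0} rows, {1} columns",
            table.Rows.Count, table.Columns.Count);
    }
}
```

Typed columns keep the sales figures numeric, so they do not have to be parsed from strings later.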

Performing the analysis

The Compute method performs the actual analysis.

The results of the analysis can be obtained through the model's AnovaTable property. The ANOVA table for a two-way design has five rows. Three rows describe the contribution of the model to the variation: one row for each of the factors and one row for the interaction between the two factors. These can be retrieved through the GetModelRow method: index 0 gives the row for the first factor, index 1 gives the row for the second factor, and index 2 gives the row for the interaction. As always, there is also one row for the residuals and one for the total variation.

The row returned by the ModelRow property is not part of the ANOVA table itself. It shows the contribution of the complete model (both factors and their interaction) to the variation.
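In common textbook notation (the symbols are not part of the library), the rows relate as follows for a balanced design in which the first factor has a levels, the second has b levels, and there are N observations in total. This is a sketch of the standard decomposition, not a description of how the library computes it:

```latex
\begin{aligned}
SS_{\text{total}} &= SS_A + SS_B + SS_{AB} + SS_{\text{error}}, \\
SS_{\text{model}} &= SS_A + SS_B + SS_{AB}, \\
df_A &= a - 1, \qquad df_B = b - 1, \qquad df_{AB} = (a - 1)(b - 1), \\
df_{\text{error}} &= N - ab, \qquad df_{\text{total}} = N - 1.
\end{aligned}
```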

The AnovaModelRow objects returned by GetModelRow show the results of the test for significance of the variation due to a factor or the interaction, compared to the variation not explained by the model. The FStatistic property gives the value of the F statistic for this ratio, while the PValue property gives the corresponding p-value.
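Assuming the usual fixed-effects setup, the ratio in question is a mean square for the effect divided by the mean square for the error. The notation below is the textbook one, not the library's:

```latex
F = \frac{MS_{\text{effect}}}{MS_{\text{error}}}
  = \frac{SS_{\text{effect}} / df_{\text{effect}}}{SS_{\text{error}} / df_{\text{error}}},
\qquad
p = P\!\left(F_{df_{\text{effect}},\, df_{\text{error}}} \ge F\right).
```

A small p-value means that a ratio this large would be unlikely if the effect contributed nothing beyond the residual variation.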

The Within Groups row shows the variation of the data around the group means. It corresponds to the error or residual of the variation in the data after the model has been taken into account. The row is available through the ANOVA table's ErrorRow property.

The Total row contains the summary data for the entire data set. It can be retrieved through the TotalRow property of the ANOVA table.
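To make the contents of these rows concrete, the following sketch computes the five sums of squares by hand for a small balanced design. It does not use the library at all: the class TwoWaySumsOfSquares and the data are made up for illustration, and the formulas are the standard ones for equal cell sizes.

```csharp
using System;

public static class TwoWaySumsOfSquares
{
    public sealed class Result
    {
        public double FactorA, FactorB, Interaction, Error, Total;
    }

    // y[i, j, k] is the k-th observation at level i of the first factor
    // and level j of the second factor. All cells have the same size.
    public static Result Compute(double[,,] y)
    {
        int a = y.GetLength(0), b = y.GetLength(1), n = y.GetLength(2);
        double[] meanA = new double[a];
        double[] meanB = new double[b];
        double[,] meanCell = new double[a, b];
        double grand = 0.0;

        // Accumulate the cell means, the marginal means and the grand mean.
        for (int i = 0; i < a; i++)
            for (int j = 0; j < b; j++)
                for (int k = 0; k < n; k++)
                {
                    meanCell[i, j] += y[i, j, k] / n;
                    meanA[i] += y[i, j, k] / (b * n);
                    meanB[j] += y[i, j, k] / (a * n);
                    grand += y[i, j, k] / (a * b * n);
                }

        Result r = new Result();
        for (int i = 0; i < a; i++)
            r.FactorA += b * n * Math.Pow(meanA[i] - grand, 2);
        for (int j = 0; j < b; j++)
            r.FactorB += a * n * Math.Pow(meanB[j] - grand, 2);
        for (int i = 0; i < a; i++)
            for (int j = 0; j < b; j++)
            {
                double effect = meanCell[i, j] - meanA[i] - meanB[j] + grand;
                r.Interaction += n * effect * effect;
                for (int k = 0; k < n; k++)
                {
                    r.Error += Math.Pow(y[i, j, k] - meanCell[i, j], 2);
                    r.Total += Math.Pow(y[i, j, k] - grand, 2);
                }
            }
        return r;
    }

    public static void Main()
    {
        // Hypothetical data: 2 colors x 2 shapes x 2 stores per cell.
        double[,,] y =
        {
            { { 10, 12 }, { 20, 22 } },
            { { 14, 16 }, { 30, 28 } }
        };
        Result r = Compute(y);
        Console.WriteLine("Factor A: {0}, Factor B: {1}, Interaction: {2}",
            r.FactorA, r.FactorB, r.Interaction);
        Console.WriteLine("Error: {0}, Total: {1}", r.Error, r.Total);
    }
}
```

Worked by hand, the sums of squares for this data are 72 for the first factor, 288 for the second, 8 for the interaction and 8 for the error, which add up to the total of 376. Printed values may differ in the last digits because of floating-point rounding.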

The example below illustrates these properties:

C#
anova.Compute();
            
Console.WriteLine("F statistic: {0}", anova.AnovaTable.ModelRow.FStatistic);
Console.WriteLine("P-value     : {0}", anova.AnovaTable.ModelRow.PValue);
Console.WriteLine("Sum of sq. total: {0}",
    anova.AnovaTable.TotalRow.SumOfSquares);
Console.WriteLine("Sum of sq. error: {0}",
    anova.AnovaTable.ErrorRow.SumOfSquares);
Console.WriteLine("Sum of sq. color: {0}",
    anova.AnovaTable.GetModelRow(0).SumOfSquares);
Console.WriteLine("Sum of sq. shape: {0}",
    anova.AnovaTable.GetModelRow(1).SumOfSquares);
Console.WriteLine("Sum of sq. interaction: {0}",
    anova.AnovaTable.GetModelRow(2).SumOfSquares);
Console.WriteLine(anova.AnovaTable.ToString());
Visual Basic
anova.Compute()

Console.WriteLine("F statistic: {0}", anova.AnovaTable.ModelRow.FStatistic)
Console.WriteLine("P-value     : {0}", anova.AnovaTable.ModelRow.PValue)
Console.WriteLine("Sum of sq. total: {0}", _
    anova.AnovaTable.TotalRow.SumOfSquares)
Console.WriteLine("Sum of sq. error: {0}", _
    anova.AnovaTable.ErrorRow.SumOfSquares)
Console.WriteLine("Sum of sq. color: {0}", _
    anova.AnovaTable.GetModelRow(0).SumOfSquares)
Console.WriteLine("Sum of sq. shape: {0}", _
    anova.AnovaTable.GetModelRow(1).SumOfSquares)
Console.WriteLine("Sum of sq. interaction: {0}", _
    anova.AnovaTable.GetModelRow(2).SumOfSquares)
Console.WriteLine(anova.AnovaTable.ToString())

For the example using the packaging, we find that the F statistic for the color is 2.5049, corresponding to a p-value of 0.1619. For the shape, we find an F statistic of 7.7670 with a p-value of 0.0317, and for the interaction, we find an F statistic of 0.1359 with a p-value of 0.8755. The conclusion is that neither the color of the packaging nor the interaction between color and shape contributes significantly to the sales of the product, but the impact of the shape is significant at the usual 0.05 level.
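The decision rule behind this conclusion is a simple comparison of each p-value with the chosen significance level. The sketch below applies it to the three p-values quoted above. The class and method names are hypothetical; with the library, the p-values would come from the PValue property of each row.

```csharp
using System;

public static class SignificanceSketch
{
    // An effect is called significant when its p-value is below alpha.
    public static bool IsSignificant(double pValue, double alpha)
    {
        return pValue < alpha;
    }

    public static void Main()
    {
        // p-values as reported in the text for the packaging example.
        string[] effects = { "color", "shape", "interaction" };
        double[] pValues = { 0.1619, 0.0317, 0.8755 };
        const double alpha = 0.05;

        for (int i = 0; i < effects.Length; i++)
        {
            Console.WriteLine("{0}: p = {1}, significant at {2}: {3}",
                effects[i], pValues[i], alpha,
                IsSignificant(pValues[i], alpha));
        }
    }
}
```

Only the shape passes the test at the 0.05 level. At a stricter level such as 0.01, none of the three effects would.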

The group means can be accessed through the model's Cells property. In the example below, we first obtain the CategoricalScale object corresponding to the color factor. We then iterate through the levels of the scale, and print the group means for the square boxes:

C#
CategoricalScale colorFactor = anova.GetFactor(0);
foreach(object level in colorFactor.GetLevels())
    Console.WriteLine("Mean for square boxes group '{0}': {1:F4}",
        level, anova.Cells[level, "square"].Mean);
Visual Basic
Dim colorFactor As CategoricalScale = anova.GetFactor(0)
For Each level As Object In colorFactor.GetLevels()
    Console.WriteLine("Mean for square boxes group '{0}': {1:F4}", _
        level, anova.Cells(level, "square").Mean)
Next

Using the special index Cell.All, we can obtain the summary statistics for all square packages of all colors:

C#
Console.WriteLine("Variance of square packages: {0:F4}",
    anova.Cells[Cell.All, "square"].Variance);
Visual Basic
Console.WriteLine("Variance of square packages: {0:F4}", _
    anova.Cells(Cell.All, "square").Variance)

The grand mean is obtained by setting both indices to Cell.All:

C#
Console.WriteLine("Grand mean: {0:F4}",
    anova.Cells[Cell.All, Cell.All].Mean);
Visual Basic
Console.WriteLine("Grand mean: {0:F4}", _
    anova.Cells(Cell.All, Cell.All).Mean)
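The three kinds of lookup (a single cell, one factor collapsed, both factors collapsed) correspond to three different averages. The following sketch shows the distinction without using the library. The class CellMeansSketch and its data are hypothetical, and passing null plays the role that Cell.All plays in the examples above.

```csharp
using System;
using System.Linq;

public static class CellMeansSketch
{
    public sealed class Observation
    {
        public readonly string Color;
        public readonly string Shape;
        public readonly double Sales;

        public Observation(string color, string shape, double sales)
        {
            Color = color;
            Shape = shape;
            Sales = sales;
        }
    }

    // Hypothetical data: two stores for each color/shape combination.
    public static readonly Observation[] Sample =
    {
        new Observation("red", "square", 10), new Observation("red", "square", 12),
        new Observation("red", "circle", 20), new Observation("red", "circle", 22),
        new Observation("green", "square", 14), new Observation("green", "square", 16),
        new Observation("green", "circle", 30), new Observation("green", "circle", 28)
    };

    // Mean of the observations that match the given levels.
    // A null level means "all levels of that factor".
    public static double Mean(Observation[] data, string color, string shape)
    {
        return data
            .Where(o => (color == null || o.Color == color)
                     && (shape == null || o.Shape == shape))
            .Average(o => o.Sales);
    }

    public static void Main()
    {
        Console.WriteLine("Red square boxes: {0}", Mean(Sample, "red", "square"));
        Console.WriteLine("All square boxes: {0}", Mean(Sample, null, "square"));
        Console.WriteLine("Grand mean: {0}", Mean(Sample, null, null));
    }
}
```

For this data the three means are 11, 13 and 19: the single cell averages two stores, the collapsed row averages four, and the grand mean averages all eight.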
