Extreme Optimization User's Guide

ANOVA Models

The label "analysis of variance" (ANOVA) brings together a series of techniques to determine and measure the source of the variation in data. Specifically, ANOVA procedures partition the total variation in a data set into its component parts.
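For the simplest case, a single factor with k levels and n_i observations at level i, the partition can be written out explicitly. The notation below is a standard textbook convention rather than anything defined by the library: y_ij is observation j at level i, ȳ_i is the mean at level i, and ȳ is the mean of all observations.

```latex
\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}\right)^2}_{\text{total}}
=
\underbrace{\sum_{i=1}^{k} n_i \left(\bar{y}_i - \bar{y}\right)^2}_{\text{between levels}}
+
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_i\right)^2}_{\text{within levels}}
\]
```

The first term on the right measures variation explained by the factor; the second is the residual variation left over inside each level.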

ANOVA models come in many shapes and sizes, called designs. The Extreme Optimization Numerical Libraries for .NET support the three most common designs: one-way, one-way with repeated measures, and two-way analysis of variance. However, the infrastructure is in place to handle designs of any size and complexity.

Defining ANOVA Models

All classes that implement ANOVA models inherit from a common base class, AnovaModel, which in turn inherits from GeneralLinearModel, the base class of all statistical model classes.

In regression models, the dependent variable is a linear function of the independent variables. In an ANOVA design, the independent variables are categorical. The contribution of each individual combination of values of the independent variables must be estimated separately. Some dependencies exist, so the actual number of parameters is smaller than the number of combinations. Depending on the design, some combinations may be excluded from the model, further decreasing the number of parameters.
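As one way to make the parameter count concrete, consider a two-way design with a levels for the first factor and b levels for the second. The symbols and the sum-to-zero constraints below are a common textbook parameterization and are an assumption here; the text does not say which constraints the library applies internally.

```latex
\[
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,a, \quad j = 1,\dots,b
\]
\[
\sum_{i=1}^{a} \alpha_i = 0, \qquad
\sum_{j=1}^{b} \beta_j = 0, \qquad
\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0 \ \text{for each } j, \qquad
\sum_{j=1}^{b} (\alpha\beta)_{ij} = 0 \ \text{for each } i
\]
```

Under these constraints the effects contribute (a - 1) + (b - 1) + (a - 1)(b - 1) = ab - 1 free parameters besides the overall mean, one fewer than the ab combinations of levels. Leaving the interaction term out of the model reduces this further to a + b - 2.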

The set of all possible values of a categorical variable is called a factor. The possible values are called the levels of the factor. The purpose of an analysis of variance is to investigate the contribution of each level of each factor, and of combinations of levels, to the total variation of the data.

So even though the model is initially defined in terms of the dependent and independent variables, the actual calculations are performed using the factors rather than the independent variables they are associated with.

The GetFactor method of the AnovaModel class returns the Factor object at the specified index. An overload allows you to retrieve the factor associated with an independent variable through the variable's name.

Cells and Cell Arrays

The first step in performing an analysis of variance is to divide the data set into groups of rows with the same values for the factors. The data that is associated with a particular combination of factor levels is called a cell.

Cells are implemented by the Cell class. This class has a number of properties that return summary statistics for the data in the cell. The most important ones are Count, which returns the number of observations in the cell; Mean, which returns the cell mean; and Variance, which returns the variance of the data in that cell only.

Cell objects can't be created directly. Instead, they are accessed through the model's cell array. Each ANOVA model has an associated CellArray, a multi-dimensional array of Cell objects. This array is accessible through the Cells property of the AnovaModel object. The cell array has as many dimensions as there are factors in the model.

To access a specific cell, use the factor levels as indices. Using the special index Cell.All for a factor level indicates that the cell contains the totals for all levels of the factor. Setting all indices to Cell.All indicates that the cell represents summary data for the entire data set.
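The idea of cells and "all levels" indices can be sketched in a few lines of plain Python. This is not the library's API: the data, the `ALL` sentinel, and the `summarize` helper are hypothetical names chosen for illustration, and the use of the sample variance is an assumption.

```python
from collections import defaultdict
from statistics import fmean, variance

ALL = None  # hypothetical stand-in for an "all levels" index

# Hypothetical observations: (level of factor 1, level of factor 2, value).
rows = [
    ("low", "x", 1.0), ("low", "x", 3.0),
    ("low", "y", 2.0), ("low", "y", 4.0),
    ("high", "x", 5.0), ("high", "x", 7.0),
    ("high", "y", 6.0), ("high", "y", 10.0),
]

cells = defaultdict(list)
for f1, f2, value in rows:
    # Each observation belongs to its own cell and to every marginal cell.
    for key in ((f1, f2), (f1, ALL), (ALL, f2), (ALL, ALL)):
        cells[key].append(value)


def summarize(key):
    """Return (count, mean, sample variance) for the cell at `key`."""
    data = cells[key]
    return len(data), fmean(data), variance(data)


print(summarize(("low", "x")))  # one specific cell
print(summarize(("low", ALL)))  # totals over all levels of the second factor
print(summarize((ALL, ALL)))    # summary for the entire data set
```

With two factors the keys have two components, mirroring the statement that the cell array has one dimension per factor.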

A CellArray has a number of useful properties. The IsBalanced property indicates whether all the cells in the model have the same number of observations. Most ANOVA models in the Extreme Optimization Numerical Libraries for .NET require that the data be balanced in this way. The ObservationsPerCell property returns the number of observations in each cell for a balanced design. If the design is unbalanced, the value -1 is returned. Finally, the Length property returns the total number of cells in the array.
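The balance check described above amounts to comparing cell sizes. A minimal sketch in plain Python follows; the function name and the sample data are hypothetical, and the sketch only sees cells that contain at least one observation, which may differ from how the library treats empty cells.

```python
from collections import Counter


def observations_per_cell(factor_levels):
    """Return the common cell size, or -1 if the cells differ in size.

    `factor_levels` holds one tuple of factor levels per observation.
    """
    counts = Counter(factor_levels)
    sizes = set(counts.values())
    return sizes.pop() if len(sizes) == 1 else -1


balanced = [("a", "x"), ("a", "x"), ("a", "y"), ("a", "y"),
            ("b", "x"), ("b", "x"), ("b", "y"), ("b", "y")]
unbalanced = balanced[:-1]  # drop one observation from cell ("b", "y")

print(observations_per_cell(balanced))    # 2
print(observations_per_cell(unbalanced))  # -1
```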

Results of the Analysis

The results of an analysis of variance are in the same format as those of other statistical models.

The AnovaTable property returns the AnovaTable object that summarizes the results. The number of rows in the table varies with the details of the design. The TotalRow property always returns the AnovaRow for the complete data. The ErrorRow property returns the row for the residuals. The CompleteModelRow property returns the row for all the factors or interactions in the model combined. Rows corresponding to the individual factors and interactions in the model can be retrieved through the GetModelRow method.
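To show what the model, error, and total rows of such a table contain, here is a small one-way computation in plain Python. It does not use the library; the data and the dictionary layout are hypothetical, and only the textbook formulas for degrees of freedom, sums of squares, mean squares, and the F statistic are assumed.

```python
from statistics import fmean

# Hypothetical data: one factor with three levels, three observations each.
groups = {"A": [4.0, 5.0, 6.0], "B": [6.0, 7.0, 8.0], "C": [8.0, 9.0, 10.0]}

values = [y for ys in groups.values() for y in ys]
grand_mean = fmean(values)

# Variation explained by the factor (deviations of level means from the grand mean).
ss_model = sum(len(ys) * (fmean(ys) - grand_mean) ** 2 for ys in groups.values())
# Residual variation (deviations of observations from their own level mean).
ss_error = sum((y - fmean(ys)) ** 2 for ys in groups.values() for y in ys)

df_model = len(groups) - 1
df_error = len(values) - len(groups)

table = {
    "model": {"df": df_model, "ss": ss_model, "ms": ss_model / df_model},
    "error": {"df": df_error, "ss": ss_error, "ms": ss_error / df_error},
    "total": {"df": df_model + df_error, "ss": ss_model + ss_error},
}
f_statistic = table["model"]["ms"] / table["error"]["ms"]

for name, row in table.items():
    print(name, row)
print("F =", f_statistic)  # F = 12.0
```

In a design with more than one factor, the model row would be split further into one row per factor and interaction.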
