Principal Component Analysis (PCA) QuickStart Sample (F#)

Illustrates how to perform a principal component analysis (PCA) in F# using classes in the Extreme.Statistics.Multivariate namespace.


open System

open Extreme.Mathematics
open Extreme.Mathematics.LinearAlgebra.IO
open Extreme.Statistics
open Extreme.Statistics.Multivariate

// Demonstrates how to use classes that implement
// Principal Component Analysis (PCA).

// This QuickStart Sample demonstrates how to perform
// a principal component analysis on a set of data.
// The classes used in this sample reside in the
// Extreme.Statistics.Multivariate namespace.

// First, our dataset, 'depress.txt', which is from
//     Computer-Aided Multivariate Analysis, 4th Edition
//     by A. A. Afifi, V. Clark and S. May, chapter 16
//     See

// The data is in delimited text format. Use a matrix reader to load it into a matrix.
let m = 
    use reader = new DelimitedTextMatrixReader(@"..\..\..\..\Data\Depress.txt")
    reader.MergeConsecutiveDelimiters <- true
    reader.SetColumnDelimiters(' ')
    let m = reader.ReadMatrix()
    // The data we want is in columns 8 through 27:
    m.GetSubmatrix(0, m.RowCount - 1, 8, 27)

// Principal component analysis

// We can construct PCA objects in many ways. Since we have the data in a matrix,
// we use the constructor that takes a matrix as input.
let pca = PrincipalComponentAnalysis(m)
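// and immediately perform the analysis. The call itself is absent from this
// listing; Compute() is assumed here to be the method that runs it.
pca.Compute()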

// We can get the contributions of each component:
printfn " #    Eigenvalue Difference Contribution Contrib. %%"
for i in 0..4 do
    // We get the ith component from the model...
    let componenti = pca.Components.[i]
    // and write out its properties
    printfn "%2d%12.4f%11.4f%14.3f%%%10.3f%%"
        i componenti.Eigenvalue componenti.EigenvalueDifference
        (100.0 * componenti.ProportionOfVariance)
        (100.0 * componenti.CumulativeProportionOfVariance)
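
// As a rough illustration of how the last two columns relate to the eigenvalues,
// here is a plain F# sketch that does not use the library. The eigenvalues are
// made up for the example (sketchEigenvalues is a hypothetical name, not data
// from depress.txt), and this is not necessarily how the library computes them.
let sketchEigenvalues = [| 4.0; 2.0; 1.0; 1.0 |]
let sketchTotal = Array.sum sketchEigenvalues
// Proportion of variance: each eigenvalue divided by the sum of all eigenvalues.
let sketchProportions = sketchEigenvalues |> Array.map (fun ev -> ev / sketchTotal)
// Cumulative proportion: the running sum of those proportions
// (scan also emits the initial 0.0, which the slice drops).
let sketchCumulative = (Array.scan (+) 0.0 sketchProportions).[1..]
// sketchProportions is [|0.5; 0.25; 0.125; 0.125|]
// sketchCumulative  is [|0.5; 0.75; 0.875; 1.0|]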

// To get the proportions for all components, use the
// properties of the PCA object:
let proportions = pca.VarianceProportions

// To get the number of components that explain a given proportion
// of the variation, use the GetVarianceThreshold method:
let count = pca.GetVarianceThreshold(0.9)
printfn "Components needed to explain 90%% of variation: %d" count
printfn ""
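
// A minimal sketch of what such a threshold lookup could compute, assuming it
// returns the smallest number of leading components whose cumulative proportion
// reaches the target. countForThreshold and demoProportions are hypothetical
// names for illustration; the library's own implementation may differ.
let countForThreshold (threshold : float) (proportions : float[]) =
    // Running sum of the proportions, dropping the initial 0.0 emitted by scan.
    let cumulative = (Array.scan (+) 0.0 proportions).[1..]
    match cumulative |> Array.tryFindIndex (fun c -> c >= threshold) with
    | Some index -> index + 1        // convert the 0-based index to a count
    | None -> proportions.Length     // threshold never reached: keep every component
let demoProportions = [| 0.5; 0.25; 0.125; 0.125 |]
let demoCount = countForThreshold 0.8 demoProportions   // 3 for these made-up values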

// The Value property of each component gives the component vector itself:
printfn "Components:"
printfn "Var.      1       2       3       4       5"
let pcs = pca.Components
for i in 0..pcs.Count-1 do
    printfn "%4d%8.4f%8.4f%8.4f%8.4f%8.4f" i
        pcs.[0].Value.[i] pcs.[1].Value.[i] pcs.[2].Value.[i] 
        pcs.[3].Value.[i] pcs.[4].Value.[i]
printfn ""

// The scores are the coefficients of the observations expressed as a combination
// of principal components.
let scores = pca.ScoreMatrix
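
// To make that description concrete, here is a small library-free sketch, assuming
// the scores are the mean-centered observations projected onto each component.
// Whether the library also rescales the variables first is not shown here, and
// centerColumns, scoreRows, demoData and demoAxes are hypothetical names.
let centerColumns (data : float[][]) =
    let means =
        Array.init data.[0].Length (fun j -> data |> Array.averageBy (fun row -> row.[j]))
    data |> Array.map (fun row -> Array.map2 (-) row means)
let scoreRows (axes : float[][]) (centered : float[][]) =
    // Each score is the dot product of a centered row with one component vector.
    centered |> Array.map (fun row ->
        axes |> Array.map (fun axis -> Array.fold2 (fun acc x w -> acc + x * w) 0.0 row axis))
// Three made-up observations lying on a line, so one axis captures all the variation.
let demoData = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |]; [| 5.0; 6.0 |] |]
let demoR = 1.0 / sqrt 2.0
let demoAxes = [| [| demoR; demoR |]; [| demoR; -demoR |] |]
let demoScores = demoData |> centerColumns |> scoreRows demoAxes
// Every observation scores 0 on the second axis, which carries no variation here.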

// To get the predicted observations based on a specified number of components,
// use the GetPredictions method.
let prediction = pca.GetPredictions(count)
printfn "Predictions using %d components:" count
printfn "   Pr. 1  Act. 1   Pr. 2  Act. 2   Pr. 3  Act. 3   Pr. 4  Act. 4"
for i in 0..9 do
    printfn "%8.4f%8.4f%8.4f%8.4f%8.4f%8.4f%8.4f%8.4f"
        (prediction.[i, 0]) m.[i, 0]
        (prediction.[i, 1]) m.[i, 1]
        (prediction.[i, 2]) m.[i, 2]
        (prediction.[i, 3]) m.[i, 3]
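
// A library-free sketch of the idea behind such predictions, assuming a prediction
// is the column means plus the first k score-weighted component vectors.
// reconstructRow and the recon* values are hypothetical, made-up illustrations;
// the library's GetPredictions may differ in detail (for example in scaling).
let reconstructRow (means : float[]) (axes : float[][]) (rowScores : float[]) (k : int) =
    let contribution j =
        // Sum the first k components, each weighted by the observation's score.
        Seq.init k (fun c -> rowScores.[c] * axes.[c].[j]) |> Seq.sum
    means |> Array.mapi (fun j mean -> mean + contribution j)
let reconR = 1.0 / sqrt 2.0
let reconAxes = [| [| reconR; reconR |]; [| reconR; -reconR |] |]
// An observation with score -2*sqrt(2) on the first axis and 0 on the second,
// around column means (3, 4), is recovered as approximately (1, 2) from one component.
let reconRow = reconstructRow [| 3.0; 4.0 |] reconAxes [| -2.0 * sqrt 2.0; 0.0 |] 1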

printf "Press Enter to exit."
Console.ReadLine() |> ignore