Simple Linear Regression

Extreme Optimization Numerical Libraries for .NET Professional

Simple linear regression is a technique for analyzing a linear relationship between two variables. A common generalization is to study relationships between two variables that can be transformed into a linear relationship; we call these linearized. The SimpleRegressionModel class implements simple linear regression and supports both linear and linearized regression.

Constructing Simple Regression Models

The SimpleRegressionModel class has six constructors that come in pairs. The first constructor takes two vectors and creates a linear regression model. The first vector contains the data for the dependent variable. The second vector contains the data for the independent variable. The dependent variable is named y, while the independent variable is named x.

C#
double[] xData = {0.2, 337.4, 118.2, 884.6, 10.1,
    226.5, 666.3, 996.3, 448.6, 777.0, 558.2, 0.4, 0.6,
    775.5, 666.9, 338.0, 447.5, 11.6, 556.0, 228.1, 995.8,
    887.6, 120.2, 0.3, 0.3, 556.8, 339.1, 887.2, 999.0,
    779.0, 11.1, 118.3, 229.2, 669.1, 448.9, 0.5};
double[] yData = {0.1, 338.8, 118.1, 888.0, 9.2,
    228.1, 668.5, 998.5, 449.1, 778.9, 559.2, 0.3, 0.1,
    778.1, 668.8, 339.3, 448.9, 10.8, 557.7, 228.3, 998.0,
    888.8, 119.6, 0.3, 0.6, 557.6, 339.3, 888.0, 998.5,
    778.9,  10.2 , 117.6, 228.9, 668.4, 449.2,   0.2};
var y = Vector.Create(yData);
var x = Vector.Create(xData);
var model1 = new SimpleRegressionModel(y, x);
var model2 = new SimpleRegressionModel(yData, xData);

VB
Dim xData = {0.2, 337.4, 118.2, 884.6, 10.1,
    226.5, 666.3, 996.3, 448.6, 777.0, 558.2, 0.4, 0.6,
    775.5, 666.9, 338.0, 447.5, 11.6, 556.0, 228.1, 995.8,
    887.6, 120.2, 0.3, 0.3, 556.8, 339.1, 887.2, 999.0,
    779.0, 11.1, 118.3, 229.2, 669.1, 448.9, 0.5}
Dim yData = {0.1, 338.8, 118.1, 888.0, 9.2,
    228.1, 668.5, 998.5, 449.1, 778.9, 559.2, 0.3, 0.1,
    778.1, 668.8, 339.3, 448.9, 10.8, 557.7, 228.3, 998.0,
    888.8, 119.6, 0.3, 0.6, 557.6, 339.3, 888.0, 998.5,
    778.9, 10.2, 117.6, 228.9, 668.4, 449.2, 0.2}
Dim y = Vector.Create(yData)
Dim x = Vector.Create(xData)
Dim model1 = New SimpleRegressionModel(y, x)
Dim model2 = New SimpleRegressionModel(yData, xData)

F#
let xData = [| 0.2; 337.4; 118.2; 884.6; 10.1;
    226.5; 666.3; 996.3; 448.6; 777.0; 558.2; 0.4; 0.6;
    775.5; 666.9; 338.0; 447.5; 11.6; 556.0; 228.1; 995.8;
    887.6; 120.2; 0.3; 0.3; 556.8; 339.1; 887.2; 999.0;
    779.0; 11.1; 118.3; 229.2; 669.1; 448.9; 0.5 |]
let yData = [| 0.1; 338.8; 118.1; 888.0; 9.2;
    228.1; 668.5; 998.5; 449.1; 778.9; 559.2; 0.3; 0.1;
    778.1; 668.8; 339.3; 448.9; 10.8; 557.7; 228.3; 998.0;
    888.8; 119.6; 0.3; 0.6; 557.6; 339.3; 888.0; 998.5;
    778.9;  10.2 ; 117.6; 228.9; 668.4; 449.2;   0.2 |]
let y = Vector.Create(yData)
let x = Vector.Create(xData)
let model1 = SimpleRegressionModel(y, x)
let model2 = SimpleRegressionModel(
                Vector.Create(yData), Vector.Create(xData))

Note that, because arrays can be implicitly converted to vectors, it is also possible to pass arrays instead of vectors.

The second constructor takes three arguments. The first argument is an IDataFrame (a DataFrame<R, C> or Matrix<T>) that contains the variables to be used in the regression. The second argument is a string containing the name of the dependent variable. The third argument is a string containing the name of the independent variable. The two names must exist in the data frame and their element type must be convertible to Double.

C#
var dataFrame = DataFrame.FromColumns(new Dictionary<string, object>()
    { { "y", y }, {"x", x }});
var model3 = new SimpleRegressionModel(dataFrame, "y", "x");

VB
Dim frame = DataFrame.FromColumns(New Dictionary(Of String, Object)() From
    {{"y", y}, {"x", x}})
Dim model3 = New SimpleRegressionModel(frame, "y", "x")

F#
let columns = Dictionary<string,obj>()
[ "x", x ; "y", y ]  |> Seq.iter columns.Add
let dataFrame = DataFrame.FromColumns<string>(columns)
let model3 = SimpleRegressionModel(dataFrame, "y", "x")

The third and fourth constructors are similar to the first two, but add a third argument that specifies the kind of relationship between the two variables. This argument is of type SimpleRegressionKind, which can take on the following values:

Value         Description

Exponential   The logarithm of the dependent variable depends linearly on the independent variable: Y = a * b^X.

Inverse       The dependent variable depends linearly on the inverse (reciprocal) of the independent variable: Y = a / X + b.

Linear        The dependent variable depends linearly on the independent variable: Y = a X + b. This is the default.

Logarithmic   The dependent variable depends linearly on the logarithm of the independent variable: Y = a log X + b.

Power         The logarithm of the dependent variable depends linearly on the logarithm of the independent variable: Y = a * X^b.

C#
var model4 = new SimpleRegressionModel(y, x, 
    SimpleRegressionKind.Exponential);
var model5 = new SimpleRegressionModel(dataFrame, "y", "x", 
    SimpleRegressionKind.Exponential);

VB
Dim model4 = New SimpleRegressionModel(y, x,
    SimpleRegressionKind.Exponential)
Dim model5 = New SimpleRegressionModel(frame, "y", "x",
    SimpleRegressionKind.Exponential)

F#
let model4 = SimpleRegressionModel(y, x, SimpleRegressionKind.Exponential)
let model5 = SimpleRegressionModel(dataFrame, "y", "x", 
                    SimpleRegressionKind.Exponential)
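
To make the idea of a linearized model concrete, the following Python sketch fits the Exponential and Power kinds by transforming the data and then running an ordinary straight-line fit. It is independent of the library and only illustrates the arithmetic implied by the table above; the function names fit_line, fit_exponential and fit_power are hypothetical, and whether SimpleRegressionModel performs exactly these steps internally is an assumption.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_exponential(xs, ys):
    """Fit Y = a * b**X by regressing log(Y) on X (requires Y > 0)."""
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), math.exp(slope)  # (a, b)

def fit_power(xs, ys):
    """Fit Y = a * X**b by regressing log(Y) on log(X) (requires X, Y > 0)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    slope, intercept = fit_line(log_x, log_y)
    return math.exp(intercept), slope  # (a, b)

# Synthetic, noise-free data (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
a, b = fit_exponential(xs, [3.0 * 2.0 ** x for x in xs])
p, q = fit_power(xs, [5.0 * x ** 1.5 for x in xs])
```

Because the fit is done on transformed values, the residuals that are minimized are those of the transformed variables, not of the original Y.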

The fifth constructor is like the first but has 5 arguments in total: The third argument is a vector containing weights for the observations. The fourth argument is the kind of relationship between the variables. The fifth argument is a boolean that indicates whether the intercept (constant term) should be excluded from the model. The sixth constructor is like the second and takes the same 3 additional arguments:

C#
var weights = Vector.Reciprocal(x);
var model6 = new SimpleRegressionModel(y, x, weights,
    kind: SimpleRegressionKind.Exponential, noIntercept: false);

VB
Dim weights = Vector.Reciprocal(x)
Dim model6 = New SimpleRegressionModel(y, x, weights,
    SimpleRegressionKind.Exponential, False)

F#
let weights = Vector.Reciprocal(x)
let model6 = SimpleRegressionModel(y, x, weights,
                kind= SimpleRegressionKind.Exponential, noIntercept= false)
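
As a rough illustration of what observation weights and the no-intercept option mean, the Python sketch below minimizes the weighted sum of squared residuals for a straight line. It is a library-independent sketch; the function name fit_weighted_line and its no_intercept parameter are hypothetical, and the exact weighting convention used by SimpleRegressionModel is an assumption.

```python
def fit_weighted_line(xs, ys, ws, no_intercept=False):
    """Minimize sum(w * (y - slope * x - intercept) ** 2) over the observations."""
    if no_intercept:
        # The line is forced through the origin, so only the slope is estimated.
        slope = (sum(w * x * y for x, y, w in zip(xs, ys, ws))
                 / sum(w * x * x for x, w in zip(xs, ws)))
        return slope, 0.0
    total_w = sum(ws)
    mean_x = sum(w * x for x, w in zip(xs, ws)) / total_w
    mean_y = sum(w * y for y, w in zip(ys, ws)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for x, w in zip(xs, ws))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in zip(xs, ys, ws))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# A zero weight removes an observation from the fit entirely.
slope, intercept = fit_weighted_line([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], [1.0, 1.0, 0.0])

# Without an intercept, data on the line y = 3x gives a slope of 3.
origin_slope, _ = fit_weighted_line([1.0, 2.0, 3.0], [3.0, 6.0, 9.0],
                                    [1.0, 0.5, 0.25], no_intercept=True)
```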

Computing the Regression Curve

The Fit method performs the actual analysis. Most properties and methods throw an exception when they are accessed before the Fit method is called. You can verify that the model has been calculated by inspecting the Computed property.

The GetRegressionLine method returns a Polynomial that represents the regression line. For linearized regression, the GetRegressionCurve method returns a Curve object that represents the fitted curve:

C#
model1.Fit();
var line = model1.GetRegressionLine();
model4.Fit();
var curve = model4.GetRegressionCurve();

VB
model1.Fit()
Dim line = model1.GetRegressionLine()
model4.Fit()
Dim curve = model4.GetRegressionCurve()

F#
model1.Fit()
let line = model1.GetRegressionLine()
model4.Fit()
let curve = model4.GetRegressionCurve()

The Predictions property returns a vector that contains the values of the dependent variable as predicted by the model. The Residuals property returns a vector containing the difference between the actual and the predicted values of the dependent variable. Both vectors contain one element for each observation.
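
The relationship between predictions and residuals can be sketched in a few lines of Python. This is a library-independent illustration using the textbook least-squares formulas; the data and the helper name fit_line are hypothetical and chosen only for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)

# One predicted value and one residual per observation.
predictions = [slope * x + intercept for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]
```

For a model that includes an intercept, the residuals sum to zero up to rounding error, which is a quick sanity check on any fit.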

Regression Parameters

The Parameters property of the SimpleRegressionModel class returns a ParameterVector<T> that contains the parameters of the regression model. The members of this collection are of type Parameter<T>. Regression parameters are created by the model. You cannot create them directly.

A simple linear regression model has two parameters. The first, with index 0, is the intercept: the Y-value where the regression line crosses the Y-axis. The second parameter, with index 1, is the slope of the regression line.

The Parameter class has four useful properties. The Value property returns the numerical value of the parameter, while the StandardError property returns the standard error of the estimate, that is, the estimated standard deviation of the parameter's sampling distribution.

The Statistic property returns the value of the t-statistic corresponding to the hypothesis that the parameter equals zero. The PValue property returns the corresponding p-value. A high p-value indicates that the variable associated with the parameter does not make a significant contribution to explaining the data. The p-value always corresponds to a two-tailed test. The following example prints the properties of the slope parameter of our earlier example:

C#
var slope = model1.Parameters[1];
Console.WriteLine("Name:        {0}", slope.Name);
Console.WriteLine("Value:       {0}", slope.Value);
Console.WriteLine("St.Err.:     {0}", slope.StandardError);
Console.WriteLine("t-statistic: {0}", slope.Statistic);
Console.WriteLine("p-value:     {0}", slope.PValue);

VB
Dim slope = model1.Parameters(1)
Console.WriteLine("Name:        {0}", slope.Name)
Console.WriteLine("Value:       {0}", slope.Value)
Console.WriteLine("St.Err.:     {0}", slope.StandardError)
Console.WriteLine("t-statistic: {0}", slope.Statistic)
Console.WriteLine("p-value:     {0}", slope.PValue)

F#
let slope = model1.Parameters.[1]
Console.WriteLine("Name:        {0}", slope.Name)
Console.WriteLine("Value:       {0}", slope.Value)
Console.WriteLine("St.Err.:     {0}", slope.StandardError)
Console.WriteLine("t-statistic: {0}", slope.Statistic)
Console.WriteLine("p-value:     {0}", slope.PValue)

The Parameter class has one method: GetConfidenceInterval. This method takes one argument: a confidence level between 0 and 1. A value of 0.95 corresponds to a confidence level of 95%. The method returns the confidence interval for the parameter at the specified confidence level as an Interval structure.
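
The quantities described above follow from standard formulas, sketched here in Python for the slope parameter. This is a library-independent illustration, not the library's implementation; the function names slope_inference and confidence_interval are hypothetical. The p-value is omitted because it needs the cumulative distribution function of the t-distribution with n - 2 degrees of freedom, which the Python standard library does not provide.

```python
import math

def slope_inference(xs, ys):
    """Return the slope, its standard error, and the t-statistic for slope = 0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    residual_variance = rss / (n - 2)  # two parameters were estimated
    std_error = math.sqrt(residual_variance / sxx)
    return slope, std_error, slope / std_error

def confidence_interval(value, std_error, t_quantile):
    """Symmetric interval: value +/- t_quantile * std_error."""
    half_width = t_quantile * std_error
    return value - half_width, value + half_width

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, std_error, t_statistic = slope_inference(xs, ys)

# 3.182 is the tabulated 97.5% quantile of the t-distribution with
# n - 2 = 3 degrees of freedom, giving a 95% two-sided interval.
lower, upper = confidence_interval(slope, std_error, 3.182)
```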

Verifying the Quality of the Regression Line

The ResidualSumOfSquares property gives the sum of the squares of the residuals. The regression line was found by minimizing this value. The StandardError property gives the standard error of the regression, an estimate of the standard deviation of the residuals.

The RSquared property returns the coefficient of determination. It is the proportion of the variation in the data that is explained by the model. For a model with an intercept, its value is always between 0 and 1, where 0 means the model explains nothing and 1 means the model explains the data perfectly.

An entirely different assessment is available through an analysis of variance. Here, the variation in the data is decomposed into a component explained by the model, and the variation in the residuals. The FStatistic property returns the F-statistic for the ratio of these two variances. The PValue property returns the corresponding p-value. A low p-value means that it is unlikely that the variation in the model is the same as the variation in the residuals. This means that the model is significant.

The results of the analysis of variance are also summarized in the regression model's ANOVA table, returned by the AnovaTable property.
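
The decomposition behind these quality measures can be sketched in Python as follows. This is a library-independent illustration for a straight-line model with an intercept; the function name regression_anova and the dictionary keys are hypothetical, and the p-value is omitted because it requires the F-distribution, which the Python standard library does not provide.

```python
import math

def regression_anova(xs, ys):
    """Split the total variation into model and residual parts."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predictions = [slope * x + intercept for x in xs]

    total_ss = sum((y - mean_y) ** 2 for y in ys)
    model_ss = sum((p - mean_y) ** 2 for p in predictions)
    residual_ss = sum((y - p) ** 2 for y, p in zip(ys, predictions))

    return {
        "total_ss": total_ss,
        "model_ss": model_ss,
        "residual_ss": residual_ss,
        "r_squared": model_ss / total_ss,
        # Standard error of the regression: n - 2 residual degrees of freedom.
        "standard_error": math.sqrt(residual_ss / (n - 2)),
        # The model has 1 degree of freedom, the residuals have n - 2.
        "f_statistic": (model_ss / 1) / (residual_ss / (n - 2)),
    }

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
result = regression_anova(xs, ys)
```

In simple linear regression the F-statistic equals the square of the slope's t-statistic, so both tests lead to the same conclusion about the model's significance.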

Copyright (c) 2004-2021 ExoAnalytics Inc.
