Last week, we released version 6.0 of our Extreme Optimization Numerical Libraries for .NET. We’ve introduced a number of great new features. Here are some highlights.
The linear algebra library has undergone a major overhaul:
- All vector and matrix types are generic and support arbitrary element types.
- Thanks to type inference, the element type rarely needs to be specified.
- New mutability options let you create read-only vectors, and writable vectors with copy-on-write semantics.
- We’ve filled in missing operations and consistently offer them in three variants: perform the operation in place, return the result in a new array, or write the result to an existing array.
- The native linear algebra libraries now support Conditional Numerical Reproducibility.
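To make the three operation variants concrete, here is a minimal sketch in plain Python. It is a language-neutral illustration of the pattern only, not the library's .NET API; the function names `add`, `add_in_place`, and `add_into` are hypothetical.

```python
# Illustration only: hypothetical names, not the library's API.

def add(a, b):
    """Variant 1: return the element-wise sum in a new list."""
    return [x + y for x, y in zip(a, b)]


def add_in_place(a, b):
    """Variant 2: overwrite a with the element-wise sum and return it."""
    for i, y in enumerate(b):
        a[i] += y
    return a


def add_into(a, b, result):
    """Variant 3: write the element-wise sum into an existing list."""
    for i, (x, y) in enumerate(zip(a, b)):
        result[i] = x + y
    return result


a = [1, 2, 3]
b = [10, 20, 30]
buffer = [0, 0, 0]

fresh = add(a, b)        # a and b are left untouched
add_into(a, b, buffer)   # reuses preallocated storage
add_in_place(a, b)       # a itself now holds the sum
```

The in-place and existing-array variants avoid allocating a result on every call, which tends to matter most inside tight loops over large vectors.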
The data analysis library has been greatly expanded:
- Group rows by value or quantile, or use sliding and expanding windows, partitions, time-based resampling, and 2D pivot tables.
- Many new aggregator functions, with efficient implementations for specific types of groupings, for example moving averages.
- Matrices and vectors can act as data frames. The same functionality that is available for data frames, like data manipulation, aggregation, and automatic missing value handling, can also be used on vectors and matrices.
- LINQ queries are supported on data frames, matrices, and vectors.
- Create data frames from databases or arbitrary .NET objects, and import from and export to CSV files. Support for other file formats, including R data files and SAS files, will be released in the coming weeks and months.
- Many other improvements, including data transformations and lookup/join on nearest.
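As an illustration of why a specialized aggregator can pay off for sliding windows, the sketch below computes a moving average with a running sum, so each step costs constant time instead of re-summing the whole window. It is plain Python written for this post's readers as a concept sketch; the function name `moving_average` is hypothetical and this is not necessarily how the library implements it.

```python
# Illustration only: a running-sum moving average, not the library's code.

def moving_average(values, window):
    """Return the averages of each full sliding window of the given size."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value                  # element entering the window
        if i >= window:
            running_sum -= values[i - window] # element leaving the window
        if i >= window - 1:                   # window is full from here on
            averages.append(running_sum / window)
    return averages


print(moving_average([1, 2, 3, 4, 5], 2))  # [1.5, 2.5, 3.5, 4.5]
```

A generic aggregator applied to each window would do work proportional to the window size at every position; the running sum removes that factor. For long floating-point series, accumulated rounding error in the running sum is a trade-off worth keeping in mind.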
The statistics library has been integrated with the data analysis library:
- We’ve removed the distinction between vectors and statistical variables. Everything is now a vector.
- Covariance matrices, parameter vectors and similar objects are automatically labeled with variable names.
- Residuals, predictions, and similar vectors automatically get the observations’ labels.
- Categorical variables are automatically expanded to indicator variables as needed.
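For readers unfamiliar with indicator variables, here is a minimal plain-Python sketch of the expansion. It only shows the idea; the function name `to_indicators` is hypothetical, and the library performs this step for you without any such call.

```python
# Illustration only: expanding a categorical column into 0/1 indicator columns.

def to_indicators(values):
    """Map each distinct level to a list of 0/1 flags, one per observation."""
    levels = sorted(set(values))
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
    }


colors = ["red", "blue", "red", "green"]
indicators = to_indicators(colors)
print(indicators["red"])   # [1, 0, 1, 0]
print(indicators["blue"])  # [0, 1, 0, 0]
```

In a regression model with an intercept, one level is usually dropped to avoid collinearity; this sketch keeps every level for clarity.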
We’ve also greatly improved the experience for interactive computing. Many objects have pretty printers that return descriptions suitable for an interactive environment, for example:
- Data frames, vectors, and matrices limit the displayed rows and columns to fit the screen.
- Statistical models give a summary of parameter estimates and diagnostic information.
- Statistical tests give a brief description of the test results.
There’s more coming in that area.
Unfortunately, all this goodness did require us to make some breaking changes. Most drastic is the move to generic vectors and matrices, which caused changes in every area of the library. Fortunately, the required changes are not too extensive: just adding the generic type argument in declarations and some constructor calls takes care of most issues. The data analysis library has been moved to a different namespace, but otherwise the changes are minimal. Code using the statistics library requires considerably more changes, but we’re working to make that transition easier with a small ‘backwards compatibility’ library that we’ll be releasing soon.
The new version installs side-by-side with earlier versions, so your existing code will continue to work.
In the coming weeks, I’ll be posting in more detail about different aspects of the new version. Stay tuned!