The DataExtensions
class contains extension methods for reading data from a database
into a data frame, and for converting between data tables and data frames.
Reading data frames from databases
The ReadDataFrame
method reads a data frame from an IDataReader.
It has four overloads. Since this is an extension method, the first argument
is always the IDataReader
to read from.
For the first overload, the second argument is optional.
It is an integer that specifies the initial capacity
of the data frame. For optimal performance, this value should equal the number of
rows in the data frame.
The second overload has three arguments in all. Here, the second argument is
a sequence of column names. Only the columns in the specified sequence will be
loaded into the data frame. The third argument is once again the initial capacity
of the data frame.
The third and fourth overloads take a type argument that specifies the type
of the row index of the data frame. The second argument (after the
IDataReader)
is the name of the column that contains the row index.
This is followed by (optionally) the sequence of column names to load into
the data frame and the initial capacity. The last argument is a boolean value
that specifies whether the key column should be dropped from the data frame.
The default is .
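The four overloads might be called as in the following sketch. It builds a small in-memory DataTable and uses DataTable.CreateDataReader from System.Data to obtain an IDataReader, so no database connection is needed. The sample table, its column names, and the namespace in the last using directive are assumptions made for illustration; adjust the namespace to wherever DataExtensions lives in your version of the library. Arguments are passed positionally, in the order described above.

```csharp
using System.Data;
using Extreme.Data; // assumption: namespace that contains DataExtensions

// Hypothetical sample data standing in for a database query result.
var table = new DataTable("Products");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Price", typeof(double));
table.Rows.Add(1, "Apple", 0.50);
table.Rows.Add(2, "Pear", 0.75);
table.Rows.Add(3, "Plum", 0.30);

var columns = new[] { "Name", "Price" };

// A data reader is forward-only, so each call gets a fresh reader.

// First overload: all columns, with an initial capacity of 3 rows.
using (IDataReader reader = table.CreateDataReader())
{
    var all = reader.ReadDataFrame(3);
}

// Second overload: only the listed columns, then the initial capacity.
using (IDataReader reader = table.CreateDataReader())
{
    var subset = reader.ReadDataFrame(columns, 3);
}

// Third overload: "Id" supplies the row index; the type argument is the
// type of the row keys. The last argument asks for the key column to be
// dropped from the resulting data frame.
using (IDataReader reader = table.CreateDataReader())
{
    var keyed = reader.ReadDataFrame<int>("Id", 3, true);
}

// Fourth overload: as above, but loading only the listed columns.
using (IDataReader reader = table.CreateDataReader())
{
    var keyedSubset = reader.ReadDataFrame<int>("Id", columns, 3, true);
}
```

With a real database, the reader would typically come from calling ExecuteReader on a command object instead of from a DataTable.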
Converting between data tables and data frames
The ToDataFrame
method converts a DataTable
to a data frame.
The method is once again defined as an extension method. The first argument
is always the data table to convert.
The overloads of this method parallel those for reading data frames.
The first overload takes no additional arguments. It simply converts all columns
in the data table to columns in a new data frame.
The second overload takes one additional argument: a sequence of strings
that specify the columns to include in the data frame.
The third and fourth overloads let you specify the column to use as
the row index. These overloads take the type of the row keys
as a generic type argument. The second actual argument for both
overloads is the name of the key column.
This can optionally be followed by a sequence of column names to include in the data frame.
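As a rough illustration, the four ToDataFrame overloads could be used as follows. The sample table and the namespace in the last using directive are assumptions, not taken from the library's own examples.

```csharp
using System.Data;
using Extreme.Data; // assumption: namespace that contains DataExtensions

// Hypothetical sample table.
var table = new DataTable("Products");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Price", typeof(double));
table.Rows.Add(1, "Apple", 0.50);
table.Rows.Add(2, "Pear", 0.75);

var columns = new[] { "Name", "Price" };

// First overload: every column of the table becomes a column of the frame.
var all = table.ToDataFrame();

// Second overload: only the named columns are included.
var subset = table.ToDataFrame(columns);

// Third overload: the "Id" column becomes the row index; the type
// argument is the type of the row keys.
var keyed = table.ToDataFrame<int>("Id");

// Fourth overload: key column plus an explicit list of columns to include.
var keyedSubset = table.ToDataFrame<int>("Id", columns);
```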
The ToDataTable
method performs the conversion in the other direction. It converts a data frame
to a data table. This is also an extension method, with the first argument
of type IDataFrame.
This means that the method works equally well for data frames, matrices, and
vectors.
The method has four overloads that parallel the overloads of the
ToDataFrame method.
The first overload takes no additional arguments. It simply converts all columns
in the data frame to columns in a new data table.
The second overload takes one additional argument: a sequence of column keys
that specify the columns to include in the data table.
This overload takes the type of the column keys as a generic type argument,
which can usually be inferred.
The third and fourth overloads let you specify a column name for the row index
of the data frame. The second actual argument for both
overloads is the name of the key column.
This can optionally be followed by a sequence of column keys to include in the data table,
in which case the method also takes the type of the column keys of the data frame
as a generic type argument.
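The reverse conversion might look like the sketch below. The data frame is obtained with ToDataFrame so that only methods described on this page are used. The sample data, the name "Id" chosen for the row index column, and the namespace in the last using directive are assumptions for illustration.

```csharp
using System.Data;
using Extreme.Data; // assumption: namespace that contains DataExtensions

// Hypothetical source table, converted to a data frame indexed by "Id".
var source = new DataTable("Products");
source.Columns.Add("Id", typeof(int));
source.Columns.Add("Name", typeof(string));
source.Columns.Add("Price", typeof(double));
source.Rows.Add(1, "Apple", 0.50);
source.Rows.Add(2, "Pear", 0.75);
var frame = source.ToDataFrame<int>("Id");

var columnKeys = new[] { "Name" };

// First overload: all columns of the data frame.
DataTable all = frame.ToDataTable();

// Second overload: only the listed column keys; the key type (string)
// is inferred from the argument.
DataTable subset = frame.ToDataTable(columnKeys);

// Third overload: the row index is written to a column named "Id".
DataTable keyed = frame.ToDataTable("Id");

// Fourth overload: row index column name plus the column keys to include.
DataTable keyedSubset = frame.ToDataTable("Id", columnKeys);
```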