An index is a collection of keys that is used
to label the rows and columns of a data frame
or matrix, or the elements of a vector.
Once an index has been assigned to a dimension,
it will propagate through calculations.
For example, applying a mathematical function to a vector with an index will
return a vector with the same index.
One feature that makes indexes particularly useful is automatic alignment.
When a calculation involves two or more operands that have an index,
the operands are aligned on their indexes. By default, an outer join is performed:
if a key in one index does not appear in the other index, the result for that key is a missing value.
For example, given two vectors:
Vector A:    Vector B:
a: 1         b: 20
b: 2         e: 50
c: 3         c: 30
d: 4         a: 10
their sum would be
A + B:
a: 11
b: 22
c: 33
d: -
e: -
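The alignment shown above can be mimicked with plain dictionaries. The following is a minimal sketch of an outer join on keys, using only the .NET base class library. The AlignedSum helper is hypothetical and is not part of the library, and the result is sorted by key purely to make the output easy to compare with the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not a library method): outer-join two keyed vectors and add them.
// A key that is missing from either operand produces null, standing in for a missing value.
static SortedDictionary<string, double?> AlignedSum(
    Dictionary<string, double> left, Dictionary<string, double> right)
{
    var result = new SortedDictionary<string, double?>(StringComparer.Ordinal);
    foreach (var key in left.Keys.Union(right.Keys))
    {
        if (left.TryGetValue(key, out var x) && right.TryGetValue(key, out var y))
            result[key] = x + y;
        else
            result[key] = null;
    }
    return result;
}

var a = new Dictionary<string, double> { ["a"] = 1, ["b"] = 2, ["c"] = 3, ["d"] = 4 };
var b = new Dictionary<string, double> { ["b"] = 20, ["e"] = 50, ["c"] = 30, ["a"] = 10 };

var sum = AlignedSum(a, b);
foreach (var pair in sum)
{
    string text = pair.Value?.ToString() ?? "-";
    Console.WriteLine($"{pair.Key}: {text}");
}
```

The keys d and e each appear in only one operand, so their entries print as "-".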
Indexes are used to label the rows and columns of both data frames and matrices.
There are a couple of important differences.
A data frame must have a row index and a column index.
Moreover, the types of the keys must be known statically at compile time.
A matrix may have a row index or a column index or both,
but they are optional.
Furthermore, the type of the keys need not be known and may even change at runtime.
Indexes are also used to enumerate the categories in a categorical variable,
and to label the bins in a histogram.
Two types are used to represent indexes.
The Index<T>
class represents an index where the generic type argument
specifies the type of the keys.
This type is used to specify the indexes of a data frame.
The IIndex
interface represents an index where the type of the keys is not specified.
This type is used to specify the indexes of matrices and vectors.
In addition, the Index
class contains static methods for creating and manipulating indexes.
Indexes are created by calling one of the methods of the static
Index class.
The Create method
takes a sequence of key values as its only argument and constructs an index
containing these values. The order of the values is preserved.
var index = Index.Create(new[] { "a", "b", "c", "d" });
var a = Vector.Create(new double[] { 1.0, 2.0, 3.0, 4.0 });
a.Index = index;
Console.WriteLine(a);
Dim idx = Index.Create({"a", "b", "c", "d"})
Dim a = Vector.Create({1.0, 2.0, 3.0, 4.0})
a.Index = idx
Console.WriteLine(a)
let index = Index.Create([| "a"; "b"; "c"; "d" |])
let a = Vector.Create([| 1.0; 2.0; 3.0; 4.0 |])
a.Index <- index
printfn "%O" a
The Default method
creates an index of row numbers. This index can be used when
no suitable data is available to serve as an index, or when
the row index is not specified.
This method has two overloads. The first overload takes one argument:
the total length of the index. The keys will run from 0 to one less than the length.
The second overload takes two arguments: the first argument is the first row number (key value),
and the second argument is the (exclusive) last row number. The keys will run from
the first row number to one less than the last row number.
var numbers = Index.Default(10);
var numbers2 = Index.Default(10, 20);
Dim numbers = Index.Default(10)
Dim numbers2 = Index.Default(10, 20)
let numbers = Index.Default(10)
let numbers2 = Index.Default(10, 20)
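To make the exclusive upper bound concrete, the sketch below lists the keys that the two overloads are described as producing. It uses only the base class library. DefaultKeys and DefaultKeysRange are hypothetical stand-ins, not library methods, and plain integers are assumed for the key type.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the two overloads described above.
// One argument: keys run from 0 to length - 1.
static int[] DefaultKeys(int length) =>
    Enumerable.Range(0, length).ToArray();

// Two arguments: keys run from first to lastExclusive - 1.
static int[] DefaultKeysRange(int first, int lastExclusive) =>
    Enumerable.Range(first, lastExclusive - first).ToArray();

var keys1 = DefaultKeys(10);
var keys2 = DefaultKeysRange(10, 20);

Console.WriteLine(string.Join(", ", keys1)); // 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Console.WriteLine(string.Join(", ", keys2)); // 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
```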
The CreateDateRange
method constructs an index of date values. It has five overloads. The simplest
overload takes two arguments: the first argument is the start date.
The second argument is the number of dates in the index:
var dateIndex = Index.CreateDateRange(new DateTime(2015, 4, 25), 10);
Dim dateIndex = Index.CreateDateRange(New DateTime(2015, 4, 25), 10)
let dateIndex = Index.CreateDateRange(new DateTime(2015, 4, 25), 10)
There is one more special kind of index: an interval index.
This is an index whose keys consist of contiguous intervals.
For example, say we want to classify persons by age group.
We can create an interval index by passing an array of bin boundaries
to the
CreateBins
method. An optional argument, of type
SpecialBins,
allows us to specify how to handle values that are outside the supplied boundaries.
In the example below, a special bin is created for all values over 65:
int[] ages = { 0, 18, 35, 65 };
var ageGroups = Index.CreateBins(ages, SpecialBins.AboveMaximum);
Dim ages() As Integer = {0, 18, 35, 65}
Dim ageGroups = Index.CreateBins(ages, SpecialBins.AboveMaximum)
let ages = [| 0; 18; 35; 65 |]
let ageGroups = Index.CreateBins(ages, SpecialBins.AboveMaximum)
Interval indexes are used for binning operations and for creating histograms.
See the section on Histograms
for more details.
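As an illustration of what an interval index with an extra bin above the maximum represents, the sketch below assigns a few ages to bins. It uses only the base class library. FindBin is a hypothetical helper, and the convention that each bin includes its lower bound and excludes its upper bound is an assumption that the text above does not confirm.

```csharp
using System;

// Hypothetical helper: returns the position of the bin that contains 'value',
// or -1 if the value lies below the first boundary.
// Assumption: each bin includes its lower bound and excludes its upper bound.
static int FindBin(int[] boundaries, int value)
{
    if (value < boundaries[0]) return -1;
    for (int i = 1; i < boundaries.Length; i++)
    {
        if (value < boundaries[i]) return i - 1;
    }
    return boundaries.Length - 1; // the extra bin above the maximum
}

int[] ages = { 0, 18, 35, 65 };
string[] labels = { "[0, 18)", "[18, 35)", "[35, 65)", "65 and over" };

foreach (var age in new[] { 5, 18, 64, 70 })
{
    Console.WriteLine($"{age} -> {labels[FindBin(ages, age)]}");
}
```

With four boundaries there are three regular bins plus the special bin, so the sketch needs four labels.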
Many other operations also create indexes.
For example, when you import a data frame from a text file, you can specify
which column should act as the row index and the index will be created automatically.
Several structural transformation operations like grouping, pivoting, and stacking
result in new indexes being created to reflect the new structure.
Indexes can also be created by taking slices or selecting multiple keys from an existing index.
Indexes are read-only. The indexer property takes an integer and returns
the key at the specified position.
An optional second argument specifies the level of the key value to return.
This is only meaningful for hierarchical indexes,
where the standard indexer returns the key as a tuple.
A third indexer takes a list of integers and returns an index that contains
only the keys at those positions, preserving the order.
The GetSlice(Int32, Int32, Int32)
method returns a new index that contains the key values between the specified
start and end positions at the specified interval.
Several properties give more information about the index.
The number of keys in the index is given by
Length.
The IsSorted
property indicates whether the keys are sorted.
This property returns true if the keys are sorted in ascending or in descending order.
The IsUnique
property indicates whether every key appears only once in the index.
var length = index.Length;
var i2 = index[2];
var indexes = new int[] { 2, 1 };
var subIndex = index[indexes];
var sorted = index.IsSorted;
var index2 = Index.Create(new[] { "a", "c", "b", "d" });
var sorted2 = index2.IsSorted;
var unique = index2.IsUnique;
Dim length = idx.Length
Dim i2 = idx(2)
Dim indexes = New Integer() {2, 1}
Dim subIndex = idx(indexes)
Dim sorted = idx.IsSorted
Dim idx2 = Index.Create({"a", "c", "b", "d"})
Dim sorted2 = idx2.IsSorted
Dim unique = idx2.IsUnique
let length = index.Length
let i2 = index.[2]
let indexes = [| 2; 1 |]
let subIndex = index.[indexes]
let sorted = index.IsSorted
let index2 = Index.Create([| "a"; "c"; "b"; "d" |])
let sorted2 = index2.IsSorted
let unique = index2.IsUnique
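The meaning of the two properties can be spelled out with a small base-class-library sketch. IsSorted and IsUnique below are hypothetical local helpers that mirror the descriptions above; they are not the library's implementation, and ordinal string comparison is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true if the keys are in ascending or in descending order.
static bool IsSorted(string[] keys)
{
    bool ascending = true, descending = true;
    for (int i = 1; i < keys.Length; i++)
    {
        int c = string.CompareOrdinal(keys[i - 1], keys[i]);
        if (c > 0) ascending = false;
        if (c < 0) descending = false;
    }
    return ascending || descending;
}

// Hypothetical helper: true if no key occurs more than once.
static bool IsUnique(string[] keys) => keys.Distinct().Count() == keys.Length;

Console.WriteLine(IsSorted(new[] { "a", "b", "c", "d" })); // True
Console.WriteLine(IsSorted(new[] { "d", "c", "b", "a" })); // True
Console.WriteLine(IsSorted(new[] { "a", "c", "b", "d" })); // False
Console.WriteLine(IsUnique(new[] { "a", "b", "a" }));      // False
```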
One of the primary functions of an index is to map a key to its ordinal position.
The Lookup
method takes a key value and returns the position.
If the key was not found, -1 is returned.
This method has an overload that takes a sequence of keys
and returns an array of positions.
In some cases, the TryLookup
method may be more convenient.
This method returns whether the key was found in the index.
It takes an out argument that, on return, contains the position.
If the key was not found, the value of this out argument is undefined.
In the example below, we find the position of the key c in the index
we created earlier. We then try to find the key e, which will fail:
var position = index.Lookup("c");
if (index.TryLookup("e", out position))
Console.WriteLine("We shouldn't be here.");
Dim position = idx.Lookup("c")
If (idx.TryLookup("e", position)) Then
Console.WriteLine("We shouldn't be here.")
End If
let position = index.Lookup("c")
if (fst (index.TryLookup("e"))) then
    failwith "We shouldn't be here."
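Conceptually, a lookup is a map from key to ordinal position. The sketch below shows that idea with a dictionary from the base class library. The local Lookup function is a hypothetical stand-in that follows the convention described above of returning -1 for a missing key; how the library stores its keys internally is not stated in this section.

```csharp
using System;
using System.Collections.Generic;

string[] keys = { "a", "b", "c", "d" };

// Map each key to its ordinal position (assumes the keys are unique).
var positions = new Dictionary<string, int>();
for (int i = 0; i < keys.Length; i++)
{
    positions[keys[i]] = i;
}

// Hypothetical stand-in for Lookup: -1 signals that the key was not found.
int Lookup(string key) => positions.TryGetValue(key, out var p) ? p : -1;

Console.WriteLine(Lookup("c")); // 2
Console.WriteLine(Lookup("e")); // -1
```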
When the index is sorted, it is possible to look up the key
nearest to the provided value. This is useful, for example,
when you want the value nearest a specific time in a time series.
The LookupNearest method
performs this operation. The second argument is a
Direction value
that specifies where to look for the nearest key. When the provided key
is found in the index, its position is returned. If the key was not found,
the position of the next key in the specified direction is returned,
if it exists. If a sequence of keys is provided, an integer array
containing the positions of the nearest keys is returned.
There is also a TryLookupNearest method,
which follows the same pattern as TryLookup.
To illustrate how lookup nearest works, we first create an index
of 10 dates starting 5 days before today.
We then try to look up the current time, DateTime.Now.
This will fail because the dates in the index are all at midnight.
We then use the
LookupNearest
method to find the nearest key. First, we go backward, and we find today's date.
Then we go forward and find the next date:
var dates = Index.CreateDateRange(DateTime.Today.AddDays(-5), 10);
var now = DateTime.Now;
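if (!dates.TryLookup(now, out position))
    Console.WriteLine("Exact lookup failed.");
// Going backward finds today's date.
position = dates.LookupNearest(now, Direction.Backward);
// Going forward finds the next date.
position = dates.LookupNearest(now, Direction.Forward);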
Dim dates = Index.CreateDateRange(DateTime.Today.AddDays(-5), 10)
Dim now = DateTime.Now
If (Not dates.TryLookup(now, position)) Then
Console.WriteLine("Exact lookup failed.")
End If
position = dates.LookupNearest(now, Direction.Backward)
position = dates.LookupNearest(now, Direction.Forward)
let dates = Index.CreateDateRange(DateTime.Today.AddDays(-5.0), 10)
let now = DateTime.Now
if not (fst (dates.TryLookup(now))) then
    printfn "Exact lookup failed."
let mutable position = 0
position <- dates.LookupNearest(now, Direction.Backward)
position <- dates.LookupNearest(now, Direction.Forward)
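One way to picture a nearest-key lookup on a sorted index is a binary search that falls back to the neighboring position when there is no exact match. The sketch below uses Array.BinarySearch from the base class library. NearestPosition is a hypothetical helper, the boolean flag stands in for the Direction argument, and returning -1 when no key exists in the requested direction is an assumption. A fixed date is used so that the output is reproducible.

```csharp
using System;

// Hypothetical helper: position of the key nearest to 'value' in an ascending
// array, looking backward or forward; -1 if no key exists in that direction.
static int NearestPosition(DateTime[] sortedKeys, DateTime value, bool backward)
{
    int pos = Array.BinarySearch(sortedKeys, value);
    if (pos >= 0) return pos;          // exact match
    int next = ~pos;                   // position of the first key greater than value
    if (backward) return next - 1;     // -1 when value precedes every key
    return next < sortedKeys.Length ? next : -1;
}

var start = new DateTime(2015, 4, 25);
var dates = new DateTime[10];
for (int i = 0; i < dates.Length; i++)
{
    dates[i] = start.AddDays(i);
}

// A moment in the afternoon of April 30 falls between positions 5 and 6.
var moment = new DateTime(2015, 4, 30, 14, 30, 0);
int before = NearestPosition(dates, moment, backward: true);
int after = NearestPosition(dates, moment, backward: false);

Console.WriteLine(before); // 5
Console.WriteLine(after);  // 6
```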
Keys can be added to or removed from an index. Because indexes are read-only,
these operations always return a new index. The Append(T)
method appends a single key at the end of the index and returns the new index.
It has an overload that takes another index as its first argument and appends
the entire index at the end. A second parameter is a boolean value that specifies
whether to verify if all the keys in the result are unique.
The Remove
and RemoveAt methods
remove a key by value and by position, respectively. The code below illustrates these methods.
var index3 = Index.Create(new[] { "a", "b", "c", "d" });
var index4 = index3.Append("e");
var index5 = Index.Create(new[] { "f", "g" });
var index6 = index3.Append(index5, true);
var index7 = index3.Remove("b");
var index8 = index3.RemoveAt(1);
Dim index3 = Index.Create({"a", "b", "c", "d"})
Dim index4 = index3.Append("e")
Dim index5 = Index.Create({"f", "g"})
Dim index6 = index3.Append(index5, True)
Dim index7 = index3.Remove("b")
Dim index8 = index3.RemoveAt(1)
let index3 = Index.Create([| "a"; "b"; "c"; "d" |])
let index4 = index3.Append("e")
let index5 = Index.Create([| "f"; "g" |])
let index6 = index3.Append(index5, true)
let index7 = index3.Remove("b")
let index8 = index3.RemoveAt(1)
Indexes can also be created from other indexes.
The Permute
method applies the specified permutation to the index and returns the result.
The Union<T> and
Intersect<T> methods
return an index that contains the keys that appear in at least one of the two indexes, or in both indexes, respectively.
var permutation = new Permutation(new[] { 1, 2, 3, 0 });
var index9 = index3.Permute(permutation);
var index10 = Index.Create(new string[] { "a", "c", "d" });
var index11 = Index.Create(new string[] { "d", "a", "b", "e" });
var index12 = Index.Intersect(index10, index11);
var index13 = Index.Union(index10, index11);
Dim permutation = New Permutation({1, 2, 3, 0})
Dim index9 = index3.Permute(permutation)
Dim index10 = Index.Create(New String() {"a", "c", "d"})
Dim index11 = Index.Create(New String() {"d", "a", "b", "e"})
Dim index12 = Index.Intersect(index10, index11)
Dim index13 = Index.Union(index10, index11)
let permutation = new Permutation([| 1; 2; 3; 0 |])
let index9 = index3.Permute(permutation)
let index10 = Index.Create([| "a"; "c"; "d" |])
let index11 = Index.Create([| "d"; "a"; "b"; "e" |])
let index12 = Index.Intersect(index10, index11)
let index13 = Index.Union(index10, index11)
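The set semantics of the two operations can be checked with the LINQ operators of the same name, applied to the keys used above. This is only an analogy: LINQ keeps keys in order of first appearance, and whether the library orders its results the same way is not stated here.

```csharp
using System;
using System.Linq;

string[] first = { "a", "c", "d" };
string[] second = { "d", "a", "b", "e" };

// Keys that appear in both sequences.
var both = first.Intersect(second).ToArray();
// Keys that appear in at least one of the sequences.
var either = first.Union(second).ToArray();

Console.WriteLine(string.Join(", ", both));   // a, d
Console.WriteLine(string.Join(", ", either)); // a, c, d, b, e
```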
Hierarchical indexes are a convenient way to represent higher-dimensional data.
The keys in a hierarchical index are tuples. For example, a two-level
index will have keys of type Tuple<T1, T2>.
Storage of the keys is optimized to enable fast lookup and join operations.
Creating hierarchical indexes
Two-level hierarchical indexes are created using the
Create method.
This method takes two arguments: lists containing the keys for the first and second level,
respectively. A second method, CreateGrouped,
creates an index that is grouped by the first level.
All entries with the same value for the first level will be contiguous.
This method takes an additional out argument: a permutation from the original order of the entries
to the grouped order.
var hIndex = Index.Create(
new string[] { "One", "Two", "One", "Two" },
new string[] { "a", "b", "a", "b" });
Console.WriteLine("hIndex[1,1] = {0}", hIndex[1, 1]);
a.Index = hIndex;
Console.WriteLine(a);
Dim hIndex = Index.Create(
{"One", "Two", "One", "Two"},
{"a", "b", "a", "b"})
Console.WriteLine("hIndex(1,1) = {0}", hIndex(1, 1))
a.Index = hIndex
Console.WriteLine(a)
let hIndex = Index.Create(
[| "One"; "Two"; "One"; "Two" |],
[| "a"; "b"; "a"; "b" |])
printfn "hIndex.[1,1] = %O" hIndex.[1, 1]
a.Index <- hIndex
Console.WriteLine(a)
Three-level hierarchical indexes are similarly created using the
Create method.
This method takes three arguments: lists containing the keys
for the first, second, and third level, respectively.
A second method, CreateGrouped,
creates an index that is grouped by the first two levels.
This method takes an additional out argument: a permutation from the original order
of the entries to the grouped order.
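To show what a grouped order and its permutation look like, the sketch below groups a two-level key list by its first level using LINQ. It is not the library's algorithm. Here permutation[i] holds the original position of the entry that ends up at grouped position i, and the library's own convention for the direction of its permutation may differ. The second-level keys are made distinct so that the reordering is visible.

```csharp
using System;
using System.Linq;

string[] level1 = { "One", "Two", "One", "Two" };
string[] level2 = { "a", "b", "c", "d" };

// GroupBy yields groups in order of first appearance and keeps the
// original order within each group, so the grouping is stable.
int[] permutation = Enumerable.Range(0, level1.Length)
    .GroupBy(i => level1[i])
    .SelectMany(g => g)
    .ToArray();

Console.WriteLine(string.Join(", ", permutation)); // 0, 2, 1, 3

// Entries with the same first-level key are now contiguous.
foreach (int i in permutation)
{
    Console.WriteLine($"({level1[i]}, {level2[i]})");
}
```

The loop prints (One, a), (One, c), (Two, b), (Two, d).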
Operations on hierarchical indexes
Several other operations are available to aid in creating hierarchical indexes.
The Nest<U>
method returns a new index with one additional level. It takes one argument
that is also an index. This index is repeated for every key in the original index.
Each key consists of the key of the original index combined with a key from the
argument.
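The effect of nesting, as described above, is the same as pairing every key of the original index with every key of the argument. The sketch below builds those pairs with LINQ. It uses value tuples for brevity, whereas the library's hierarchical keys are described as Tuple values, so this is an illustration rather than the library's representation.

```csharp
using System;
using System.Linq;

string[] outer = { "One", "Two" };
string[] inner = { "a", "b", "c" };

// Repeat the inner keys for every outer key and combine them into pairs.
var nested = outer
    .SelectMany(o => inner.Select(i => (o, i)))
    .ToArray();

foreach (var key in nested)
{
    Console.WriteLine(key);
}
```

The result has outer.Length * inner.Length keys, starting with (One, a) and ending with (Two, c).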