Explain about various indices in Teradata?

Primary Index

Primary index determines the distribution of table rows on the disks controlled by AMPs.
In Teradata RDBMS, a primary index is required for row distribution and storage.
When a new row is inserted, its hash code is derived by applying a hashing algorithm to the value in the column(s)
of the primary code (as show in the following figure). Rows having the same primary index value are stored on the same AMP.

Secondary Index

In addition to a primary index, up to 32 unique and non-unique secondary indexes can be defined for a table.
Comparing to primary indexes, Secondary indexes allow access to information in a table by alternate, less frequently used paths.
A secondary index is a subtable that is stored in all AMPs, but separately from the primary table.
The subtables, which are built and maintained by the system, contain the following;
RowIDs of the subtable rows
Base table index column values
RowIDs of the base table rows (points)

Join Index

A join index is an indexing structure containing columns from multiple tables, specifically the
resulting columns form one or more tables. Rather than having to join individual tables each time the join operation
is needed, the query can be resolved via a join index and, in most cases, dramatically improve performance.

Partitioned Primary Index (PPI)

Partitioned Primary Index(PPI):
It is one of the unique features of Teradata, which allows access of portion of data of large table.
This reduces the overhead of scanning the complete table thus improving performance.
PPI works by hashing rows to different virtual AMPs, as is done with a normal PI.
PPI does not alter data distribution, it only creates partitions on data already distributed based on PI.
Usually PPI's are defined on a table in order to increase query efficiency by avoiding full table scans without
the overhead and maintenance costs of secondary indexes.

Post a Comment