When querying for a specific record, the database engine needs to perform a table scan, comparable to a reader trying to find a specific word in a book by thumbing through its pages. This process can be accelerated by using indexes: just like a book's index helps the reader to find the page with the desired word more quickly, a database index effects a great acceleration in speed for the retrieval of records. This becomes especially helpful, where large tables with millions of records like in the MultiDark database are involved.
We created indexes for accelerating Spatial Queries and a primary key on the Object Identifiers for the main tables in our database.
Primary keys have two main properties: 1) the column on which the key is defined must be unique, i.e. duplicate entries are not allowed and 2) they are clustered indexes. Clustered indexes (see illustration below) have the advantage of not only giving a reference to the record in the table, but they contain the information of the complete entry (the complete table row) itself, which saves a lot of look-up time. Like an index compares to a book's index with page look-up, the clustered index can be compared to a telephone book, which contains already all the information at one place without further reference.
|A (non-clustered) database index contains pointers to the data in a presorted way,
mostly ordered by one column for which the index is defined (grey column).
|A clustered index contains the complete
records (rows) in the sorted order.
Once the index is defined, the database server itself will use it internally for accelerating the user queries. For the user, it is only important to know which indexes exist, so that he/she can rewrite the query using indexed columns, if possible.
A list of indexes which exist for a database table (e.g.
BDM table in database miniMDR1) can be retrieved with the following query:
sp_helpindexand returns a table with the index name, a description and the columns ("index_keys") on which the index was created. Queries using these columns in the given order (e.g. by first constraining the
whereclause and only then constraining the
mass) will be faster than those queries with limits on other columns or using reverse ordering. Using a different order is comparable to looking through the telephone book, searching for people whose first name is "Tom" without specifying a last name first.