Indexing structures for files Flashcards
What is the importance of indexing
Indexes, or access structures, are used to speed up access to desired data. – E.g., author catalog in library
Define a search key
attribute or set of attributes used to look up records in a file.
Define and illustrate an index file
The index file is a table of <key, block_pointer> pairs.
* These are called the index entries and recap the ordering key of the first record of their pointed-to block.
*See page 5 for illustration
Index files are typically much smaller than the original file
How are indexes stored
They’re stored on disk just like the main data file, and provide other access paths that can be quicker for certain operations on certain fields.
The index file usually occupies considerably less disk blocks than the data file because its entries are much smaller
How are indexes created
Any field (or combination of fields) can be used to create an index, but there will be different index types depending on whether the field is a key
(unique), and whether the main file is sorted by it or not.
* There can be multiple indexes on one file.
Give one advantage and disadvantage of indexes
Indexes speed up access on the indexed field, but slow down updates— almost every update on the main table must also update every index.
Describe dense and sparse indexes
- A dense index has an index entry for every search key value (and hence every record) in the data file.
- A sparse (or nondense) index, on the other hand, has index entries for only some of the search values
Illustrate dense indexing
Index records contain search key value and a pointer to the actual record on the disk.
*See page 8 for illustration
What is an advantage and disadvantage of dense indexing
There is an index record for every search key value
in the database.
* This makes searching faster but requires more space to store index records itself
Illustrate sparse indexing
It is an index record that appears for only some of the values in the file.
*See page 9
What are the Types of Single-Level Ordered Indexes
Primary Indexes
Clustering Index
Secondary Index
What is an anchor record
The first record of each block
How is a record retrieved using primary indexes
To retrieve a record given its ordering key value using the index, the system does a binary search in the index file to find the index entry whose key value is ≤ the goal key’s value, then retrieves the pointed-to block from the original file
What is a primary index
An index on the ordering key (often primary key) of a sorted file.
Do the example on page 12-14
What are the problems with primary indexing
i. Inserting and deleting records in the main file must move other records, since it’s ordered.
ii. Some insertions/deletions must also change index entries, if the anchor records change.
iii. There can be only one primary index on a file.
Describe clustering indexing
It is a non-dense index
Defined on an ordered data file
Includes one index entry for each distinct value of the field; the index entry points to the first data block that contains records with that field value.
*See page 17 and 18 for example
What is the difference between clustering indexing and primary indexing
In clustering indexing the data file is ordered on a non-key field unlike primary index, which requires that the ordering field of the data file have a distinct value for each record.
Describe a secondary index
Includes one entry for each record in the data file; hence, it is a dense index
A secondary index provides a secondary means of accessing a file for which some primary access already exists.
The secondary index may be on a field which is a candidate key and has a unique value in every record, or a nonkey with duplicate values.
*See page 20
Describe the two fields a secondary index may have
i. The first field is of the same data type as some non-ordering field of the data file that is an indexing field.
ii. The second field is either a block pointer or a record pointer. There can be many secondary indexes (and hence, indexing fields) for the same file.
Provide a summary of single indexes highlighting the number of index entries, whether it is dense or non-dense and whether it uses block anchoring
*See page 21
Do the questions in the file called multiindexing exercise in downloads