Indexing Flashcards
What were the older methods used ?
we need a more efficient way of using indexes to retrieve data.
Hash: uses hash function of a set of hash fields. Allows direct access if hash fields are known.
Heap: No specific structure, uses linear search.
What is an index ?
It is a data structure that allows DBMS to locate particular records in a file in less time.
Faster response to user queries
Can speed up retrieval of records if requirements are on the condition that it to be met.
What is an index access ?
Is associated with particular search key
and has records consisting of a key value and address of logical record in file CONTAINING key value.
VALUES in index are usually sorted/ordered acc to indexing field.
When index is ordered we ca perform BINARY SEARCH.
we can have more than one index field.
What are the two types of files ?
DATA:contains logical records
INDEX: Contains index records
What are the characteristics of indexes ?
Access types: the methods that can be supported efficiently.
Access time: the time needed to locate result set.
Insertion/deletion efficiency: how fast it can be done.
Storage overhead: additional storage required in index structure.
What is primary index ?
If data is sequentially ordered and indexing field is a KEY FIELD TO THE FILE (unique).
Built for a data file sorted on its key field.
Index file is a sorted file whose records are fixed in length and has two fields:
First field is same data type as ORDEREING KEY FIElD -PRIMARY KEY
Second field is a pointer to a disk block.
One entry for each block in the data file.
First record in each block is called ANCHOR RECORD.
SPARSE
How is the performance of the primary index ?
Requires significantly FEWER BLOCKS , smaller in size that data field record
Binary search on index file requires fewer block accesses than data file.
Insertion and deletion is problematic and you have to change index entries,
Storage overhead is not a serious problem.
What is clustering indexes ?
If the data file is sequentially ordered on anon-key files and indexing filed corresponds to non-key.
Built for a data file sorted on a non-key field.
Index file is another sorted file whose records are fixed length with two fields:
first is of same data type as clustering field of data file.
Second is a pointer to disk blocks.
One entry for each distinct value and a pointer to the first block in data file that holds at least one record with the value of clustering field.
SPARSE
How is the performance of clustering index ?
Requires significantly FEWER BLOCKS , smaller in size that data field record
Binary search on index file requires fewer block accesses than data file.
Insertion and deletion is problematic. We have to move records in data file
Storage overhead is not a problem.
What is secondary index ?
An index defined on NON-ORDERING file of the data file.
A file can have one physical ordering field. It can either have primary / clustering not both.
It can have several secondary indexes as it doesn’t have any physical affect.
INdex file is SORTED file whose records are fixed/ variable in length with two fields:
First is same data type as indexing field
Second is a pointer to a disk block.
Define the case 1 of secondary index and its performance
This uses a DENSE(one entry for every record in data file) secondary index that maps to all records in a file.
When indexing filed is NOT ordering field then secondary index is on it where index field- SECONDARY INDEX.
One entry in index file for each entry in data file.
Each entry contains value of 2nd key for record and a pointer to the block/ record.
There may be DUPLICATE VLAUES IN INDEX FIELD.
BINARY SEARCH.
Needs more storage space and longer search items.
Define case 2 OF SECONDARY INDEX
Many records in data file have same value for indexing field.
User variable length records to hold an array of block pointers associated with indexing value field.
Use a single entry for each indexing field value. Extra level of redirection - multiple pointer.
SPARSE
What is sparse and dense index ?
Sparse- has an index record for some of the search key values.
Dense: has an index record for every search key value.
What are multi-level indexes ?
When an index file becomes large the search time increases, that is what this index tries to fix by :
treating index like any other file.
Split index into number of smaller indexes.
Maintain an index to the indexes
How is the performance of multi-level ?
Search performance increases when searching for specific indexing field value.
Problems with insertion and deletion- for this problem SOME SPACE IN EACH BLOCK IS LEFT FOR INSERTING NEW ENTRIES.
This technique is called DYNAMIC MULTI-LEVEL and implemented by BALANCED TREE(B and b+ trees)