121 Week 10 - Indexed Retrieval Flashcards
What is indexed retrieval used for
Storing and retrieving objects or records
How does using keys for storage work
Each stored object has an associated key to it which is an attribute of the object.
This key is used for retrieval and storage of the object.
The key is a unique name or identifier for each object.
Why are keys used
Use of keys allows a set of objects / records to be
stored in sorted order meaning binary search can be
used for retrieval.
How does indexed retrieval work
Indexed retrieval uses keys to store objects as well however it differs in that keys are sorted into indexes.
Keys can fall into a categories or “sections” e.g., the starting letter of the key if the key was a name.
The first occurrence of each section can be stored in an index e.g., index[0] could be the first occurrence of names beginning with A and index[1] could be the first occurrence of name beginning with B etc.
Why is indexed retrieval used
It allows for more efficient searching of large data sets as you only need to do a search of the correct index and can ignore all other parts of the data set.
E.g., for an indexed data set of names, it could be up to 26x faster than without indexing.
Worst case of indexed retrieval with linear search
Worst case is when all keys are in the same section resulting in Big O of O(n).
Worst case of indexed retrieval with binary search
If some sections are larger we can do a binary search on them to be more efficient.
This has the same worst case scenario when all keys are in the same section resulting in a big O of O(log n)
Problem with indexed retrieval
To add an element to the storage, all indexes must be recomputed meaning adding elements is inefficient.
Indexed retrieval with chains
Indexed retrieval with chains works similarly to regular indexed retrieval however instead of the index just pointing to the first instance, it points to the start of a chain.
Each item from where the index points to will point to the next item for that index. Because chains are dynamic, inserting new objects just means changing where the relevant chains point to instead of recomputing the entire index.
Worst case of indexed retrieval with chains
Worst case is the same as linear search where every item is on the same chain resulting in a big O of O(n).
Although linear and chain indexed retrieval have the same big O, chains is often faster on average if the chains are more evenly distributed.