Hash Tables Collision Resolution Flashcards
What are bad examples of solving collision resolution?
- Making the array much bigger than we expect to store
- collision likelyhood is still really high (50% in a table of size 500)
- Open Addressing
- after a colission probe a sequence of slots in the hash table (AKA closed hashing)
- suffers from clustering and collisions before more likely as more insertions happen
- after a colission probe a sequence of slots in the hash table (AKA closed hashing)
- Linear Probing
- when a collision occurs search the slots immediately following the array index hashed to until an empty slot is found
- Suffers from cookie monster affect - later searches become increasingly expensive
What is the difference between closed and open hashing
closed(aka open addressing) - table is a fixed amount of storage and every item is plces in a hash table slot in that limited storage
open - items are stored outside the table
What is separate chaining?
Store all items that hash to a particular index in that slot.
Each table slot could be a linkedlist of items that has to that slot
How do you implement the code for separate chaining?
data:image/s3,"s3://crabby-images/18cc6/18cc64a9959f087e9191e98aff64390c5b7ad0c3" alt=""
How does an insert and search work with separate chaining and how does it take
- When a collision occurs it adds it to the front of the linked list at the location in the table, which makes it O(1).
- Searches have to search a linked list if there are key collisions
- You could keep the list ordered but inserts become less efficient
How does traversal work with a hash table both for ordered and unordered
It is one of the standard operations, print/traverse
- un-Ordered:
- using open addressing we traverse all items moving down the array, skipping over empty or deleted-flagged slots
- separate chaning we simply traverse each linked list
- Ordered(items in hash tables are not usually ordered:
- sort the hash table (usually costs O(nlogn)
- we could use O(n) extra space by copying into a sarapate array if we don’t want to lose the hash order in the hash table
- sort the hash table (usually costs O(nlogn)
What is the formula for probability of a collision?
data:image/s3,"s3://crabby-images/d703f/d703fb7eb6e62dd5a89652f8b820920fd42a9045" alt=""
What is load factor?
data:image/s3,"s3://crabby-images/9de77/9de774a9c58f7368beba4b6a95ca38e66abce343" alt=""
What needs to be taken into consideration when measuring effectiveness of hashing techniques
- assumes random data and a good hash function(that spreads and inserts around evently)
- real life data is often not random
- adjust your hash function to account for biases in the data
data:image/s3,"s3://crabby-images/b3ce1/b3ce163ba67e01072f671fe44176d2e9f0da12d5" alt=""
What is the comparison for linear probing, double hashing and separate chaining for load factor?
data:image/s3,"s3://crabby-images/12101/12101667aaf2fef4252265dc65389224de839927" alt=""