Hashing Flashcards
Implementation of chaining
// C++ program to demonstrate implementation of our // own hash table with chaining for collision detection #include(bits/stdc++.h> using namespace std;
struct MyHash { int BUCKET; list(int> *table; MyHash(int b) { BUCKET = b; table = new list(int>[BUCKET]; } void insert(int k) { int i = k % BUCKET; table[i].push_back(k); } bool search(int k) { int i = k % BUCKET; for (auto x : table[i]) if (x == k) return true; return false; } void remove(int k) { int i = k % BUCKET; table[i].remove(k); } };
// Driver method to test Map class int main() { MyHash mh(7); mh.insert(10); mh.insert(20); mh.insert(15); mh.insert(7); cout (( mh.search(10) (( endl; mh.remove(15); cout (( mh.search(15); }
what is open addressing in hashing collisions?
Let me explain open addressing hashing and provide a Java implementation with a thorough breakdown.
Open addressing is a collision resolution technique used in hash tables where all elements are stored directly in the hash table array itself, unlike separate chaining which uses linked lists or other data structures to handle collisions. When a collision occurs (two keys hash to the same index), the algorithm uses various probing strategies to find an alternative empty slot in the array.
There are several common probing techniques in open addressing:
- Linear Probing: When a collision occurs, the algorithm checks the next sequential slots in the array until an empty slot is found.
- Quadratic Probing: The algorithm uses a quadratic function to determine the next slot to check.
- Double Hashing: Uses a second hash function to calculate the probing interval.
Let’s implement a hash table using linear probing as our open addressing strategy:
public class OpenAddressingHashTable { // Array to store key-value pairs private Entry[] table; // Capacity of the hash table private int capacity; // Current number of elements in the hash table private int size; // Load factor threshold to trigger resizing private static final double LOAD_FACTOR_THRESHOLD = 0.75; // Special marker for deleted entries private static final Entry DELETED = new Entry(null, null); // Inner class to represent key-value entries private static class Entry { Object key; Object value; Entry(Object key, Object value) { this.key = key; this.value = value; } } // Constructor public OpenAddressingHashTable(int initialCapacity) { this.capacity = initialCapacity; this.table = new Entry[capacity]; this.size = 0; } // Hash function to calculate initial index private int hash(Object key) { return Math.abs(key.hashCode() % capacity); } // Insert a key-value pair into the hash table public void put(Object key, Object value) { // Check if resizing is needed if ((double) size / capacity >= LOAD_FACTOR_THRESHOLD) { resize(); } int index = hash(key); // Linear probing to find an empty or available slot while (table[index] != null && table[index] != DELETED) { // If key already exists, update the value if (table[index].key.equals(key)) { table[index].value = value; return; } // Move to next slot (linear probing) index = (index + 1) % capacity; } // Insert the new entry table[index] = new Entry(key, value); size++; } // Retrieve a value by its key public Object get(Object key) { int index = hash(key); // Linear probing to find the key while (table[index] != null) { if (table[index] != DELETED && table[index].key.equals(key)) { return table[index].value; } // Move to next slot index = (index + 1) % capacity; } // Key not found return null; } // Remove a key-value pair public void remove(Object key) { int index = hash(key); // Linear probing to find the key while (table[index] != null) { if (table[index] != DELETED && table[index].key.equals(key)) { // Mark as deleted instead of truly removing table[index] = DELETED; size--; return; } // Move to next slot index = (index + 1) % capacity; } } // Resize the hash table when load factor is exceeded private void resize() { int newCapacity = capacity * 2; Entry[] oldTable = table; // Create new table with increased capacity table = new Entry[newCapacity]; capacity = newCapacity; size = 0; // Rehash all existing entries for (Entry entry : oldTable) { if (entry != null && entry != DELETED) { put(entry.key, entry.value); } } } // Get the current number of elements public int size() { return size; } // Check if the hash table is empty public boolean isEmpty() { return size == 0; } }
Let me break down the key aspects of this implementation:
-
Linear Probing Strategy
When a collision occurs, the algorithm moves to the next slot sequentially. This is implemented in theput()
andget()
methods using(index + 1) % capacity
to wrap around the array. -
Deleted Entry Handling
We use a specialDELETED
marker to handle removals efficiently. This prevents breaking the probe sequence for subsequent lookups while allowing new insertions in those slots. -
Resizing
The hash table automatically resizes when the load factor (current elements / total capacity) exceeds 0.75. This helps maintain performance by reducing collisions.
Key challenges in open addressing include:
- Handling collisions efficiently
- Avoiding clustering (consecutive filled slots)
- Managing deletions without breaking probe sequences
Example usage:
~~~
OpenAddressingHashTable hashTable = new OpenAddressingHashTable(10);
hashTable.put(“key1”, “value1”);
hashTable.put(“key2”, “value2”);
System.out.println(hashTable.get(“key1”)); // Prints “value1”
~~~
Trade-offs compared to separate chaining:
- Pros: Better cache performance, less memory overhead
- Cons: More complex implementation, more sensitive to hash function quality
Implement linear probing
Linear probing is a technique used to resolve hash collisions in a hash table. When two keys hash to the same index (collision), linear probing searches for the next available slot in the hash table by sequentially checking the subsequent indices.
Key Points:
Collision Resolution:
If a collision occurs at index i, the algorithm checks the next slot (i+1), (i+2), and so on, wrapping around to the start of the array if necessary (circular probing).
Insertion:
The key is placed in the first empty slot found during the sequential search.
Search:
If the desired key is not at the initial hash index, the search continues linearly until:
The key is found.
An empty slot is reached, indicating the key does not exist.
Deletion:
Special care is needed during deletion. A placeholder (e.g., a tombstone) might be used to indicate a deleted slot, ensuring it doesn’t disrupt the search for other keys.
Performance:
Clustering can occur as collisions pile up, leading to longer probing sequences (primary clustering). This degrades performance, especially when the table is highly loaded.
Load Factor:
Linear probing performs well if the load factor (number of elements ÷ table size) is kept low, typically below 0.7.
Example:
Consider a hash table of size 7 and a hash function h(key) = key % 7. Insert the keys: 10, 20, 15.
Insert 10:
h(10) = 10 % 7 = 3.
Index 3 is empty, so insert 10 at index 3.
Insert 20:
h(20) = 20 % 7 = 6.
Index 6 is empty, so insert 20 at index 6.
Insert 15:
h(15) = 15 % 7 = 1.
Index 1 is empty, so insert 15 at index 1.
Search for 15:
Start at h(15) = 1, find the key 15.
Insert 22:
h(22) = 22 % 7 = 1.
Index 1 is occupied, so check index 2, which is empty. Insert 22 at index 2.
This demonstrates how linear probing resolves collisions by sequentially finding the next free slot.
Quadratic probing
Quadratic Probing: Quadratic probing is an open-addressing scheme where we look for i2‘th slot in i’th iteration if the given hash value x collides in the hash table. How Quadratic Probing is done? Let hash(x) be the slot index computed using the hash function.
If the slot hash(x) % S is full, then we try (hash(x) + 11) % S.
If (hash(x) + 11) % S is also full, then we try (hash(x) + 22) % S.
If (hash(x) + 22) % S is also full, then we try (hash(x) + 3*3) % S.
This process is repeated for all the values of i until an empty slot is found.
void QuadraticProbing(vector (int> &hash, int hashSize,int arr[],int N) { //storing -1 at all indexes in the hash table. for(int i=0;i(hashSize;i++) hash[i] = -1;
//iterating over the array. for(int i=0;i(N;i++) { //if the value of hash table at index (arr[i]%hashSize) is -1 //which means empty then, we insert arr[i] at that place. if(hash[arr[i]%hashSize]==-1) { hash[arr[i]%hashSize]=arr[i]; } //else we quadratically traverse the array from the filled position //to find an index with -1 in hash table. else { int k=arr[i]%hashSize; int power=1;
//using a loop which runs until we find an index with -1 //in hash table which means empty. while(hash[(k+power*power)%hashSize]!=-1) { power++; } //inserting arr[i] after finding the empty position. hash[(k+power*power)%hashSize]=arr[i]; } } }