hashing-flashcards
What is hashing?
A technique that maps data of arbitrary size to fixed-size values using a hash function. It’s used to enable quick data lookup and storage
What is a hash function?
A function that takes an input (or ‘key’) and returns a fixed-size string of characters. A good hash function is deterministic and distributes values uniformly across its output range
What is a hash table?
A data structure that implements an associative array by using a hash function to compute an index into an array where the desired value can be found
What is a hashcode?
An integer value that represents the result of applying a hash function to data. In many programming languages it’s used to determine where to store an object in a hash table
How do table size and hash function quality affect hash table performance?
Table size affects collision probability - larger tables reduce collisions but use more memory. Hash function quality affects distribution - better functions spread values more evenly reducing collisions
What is the computational complexity of hash table operations?
Average case: O(1) for insert/search/delete. Worst case: O(n) if many collisions occur. Good hash functions and appropriate table size help maintain O(1) average performance
What are collisions?
Situations where a hash function produces the same hash value for different input data causing two or more items to map to the same location in the hash table
What is separate chaining?
A collision resolution method where each hash table slot contains a linked list of elements that hash to that slot. Multiple items can be stored at the same index by adding them to the list
What is linear probing?
A collision resolution method where if a collision occurs we linearly search for the next empty slot in the table. If we reach the end we wrap around to the beginning
What is quadratic probing?
A collision resolution method that uses quadratic function (i^2) to find the next available slot when collision occurs. Helps reduce clustering compared to linear probing
What is double hashing?
A collision resolution technique that uses two hash functions. The second function determines the step size for probing making the probe sequence more uniform
What is lazy removal in open addressing?
A technique where deleted items are marked as ‘deleted’ rather than actually removed. This maintains proper probe sequences for finding other items that may have been placed after the deleted item
What are the advantages and disadvantages of separate chaining?
advantages:
1) Simple to implement
2) Performance degrades gracefully
3) No limit on load factor
4) Works well with poor hash functions.
Disadvantages:
Extra memory for linked lists
Poor cache performance
What are the advantages and disadvantages of open addressing?
Advantages:
1) Better cache performance
2) No extra memory needed for links
3) Can store elements in contiguous memory.
Disadvantages:
1) Performance degrades with high load factors; 2) Complex deletion;
3) Requires good hash function