Hash Tables Flashcards
Hashing
- A technique used for performing insertions, deletions, and searches on constant average time
- Tree operations that require any ordering information among the elements are not supported efficiently
Hashing Data Structure
Hash Table (generalization of ordinary arrays) is used to implement hashing.
Hash Table Components
Key and entry
How to find an entry?
- To find an entry in the table, you only need to know the content of one of the fields (not all of them). This field is called key
- Ideally, a key uniquely identifies an entry
Direct Adress table
-Direct-address tables are ordinary arrays
- Facilitate direct addressing
- Element whose key is k is obtained by indexing into the kth position of the array
- It is applicable when we can afford to allocate an array with one position for every possible key
- Insert/Delete/Search can be implemented to take O(1) time
Disadvantage of direct-address table
The disadvantage is that we could end up with a big array that has a lot of empty space. This wastes lots of memory. The bigger we scale, the worse this problem gets.
Cons direct-address table
- If U is large, then a direct-address table T of size |U| is impractical or impossible (considering memory)
- ->The range of keys determines the size of T
-The set K could be so small relative to U that most of the allocated space is wasted
Direct-Address table vs. Hash Table
-Direct-address table
o Element with key k is stored in slot k
-Hash table
o Use hash function h(k) to compute the slot where k will be inserted
Hash table big idea
- We can make the array smaller, and use the hash function to select which slot
we put our values into. For example the hash function = modulus function - Basically, we are compressing the domain of our keys so that they fit into the array
Hash table size
-The size of the hash table is proportional to |K|
o We lose the direct-addressing ability
o We need to define functions that map keys to slots of the hash table
Hash function definition
-A hash function h transforms a key k into a number that is used as an index in an array to locate the desired location (“bucket” or “slot”)
o An element with key k hashes to slot h(k) 11.2 Hash tables o h(k) is the hash value of key k
Hash function pros
oSimple to compute
oAny two distinct keys get different cells
oThis is impossible, thus we seek a hash function that distributes the keys evenly among the
cells
Hash table components
- The ideal hash table is an array of some fixed size m, containing items (key and additional entry)
- Each key k is mapped (using a hash function) into some number in the range 0 to m – 1 and places in the appropriate cell
Hash table big O for operations
-Insertion of a new entry is O(1)
-Lookup for data is O(1) on average
oStatistically unlikely that O(n) will occur
Issues with hashing
- Decide what to do when two keys hash to the same value (collision)
- Choose a good hash function
- Choose the size of the table