Tables and Hashing Flashcards
What is a ADT Table: Data?
- A table is a collection of entries, where each entry contains:
- a key(unique identifier for the entry)
- rest of the entry (other data associated with this entry
Sometimes called a dictionary
What are a table’s standard operations?
What are the posibiities for implementing a table and what are the advantages/disadvantages?
What is the most important table operation?
search/retrieve
How fast is a hash table, insert, delete and search?
O(1)
Means that execution time is independent of problem size, it takes a constant number of operations
What is the general idea of how to implement an ADT table using a hash table?(psudo code)
What is a Hash function?
A mathematical function that, given a key k gives an index h(k) into the hash table(an array) where the item with key k can be found.
Example:
- Keys are in 0,1,2,3,…
- So use h(k) = k
- Search, insert, delete would all take O(1) time in an array
What are the potential problems with the simple hash function?
The keys could be:
- Large (compared to a smaller array size)
- Alpha-numeric(non-numeric, at least)
- Collisions (two keys hashing to the same location)
What must a good hash function do?
- Minimize collisions
- fast to compute
- Hash values should be evenly distributed, even if the keys are not
What are the two steps of a hash function?
- Make the key an integer(called a hash code) - If it isn’t an integer, map the key to some integer
- Make the integer into an array index(compression map) - Map the hash code into the desired array index range.
When turning a key into a hash code that has the same size of int, such as short, char and float?
treat the key as an int, cast it to an int
Hash code: how do you turn a key that is longer than an int into a hash code?(example double or long)
- Split the key into int-size pieces and add them together
- Can’t just use one piece as the hash code, that piece may be identitical or similar to a lot of different keys, leading to clustering and collisions
How do you turn a non-integer key into an integer hash code?
- Divide the key into a sequence of int-sized chunks
- Treat each chunk as an int
- example String:
- Get the character at position i: word.charAt(i)
- cast the character at position i into an int: (int)word.charAt(i)
- example String:
- Turn the sequence of integers into one integer using the polynomial hash code
Why use a polynomial hash for Strings? Why can’t we just add the characters together?
The polynomial hash code gives uniform distribution on english words.
Just adding the characters together ignores the position (commutative property) which differentiates words. “stop”, “pots”,”opts” would all hash to the same index
How can you computer the polynomial hash code efficiently? What’s the difference in speed?
The inefficient is (s(s+1))/2 multiplications
Factor out the common factor (a) using Horner’s method.
With Horner’s method there is S additions and S multiplications