Python - DS: Hash Maps Flashcards
A ______ function returns an array index as output.
A hashing function returns an array index as output.
In a hash map we ( DO / DO NOT ) care about the exact sequence of data.
We DO NOT really care about the exact sequence of data.
We only care that a given input, when fed into the map, gives the accurate output.
In order for it to return an array index, our hash map implementation ( NEEDS TO KNOW / DOES NOT NEED ) TO KNOW the size of our array.
In order for it to return an array index, our hash map implementation NEEDS TO KNOW the size of our array.
If the array we are saving values into only has 4 slots, our hash map’s hashing method should not return an index bigger than that.
In order for our hash map implementation to guarantee that it returns an index that fits into the underlying array, the hash function will first compute a value using some scoring metric: this is the hash value, hash code, or just the hash.
Our hash map implementation then takes that hash value _______ of the array. This guarantees that the value returned by the hash function can be used as an index into the array we’re using.
In order for our hash map implementation to guarantee that it returns an index that fits into the underlying array, the hash function will first compute a value using some scoring metric: this is the hash value, hash code, or just the hash.
Our hash map implementation then takes that hash value MOD THE SIZE of the array. This guarantees that the value returned by the hash function can be used as an index into the array we’re using.
It is actually a defining feature of all hash functions that they greatly reduce any possible inputs (any string you can imagine) into a much smaller range of potential outputs (an integer smaller than the size of our array). For this reason hash functions are also known as ________functions.
It is actually a defining feature of all hash functions that they greatly reduce any possible inputs (any string you can imagine) into a much smaller range of potential outputs (an integer smaller than the size of our array). For this reason hash functions are also known as COMPRESSION functions.
Much like an image that has been shrunk to a lower resolution, the output of a hash function contains less data than the input. Because of this hashing ( IS / IS NOT ) a reversible process. With just a hash value it is ( POSSIBLE / IMPOSSIBLE ) to know for sure the key that was plugged into the hashing function.
Much like an image that has been shrunk to a lower resolution, the output of a hash function contains less data than the input. Because of this hashing IS NOT a reversible process. With just a hash value it is IMPOSSIBLE to know for sure the key that was plugged into the hashing function.
Now that we have all of the main ingredients for a hash map, let’s put them all together. First, we need some sort of associated data that we’re hoping to preserve. Second, we need an array of a ( FIXED / VARIABLE ) size to insert our data into. Lastly, we need a hash function that translates the keys of our array into ______ into the array. The storage location at the index given by a hash is called the ______.
Now that we have all of the main ingredients for a hash map, let’s put them all together. First, we need some sort of associated data that we’re hoping to preserve. Second, we need an array of a FIXED size to insert our data into. Lastly, we need a hash function that translates the keys of our array into INDEXES into the array. The storage location at the index given by a hash is called the HASH BUCKET.
Hash functions are designed to compress data from a ( LARGE / SMALL ) number of possible keys to a much ( LARGER / SMALLER ) range.
Hash functions are designed to compress data from a LARGE number of possible keys to a much SMALLER range.
Because of this compression, it’s likely that our hash function might produce the same hash for two different keys. This is known as a ________.
Because of this compression, it’s likely that our hash function might produce the same hash for two different keys. This is known as a HASH COLLISION.
A hash map with a linked list separate chaining strategy follows a similar flow to the hash maps that have been described so far. The user wants to assign a value to a ____ in the map. The hash map takes the key and transforms it into a _______. The hash code is then converted into an ______ to an array using the ________. If the value of the array at the hash function’s returned index is empty, a new _________ is created with the value as the first element of the linked list. If a linked list already exists at the address, _______ the value to the linked list given.
A hash map with a linked list separate chaining strategy follows a similar flow to the hash maps that have been described so far. The user wants to assign a value to a KEY in the map. The hash map takes the key and transforms it into a HASH CODE. The hash code is then converted into an INDEX to an array using the MODULUS OPERATION. If the value of the array at the hash function’s returned index is empty, a new LINKED LIST is created with the value as the first element of the linked list. If a linked list already exists at the address, APPEND the value to the linked list given.
Now, when we go to read or write a value for a key we do the following:
(RE-ARRANGE IN THE CORRECT ORDER)
(a) find the appropriate index for that hash
(b) otherwise, continue iterating through the list comparing the keys saved in that list with our key
(c) calculate the hash for the key
(d) begin iterating through our linked list
(e) for each element, if the saved key is the same as our key, return the value
Now, when we go to read or write a value for a key we do the following:
(THE CORRECT ORDER)
(c) calculate the hash for the key
(a) find the appropriate index for that hash
(d) begin iterating through our linked list
(e) for each element, if the saved key is the same as our key, return the value
(b) otherwise, continue iterating through the list comparing the keys saved in that list with our key
Another popular hash collision strategy is called __________. In this method we stick to the array as our underlying data structure, but we continue looking for a new index to save our data if the first result of our hash function has a different key’s data.
A common open method of this strategy is called ______. This means continuing to find new array indices in a fixed sequence until an empty index is found.
Another popular hash collision strategy is called OPEN ADDRESSING. In open addressing we stick to the array as our underlying data structure, but we continue looking for a new index to save our data if the first result of our hash function has a different key’s data.
A common open method of open addressing is called PROBING. Probing means continuing to find new array indices in a fixed sequence until an empty index is found.
_______ is what happens when a single hash collision causes additional hash collisions. Imagine a hash collision triggers a linear probing sequence to assigns a value to the next hash bucket over. Any key that would hash to this “next bucket” will now collide with a key that, in a sense, doesn’t belong to that bucket anyway.
CLUSTERING is what happens when a single hash collision causes additional hash collisions. Imagine a hash collision triggers a linear probing sequence to assigns a value to the next hash bucket over. Any key that would hash to this “next bucket” will now collide with a key that, in a sense, doesn’t belong to that bucket anyway.
There are different hash collision strategies. Two important ones are __________, where each array index points to a different data structure, and __________, where a collision triggers a probing sequence to find where to store the value for a given key.
There are different hash collision strategies. Two important ones are SEPARATE CHAINING, where each array index points to a different data structure, and OPEN ADDRESSING, where a collision triggers a probing sequence to find where to store the value for a given key.
Given a hash table that handles collisions, which of the following need to be saved so that values can be assigned, reassigned, and looked up?
The Key
The Hash Code
The Array Index
The Value
The key and the value.