Chapter 28 - Hashing Flashcards
A map (data structure) is also called ___
A map is also called a dictionary, a hash table, or an associative array.
The array that stores the values is called a __(1)__. The function that maps a key to an index in the __(1)__ is called a __(2)__.
(1) hash table
(2) hash function
What is hashing?
Hashing is a technique that retrieves the value using the index obtained from the key without performing a search.
What is a perfect hash function? What is a collision?
Ideally, we would like to design a function that maps each search key to a different index in the hash table. Such a function is called a perfect hash function. However, it is difficult to find a perfect hash function. When two or more keys are mapped to the same hash value, we say that a collision has occurred. Although there are ways to deal with collisions, it is better to avoid collisions in the first place. Thus, you should design a fast and easy-to-compute hash function that minimizes collisions.
What class has the hashCode method? What does the method return by default?
The Object class, Java’s big daddy class.
hashCode returns the memory address for the object by default.
What is the general contract for the hashCode method?
- You should override the hashCode method whenever the equals method is overridden to ensure that two equal objects return the same hash code.
- During the execution of a program, invoking the hashCode method multiple times returns the same integer, provided that the object’s data are not changed.
- Two unequal objects may have the same hash code, but you should implement the hashCode method to avoid too many such cases.
What does “the contract of a class” mean?
Wouldn't it be nice if all Java classes that you use, including your own, lived up to their promises? In fact, wouldn't it be nice if you actually knew exactly what a given class promises? It's an agreement that the class will expose certain methods, certain properties, and certain behaviors. The contract specifies what that component expects of clients and what clients can expect of it.
What is hash code? What is the hash code for Byte, Short, Integer, and Character?
A typical hash function first converts a search key to an integer value called a hash code, and then compresses the hash code into an index to the hash table.
For a search key of the type byte, short, int, and char, simply cast it to int. So two different search keys of any one of these types will have different hash codes.
How is the hash code for a Float object computed?
For a search key of the type float, use Float.floatToIntBits(key) as the hash code. Note that floatToIntBits(float f) returns an int value whose bit representation is the same as the bit representation for the floating number f. So, two different search keys of the float type will have different hash codes.
How is the hash code for a Long object computed?
For a search key of the type long, simply casting it to int would not be a good choice, because all keys that differ in only the first 32 bits will have the same hash code. To take the first 32 bits into consideration, divide the 64 bits into two halves and perform the exclusive or operator to combine the two halves. This process is called folding.
So, the hashing code is:
int hashCode = (int)(key ^ (key»_space; 32));
Note that»_space; is the right-shift operator that shifts the bits 32 positions to the right.
How is the hash code for a Double object computed?
For a search key of the type double, first convert it to a long value using doubleToLongBits, then perform a folding as follows:
long bits = Double.doubleToLongBits(key);
int hashCode = (int)(bits ^ (bits»_space; 32));
How is the hash code for a String object computed?
The hash code for a string in Java is:
(…((s0*b + s1)b + s2)b + … + sN-2)b + sN-1
where b is 2
How is a hash code compressed to an integer representing the index in a hash table?
The hash code for a key can be a large integer that is out of the range for the hash table index. You need to scale it down to fit in the range of the index. Assume the index for a hash table is between 0 and N-1. The most common way to scale an integer to between 0 and N-1 is to use:
h(hashCode) = hashCode % N
To ensure that the indices are spread evenly, choose N to be a prime number greater than 2.
What are the two ways of handling collisions?
open addressing and separate chaining
What is open addressing?
Open addressing is the process of finding an open location in the hash table in the event of a collision. Open addressing has several variations: linear probing, quadratic probing, and double hashing
In relation to open addressing collision-handling, what is linear probing, and how does it work?
When a collision occurs during the insertion of an entry to a hash table, linear probing finds the next available location sequentially. For example, if a collision occurs at hashTable[k % N], check whether hashTable[(k+1) % N] is available. If not, check hashTable[(k+2) % N] and so on, until an available cell is found.
When probing reaches the end of the table, it goes back to the beginning of the table. Thus, the hash table is treated as if it were circular.