Hashing Flashcards

Question 1

Q

Hash Table

Answer

A

Data structure used to implement sets and dictionaries
Implemented with an array (often an array of linked lists) and a hash function
Good performance time
But ineffective for ordering/sorting

Question 2

Q

Collisions

Answer

A

Collisions occur when multiple elements map to the same index
2 ways to resolve collissions
1. Open addressing
2. Separate Chaining

Question 3

Q

Clustering

Answer

A

Clustering is a situation in which collisions cause elements to group around the same index
Clustering is the result of open addressing
Clustering slows down performance speed
The larger the cluster, the more likely more collisions will occur

Question 4

Q

Hash Functions

Answer

A

Function used to compute index of item in hash table

Often consists of 2 things

Hash function generates a new #
Then you modulo that # by array_size to compute index

5 things make a good hash function

Use all data being hashed
Use only data being hashed
Should be deterministic
Map similar keys to different values
Hash values should be distributed evenly

Question 5

Q

Load Factor

Answer

A

Ratio of elements in hash table to number of buckets
N/M
Where M is the size of the array
As load factor grows, hash table becomes slower
With open addressing, it’s the percentage of occupied cells
- Good load factor is between 1/8 and 1/2
With separate chaining, it’s the average length of the linked lists

Question 6

Q

Hashing

Folding Method

Answer

A

A type of hash function for integers

Steps

Break #’s into equal chunks (groups of 2 or 3, etc.)
Sum chunks to produce hash value
Mod the result

Question 7

Q

Hashing

Multiplication Method

Answer

A

A type of hash function for numbers
Uses the following formula

h(k) = floor[m(kA mod 1)]

m ⇒ size of the array
k ⇒ key
A ⇒ number between 0 and 1
- A good choice is the golden ratio
- (root(5) - 1)/2

Question 8

Q

Hashing

Strings

Answer

A

A common approach to hashing strings is to convert each char to a # and sum

Steps

Loop through char, adding up ascii values
Mod the total

Question 9

Q

Open Addressing

Answer

A

An implementation of hash functions that involves “probing” for free slots when collisions occur
If slot is occupies, insert into next free slot
If you reach end of array, loop through to beginning and search from top
This approach can cause clustering
Also, hash table is of fixed size, so must resize (and rehash) periodically
2 versions
1. Linear probing
2. Quadratic probing

Question 10

Q

Linear Probing

Answer

A

A form of open addressing
Inervals between probes is a fixed constant (usually 1)
Uses formula

h(x) + c

Where size of interval c >= 1
Using larger intervals reduces clustering
Ex: if index 3 is filled, and interval == 2, then check 5, 7, 9, etc.

Question 11

Q

Quadratic Probing

Answer

A

A form of open addressing
Intervals between probes increase quadratically
This approach decreases clustering
Uses formula:

i(k) + c₁p+ c₂p²

Where k is the key, and i(k) is the hash index, and p the number of probes
Note you can include a linear component, but c₁ is often 0
Ex: if index 1 is filled, and c₁ = 0 and c₂ = 1, then check slots 2, 5, 10, 17, etc.

Question 12

Q

Separate Chaining

Answer

A

An implementation of hash tables that uses an array of linked lists
Each index is the head of its own list
Prevents clustering

Question 13

Q

Hash Table

Complexity

Answer

A

Applies to both linear probing and separate chaining
Applies to all operations (insertion, deletion, and search (get))

Complexity

Time

Average/Amortized: O(1) ♦ Worst: O(n)

Space

O(1)

*Worst case delete/search from looping through elements that hash to same value, worst case insert from rehashing operation (reallocating array)

Brainscape's Knowledge GenomeTM

Hashing Flashcards

Brainscape's Knowledge Genome^TM