Rehashing Flashcards
Choose size of table
- Too big will waste memory; too small will increase collisions and may eventually force rehashing
- Should be appropriate for the hash function used
Load factor
- A characteristic of the hash table is its load factor α
- Given a hash table T with m slots that stores n elements,
α = n/m
Alpha and collision resolution policies
§ Hashing with chaining: α can be greater than 1
o Performance degrades with length of chains
o Expected length of a linked list = load factor = α = n/m
§ Hashing with open addressing: 0 ≤ α ≤ 1
o The probability of collisions increases as the load factor increases
Load factor size
§ As the load factor grows larger the hash table becomes slower
o If the table is close to full, the search time grows and may become equal to the size of the table
§ A very low load factor is not beneficial either: as α approaches 0, the amount of unused space in the table increases
§ The expected constant time (for insert, search, and delete) of a hash table assumes that the load factor is kept below some bound
§ Good strategy: keep α < 0.5, i.e. the size of the table is more than twice the number of records
When do we do rehashing?
§ If the load factor exceeds a certain value (greater than 0.5) we do rehashing
§ Build a second table twice as large as the original and rehash there all the keys of the original table
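The rehashing policy above can be sketched as follows. This is an illustrative chaining table (class and method names are my own, not from the notes) that doubles its size and rehashes every key once α exceeds 0.5:

```python
# Sketch of rehashing with chaining: double the table and rehash all
# keys when the load factor n/m exceeds 0.5 (threshold from the notes).

class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m                              # number of slots
        self.n = 0                              # number of stored elements
        self.slots = [[] for _ in range(m)]

    def _index(self, key, m=None):
        return hash(key) % (m if m is not None else self.m)

    def insert(self, key):
        self.slots[self._index(key)].append(key)
        self.n += 1
        if self.n / self.m > 0.5:               # load factor check
            self._rehash()

    def _rehash(self):
        # O(n): every key is hashed again into a table twice as large
        new_m = self.m * 2
        new_slots = [[] for _ in range(new_m)]
        for chain in self.slots:
            for key in chain:
                new_slots[self._index(key, new_m)].append(key)
        self.m, self.slots = new_m, new_slots

    def contains(self, key):
        return key in self.slots[self._index(key)]
```

Note that `_rehash` must recompute every slot index with the new table size; the old indices are meaningless once m changes.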
Rehashing Big O analysis
§ Obviously, rehashing is an expensive operation: O(n)
o There are n elements to rehash and the table size is roughly 2n
Rehashing with Cuckoo Hashing
§A problem with cuckoo hashing is the probability of having a cycle that prevents the insertion from completing
§ If the table’s load factor is below 0.5, it has been shown that the probability of a cycle is very low
§ Rehashing is performed when the load factor is at 0.5 or higher
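A hypothetical sketch of how a cuckoo table can detect a likely cycle and rehash (the displacement bound and hash functions are illustrative choices, not from the notes):

```python
# Cuckoo hashing sketch: two tables with two hash functions; insertion
# evicts occupants back and forth. A bounded displacement loop stands in
# for cycle detection; on failure (or load factor >= 0.5) we rehash.

class CuckooHashTable:
    def __init__(self, m=11):
        self.m = m
        self.n = 0
        self.t1 = [None] * m
        self.t2 = [None] * m

    def _h1(self, key):
        return hash(key) % self.m

    def _h2(self, key):
        return (hash(key) // self.m) % self.m

    def contains(self, key):
        return self.t1[self._h1(key)] == key or self.t2[self._h2(key)] == key

    def insert(self, key):
        if self.contains(key):
            return
        if (self.n + 1) / (2 * self.m) >= 0.5:   # keep alpha below 0.5
            self._rehash()
        for _ in range(2 * self.m):              # bound on displacements
            i = self._h1(key)
            key, self.t1[i] = self.t1[i], key    # place key, evict occupant
            if key is None:
                self.n += 1
                return
            j = self._h2(key)
            key, self.t2[j] = self.t2[j], key    # try the second table
            if key is None:
                self.n += 1
                return
        self._rehash()                           # likely cycle: rebuild
        self.insert(key)

    def _rehash(self):
        old = [k for k in self.t1 + self.t2 if k is not None]
        self.m *= 2
        self.n = 0
        self.t1 = [None] * self.m
        self.t2 = [None] * self.m
        for k in old:
            self.insert(k)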
What is a good hash function?
o Be easy and quick to compute
o Achieve an even distribution of the key values that actually occur across the index range supported by the table
o Produce a minimal number of collisions
Perfect hashing
§ The best hash functions are those that map every unique key to a unique integer.
Some hash functions
§ Middle of square
-h(k) returns the middle digits of k²
§ Division
-h(k) returns k % m (m size of the table)
§ Multiplicative
-h(k) returns the first few digits of the fractional part of k·A, where A is a fraction
A = (√5 − 1)/2 ≈ 0.618
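The three hash functions above can be sketched as follows (the digit counts and table sizes are illustrative choices, not fixed by the notes):

```python
import math

def mid_square_hash(k, digits=3):
    """Middle-of-square: take the middle digits of k squared."""
    s = str(k * k)
    mid = len(s) // 2
    start = max(0, mid - digits // 2)
    return int(s[start:start + digits])

def division_hash(k, m):
    """Division method: k mod m, where m is the table size."""
    return k % m

def multiplicative_hash(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplicative method: scale the fractional part of k*A by m."""
    return int(m * ((k * A) % 1))
```

For example, `mid_square_hash(4567)` squares 4567 to get 20857489 and keeps the middle three digits, 574.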
Other hash functions (don’t look too much into it, not on exam)
§ Truncation
o Ignore part of the key and use the rest as the array index (converting non-numeric parts)
o If students have a 9-digit identification number, take the last 3 digits as the table position (e.g. 925371622 becomes 622)
§ Folding
◦ Partition the key into several parts and then combine them in any convenient way
◦ Unlike truncation, uses information from the whole key. Split a 9-digit number into three 3-digit numbers, and add them (e.g. 925371622 becomes 925 + 371 + 622 = 1918)
§ Digit analysis
◦ If all the keys are known in advance, we can discard the digit positions with the most skewed distributions and use the remaining digits as the hash address
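The truncation and folding schemes above can be sketched as follows, using the 9-digit ID example from the notes (function names are my own):

```python
def truncation_hash(key, digits=3):
    """Truncation: keep only the last `digits` decimal digits of the key."""
    return int(str(key)[-digits:])

def folding_hash(key, part_len=3):
    """Folding: split the decimal digits into fixed-size parts and add them."""
    s = str(key)
    parts = [int(s[i:i + part_len]) for i in range(0, len(s), part_len)]
    return sum(parts)
```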
Polynomial Hash function (similar to what I did on PA)
Polynomial hash function (or Horner’s method)
o Treat the binary representation of a key as a number
o How are the keys treated as numbers?
o Represent each character with p bits; then the string can be treated as a base-2^p number
AKEY : 00001 01011 00101 11001
= 1·32³ + 11·32² + 5·32¹ + 25·32⁰ = 44217
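A minimal Horner's-method sketch of the scheme above, assuming the A=1, ..., Z=26 letter codes used in the AKEY example:

```python
def poly_hash(s, base=32):
    """Evaluate the string as a base-32 number via Horner's method."""
    h = 0
    for ch in s:
        h = h * base + (ord(ch) - ord('A') + 1)   # Horner step: shift, then add
    return h
```

Horner's method avoids computing powers of the base explicitly: each step multiplies the running value by the base and adds the next character code.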
Universal hashing
To guard against the case where all keys map to the same slot, giving worst-case behavior,
use universal hashing:
o Use a different random hash function each time
o Ensure that the random hash function is independent of the keys that are actually going to be stored
o Ensure that the random hash function is “good” by carefully designing a universal class of functions to choose from
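One standard universal class (my choice of family, not specified in the notes) is h(k) = ((a·k + b) mod p) mod m, with p a prime larger than any key and a, b chosen at random per table:

```python
import random

def make_universal_hash(m, p=2_147_483_647):   # p = 2^31 - 1, a Mersenne prime
    """Pick a random member of the universal class ((a*k + b) mod p) mod m."""
    a = random.randrange(1, p)                  # a must be nonzero
    b = random.randrange(0, p)
    return lambda k: ((a * k + b) % p) % m
```

Because a and b are chosen after (and independently of) the keys, no fixed set of keys can force every chosen function into the all-one-slot worst case.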
Why use hash tables?
§ Hash tables support fast insert and search
- O(1) average-case performance
- Deletion is possible, but degrades performance
- Not suited if the ordering of elements is important