Rehashing Flashcards

1
Q

Choose size of table

A
  • Too big will waste memory; too small will increase collision and may eventually force rehashing
  • Should be appropriate for the hash function used
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Load factor

A

-A characteristic of the hash table is its load factor α

-Given a hash table T with m slots that stores n elements,
α= n/m

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Alpha and collision resolution policies

A

§ Hashing with chaining: α can be greater than 1
o Performance degrades with length of chains
o Expected length of a linked list = load factor = a = n/m

§ Hashing with open addressing: 0 ≤ α ≤ 1
o The probability of collisions increases as the load factor increases

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Load factor size

A

§ As the load factor grows larger the hash table becomes slower
o If the table is close to full, the search time grows and may become equal to the size of the
table

§ Low load factor is not especially beneficial. As low factor approaches 0, unused areas in the table increases

§ The expected constant time (for insert, search, and delete) of a hash table assumes that the load factor is kept below some bound

§ Good strategy α < 0.5, i.e. the size of the table is more than twice the number of the records

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

When do we do rehashing?

A

§ If the load factor exceeds a certain value (greater than 0.5) we do rehashing

§ Build a second table twice as large as the original and rehash there all the keys of the original table

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Rehashing Big O analysis

A

§Obviously rehashing is an expensive operation, O(n)

o There are n elements to rehash and the table size is roughly 2n

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Rehashing with Cuckoo Hashing

A

§A problem with cuckoo hashing is the probability of having a cycle that prevents the insertion from completing

§ If the table’s load factor is below 0.5, it has been shown that the probability of a cycle is very low

§ Rehashing is performed when the load factor is at 0.5 or higher

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is a good hash function?

A

oBe easy and quick to compute
oAchieve an even distribution of the key values that actually occur across the
index range supported by the table oMinimal number of collision

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Perfect hashing

A

§The best hash function are those that map every unique key to a unique integer.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Some hash functions

A

§ Middle of square
-h(k) returns middle digits of k2

§ Division
-h(k) returns k % m (m size of the table)

§ Multiplicative
-h(k) returns the first few digits of the fractional part of k*A, where A is a fraction
A = ( 5 − 1) 2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Other hash functions (don’t look too much into it, not on exam)

A

§ Truncation
oIgnore part of the key and use the rest as the array index (converting non-numeric parts) oIf students have an 9-digit identification number, take the last 3 digits as the table position
(e.g. 925371622 becomes 622)

§ Folding
◦ Partition the key into several parts and then combine them in any convenient way
◦ Unlike truncation, uses information from the whole key. Split a 9-digit number into three 3-
digit numbers, and add them (e.g. 925371622 becomes 925 + 376 + 622 = 1923)

§Digit analysis
◦ If all the keys have been known in advance, then we could delete the digits of keys having the most skewed distributions, and use the rest digits as hash address

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Polynomial Hash function (similar to what I did on PA)

A

Polynomial hash function (or Horner’s method)
o Treat the binary representation of a key as a number
o How the keys are treated as numbers?
o Represent each character with p bits, then the string can be treated as base 2p number

AKEY : 00001 01011 00101 11001
= 1Ÿ323 + 11Ÿ322 + 5Ÿ321+ 25Ÿ320 = 44271

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Universal hashing

A

In the case where: all keys all map to the same slot, giving worst-case behavior

use universal hashing:

o Use a different random hash function each time
o Ensure that the random hash function is independent of the keys that are actually going to be stored

o Ensure that the random hash function is “good” by carefully designing a class of functions to choose from
-Design a universal class of functions
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Why use hash tables?

A

§Hash tables support fast insert and search
-O(1) average case performance

  • Deletion is possible, but degrades performance
  • Not suited if ordering of element is important
How well did you know this?
1
Not at all
2
3
4
5
Perfectly