Hashing Flashcards
what is a bucket?
A unit of storage that holds one or more records with the same hash value, typically a disk block
what is static hashing? pros and cons
static hashing maps search values to a fixed set of buckets
pros:
cons:
can be wasteful if we have too many buckets
can be slow if we have to use overflow buckets to make up for small size
how are static bucket numbers determined? what’s fudge?
number of buckets = (num records/num records can fit per bucket) x (1+d)
fudge is d
d = is a fudge factor that reduces risk of overflow by allowing for 20% waste
what is chaining?
when we have overflow we can link buckets together to increase the size
how does a static hashing scheme grow?
have to rehash everything with a new hash function.
if we chose a hash based on the current bucket sizes we’re going to have back performance as the buckets grow
if we anticipate we waste space
what is dynamic hashing?
allows the database to grow and shrink by splitting and coalescing buckets, retaining space efficiency
a hash function generates values over a large range of typically 32 bit integer, and we only use a few bits of it at a time
the first i bits of the hash are used as an index n into the BAT (bucket address table)