DBMS - External Hashing Flashcards
Basic/Old Idea of Hashing
Decide on hash key, size of hash file, hash function. Hash the source data and place it in the file of RRN location indexed by the hash function.
To search for a record, hash its key and directly access record in hash file.
Hash address collision
hash addresses can collide
Hash key space
shouldnt be much larger than hash file address space
Hashing allows us to retrieve
a record on the basis of key value throughout use of a hashing function.
Hashing Uses and Issues
- Routine converts key value to int that corresponds to an address location in a hash file where record’s key values and pointers are kept
-Locations generated appear to be random
- With hashing, no of keys could be mapped into same address/location (collision)
- Another issue is exhaustion of pre-booked space
External hashing
Deals with hashing as a file indexing structure
Internal Hashing
Deals with main memory structures
If keyi > keyj
It is not necessary that h(keyi)>(keyj) unless h() is monotonic
Perfect hash function
Has no duplicate hashes
Minimal perfect hash function
has as many keys as hashes
Hash Function Examples
- Represent key in numerical form
- Prime Division
- Truncation
- Folding
Truncation
If key is 123456789 then truncation (of last 3 digits) will return an address of 789
Folding
Works by splitting the key into two and operating on these, so 1234 and 456 will add to an address of 1690
Prime Division
Say n is our file address size, divide key by a prime just greater than n (skewed dist)
Represent Key in numerical form
convert alphanumerics to numbers and apply function to a key to produce address