Hashed Files Flashcards
Define external hashing
External hashing is hashing on disk files.
In this method, blocks are divided into equal-sized buckets, with each bucket corresponding to a fixed number of blocks.
Hashing uses a hash key (K) for data retrieval. The hash key is one of the fields (attributes/columns) of the file. In order to determine which bucket a record is going to be stored in, the hash key is put through the hashing function. ie to determine which bucket(i):
i=h(K)
An example of a hashing function is 8 mod 10
When do collisions occur in hashing
Collisions in hashing occur when a record is hashed to a bucket that is already full
What is a hash index
The hash index is a secondary structure to access
the file by using hashing on a search key other
than the one used for the primary data file
organization
What are the index entries of a hash index
The index entries are of the type <K, Pr> or <K,
P>, where Pr is a pointer to the record containing
the key, or P is a pointer to the block containing
the record for that key
Describe searching for a record using a hash key
Searching for an entry uses the hash search
algorithm on K. Once an entry is found, the
pointer Pr (or P) is used to locate the
corresponding record in the data file.
What happens when there are thousands of buckets, how does the DBMS keep track of the buckets
In a practical application, there may be thousands
of buckets; the bucket number, which may
be several bits long, would be subjected to
the directory schemes
Do the example from 46 to 48
9320
What are the different methods for collision resolution
chaining
open addressing
multiple hashing
Describe multiple hashing
If a collision occurs, a second hash function is applied by the program. If another collision occurs, open addressing is used or a third hash function is applied and then the program uses open addressing as necessary
Describe chaining
For this method, various overflow locations are kept, usually by extending the array with a number of overflow positions.
A collision is resolved by placing the new record in an unused overflow location and setting the pointer of the occupied hash address location to the address of that overflow location.
Describe open addressing
Proceeding from the occupied position
specified by the hash address, the program checks the subsequent positions in order until an unused (empty) position is found
Go through the example on 51
skj