13 BLOOM FILTERS Flashcards
What is a Bloom filter?
A probabilistic data structure that tracks which keys have been seen, using little memory and answering "has this key been seen before?" quickly.
Who invented Bloom filters and when?
Burton Bloom in 1970.
What simple question does a Bloom filter answer?
Have we previously seen this key?
What is the main drawback of Bloom filters?
They can produce false positives.
What is guaranteed about Bloom filters regarding false negatives?
They never produce false negatives: a key that was inserted is always reported as seen.
How does a Bloom filter improve search efficiency?
By checking the compact Bloom filter first and accessing the larger data structure only when the filter returns true.
What happens when a Bloom filter returns false?
The record is not in the data set, and the expensive lookup can be skipped.
What type of data structure is a Bloom filter fundamentally?
An array of binary values.
In a Bloom filter, what does a value of 1 indicate?
That at least one previously inserted key has hashed to that bin.
What is the primary purpose of the Bloom filter?
To filter over a large key space with low memory overhead.
How does a Bloom filter handle collisions?
By using multiple independent hash functions, so that a collision in a single bin is not enough to report a key as present.
What is the result of using k independent hash functions in a Bloom filter?
It reduces the likelihood of false positives, because a key that was never inserted must collide at all k indices to be reported as seen.
Fill in the blank: A Bloom filter uses _______ to map keys to indices.
k independent hash functions.
True or False: A Bloom filter can guarantee that a key is in the data set if it returns true.
False. A true result may be a false positive, so the key still has to be confirmed in the full data set.
What is the relationship between the size of the Bloom filter and the number of hash functions?
Balancing them can reduce the probability of false positives.
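The trade-off can be put into numbers. The sketch below uses the standard textbook approximation for the false-positive rate, (1 - e^(-kn/m))^k, which assumes independent, uniformly distributed hash functions; the formula and the function names are not taken from these flashcards and are included only for illustration.

```python
import math


def false_positive_rate(num_bits: int, num_hashes: int, num_keys: int) -> float:
    """Approximate false-positive rate for m bits, k hash functions, n inserted keys."""
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


def optimal_num_hashes(num_bits: int, num_keys: int) -> int:
    """Number of hash functions that minimizes the approximation above: (m / n) * ln 2."""
    return max(1, round((num_bits / num_keys) * math.log(2)))


m, n = 10_000, 1_000                        # 10 bits per inserted key
best_k = optimal_num_hashes(m, n)           # 7 for this ratio
too_few = false_positive_rate(m, 1, n)      # one hash: collisions go unchecked
balanced = false_positive_rate(m, best_k, n)
too_many = false_positive_rate(m, 50, n)    # fifty hashes: the array fills with 1s
```

With these sizes the balanced setting gives a rate of roughly 0.008, while both too few and too many hash functions give a noticeably higher rate.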
What analogy is used to explain how a Bloom filter works?
Asking a knowledgeable event organizer about a friend in a crowded ballroom.
What is a significant advantage of using a Bloom filter in searching?
It can avoid many pointless searches.
What happens when too many values are inserted into a simple binary indicator array?
It leads to hash collisions and false positives.
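A minimal sketch of such a single-hash indicator array shows how a collision turns into a false positive. The hash function here (sum of byte values modulo the array size) is a deliberately weak, hypothetical choice so the collision is easy to see; the class and key names are also assumptions.

```python
def toy_hash(key: str, size: int) -> int:
    """Deliberately weak hash for illustration: sum of byte values modulo the array size."""
    return sum(key.encode("utf-8")) % size


class SingleHashFilter:
    """Binary indicator array addressed by a single hash function."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.bits = [0] * size

    def insert(self, key: str) -> None:
        self.bits[toy_hash(key, self.size)] = 1

    def might_contain(self, key: str) -> bool:
        return self.bits[toy_hash(key, self.size)] == 1


small = SingleHashFilter(size=8)
small.insert("tea")

seen_inserted = small.might_contain("tea")  # inserted key: True
seen_anagram = small.might_contain("eat")   # never inserted, but same bin: True (false positive)
seen_other = small.might_contain("cup")     # never inserted, different bin: False
```

The more keys are inserted into the eight bins, the more bins hold a 1, and the more often an unseen key lands on one of them.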
How does a Bloom filter determine if a key exists?
It hashes the key with all k hash functions; the key is considered present only if all k corresponding array values are 1.
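The insert and lookup rules can be sketched as follows. This is one possible implementation, not the one from the source text: it simulates k independent hash functions by salting SHA-256 with the function number, and the class name, method names, and sizes are assumptions.

```python
import hashlib


class BloomFilter:
    """Binary array plus k hash functions (simulated here with salted SHA-256)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _indices(self, key: str) -> list[int]:
        # One index per hash function; the salt i stands in for the i-th function.
        indices = []
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            indices.append(int.from_bytes(digest[:8], "big") % self.num_bits)
        return indices

    def insert(self, key: str) -> None:
        # Set all k bins for this key to 1.
        for index in self._indices(key):
            self.bits[index] = 1

    def might_contain(self, key: str) -> bool:
        # Present only if every one of the k bins is 1; a single 0 means "never inserted".
        return all(self.bits[index] == 1 for index in self._indices(key))


bloom = BloomFilter(num_bits=1024, num_hashes=3)
names = ("alice", "bob", "carol")
for name in names:
    bloom.insert(name)

all_found = all(bloom.might_contain(name) for name in names)  # no false negatives
```

Because insertion only ever sets bins to 1, an inserted key can never be reported as absent; a key that was not inserted is reported as present only if all k of its bins happen to have been set by other keys.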
What is an example application of a Bloom filter mentioned in the text?
Checking if a password is on a list of known weak passwords.
In the context of Bloom filters, what does a false positive mean?
Indicating that a key has been inserted when it has not.
What is one method suggested to improve a simple single-hash-function filter?
Increase the size of the binary array.
What concept does the Bloom filter utilize to optimize searching through large data sets?
A two-stage lookup process.
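A rough sketch of the two-stage lookup, using the weak-password check as the scenario. Everything named here is hypothetical: the compact filter uses salted SHA-256 as a stand-in for k independent hash functions, and a Python set plays the role of the large, expensive data structure.

```python
import hashlib


class TinyBloomFilter:
    """Compact filter used only to drive the two-stage example."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _indices(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key: str) -> None:
        for index in self._indices(key):
            self.bits[index] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[index] for index in self._indices(key))


class WeakPasswordChecker:
    def __init__(self, weak_passwords) -> None:
        self.slow_store = set(weak_passwords)  # stand-in for a disk or database lookup
        self.slow_lookups = 0                  # counts how often stage 2 runs
        self.filter = TinyBloomFilter()
        for password in self.slow_store:
            self.filter.insert(password)

    def is_weak(self, password: str) -> bool:
        # Stage 1: a False from the filter is definitive, so the slow store is skipped.
        if not self.filter.might_contain(password):
            return False
        # Stage 2: a True may be a false positive, so confirm with the real lookup.
        self.slow_lookups += 1
        return password in self.slow_store


checker = WeakPasswordChecker(["password", "123456", "qwerty"])
known_weak = checker.is_weak("password")                      # passes stage 1, confirmed in stage 2
probably_fine = checker.is_weak("correct horse battery staple")
```

The final answer is always correct because stage 2 confirms every positive; the filter only decides how often the expensive lookup has to run.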
What is a Bloom filter used for in the context of trying new coffee?
To quickly determine if a new coffee should be sampled based on five attributes.