Data Structures Flashcards
What is a dictionary?
An abstract data type made up of keys and values in pairs
What are the 4 main operations of a dictionary?
CRUD: Create, Read, Update, Delete
1. Add a key-value pair
2. Delete a key-value pair
3. Update the value of a pair
4. Find the value of an associated key
Explain why an array is a random-access data structure
Random-access means that you can access an arbitrary element as fast as any other element regardless of where it is in the array.
How is the data of an array stored in RAM?
It’s stored in a contiguous block of memory where every element of the array has a fixed size known up front
Given the offset of the array, element size and element index, how do you calculate the address of the element?
element_address = offset + (element_size * element_index)
Explain how Python implements pointers to store values in arrays
Python uses pointers (integer numbers) in place of actual values in the array which reference the location of where the data is stored. That way, while the pointers are stored continguously for easy access, the actual values (with varying sizes) can be stored anywhere.
What does a hash function do?
Turns any data into a fixed-size sequence of bytes, called a hash value or hash code
T/F: a hash function with always return the same hash value for the same input, even if you exit and reopen Python
False, hash values are only immutable within the same Python session
What is a hash collision, and why is it unavoidable?
It’s when a hash function will yield the same hash value for two different values. It’s unavoidable because the goal of a hash function is to project a potentially infinite number of values onto a finite space (e.g. the pigeonhole principle: Given m items and n containers, if m > n, then there’s at least one container with more than one item.)
T/F: Good hash functions will distribute values evenly across available space
True
T/F: Hash functions distribute values pseudo-randomly
True
T/F: Hash functions are slow
False. They are lightning fast, hence why dictionaries are equally fast
T/F: You can reverse-engineer a hash function’s value back to the original value
False, it’s virtually impossible
T/F: Hash values from the same hash function can be different length numbers
False, they will always have a fixed-size
What makes a hash table’s key-value pairs unordered?
The hash function, because by nature it will try to evenly distribute the key-value pair across the allocated memory space
Briefly explain how to implement the 4 main operations of a hash table, given n
fixed number of memory locations
- Set up a simple method of assigning an index value from the key’s hash value, i.e.
idx = hash(key) % n
- Create: Use the index function on the key, then assign the key-value pair to the list of memory locations
- Read: Use the index function on the key, return the value from the key-value pair
- Update: Use the index function on the key, update the value from the key-value pair
- Delete: Use the index function on the key, remove the key-value pair from the list
Why do hash tables have an O(1) T.C for lookup, insert, update, and delete functions?
Because in order to find/update/add/delete any key-value pair from a hash table, you only need to calculate the index location from the key, e.g. idx = hash(key) % memory_size
as a simple example.
Explain how linear probing is used to solve hash collisions
- Calculate the index of the key of a key-value pair
T/F: For hash tables, you calculate the hash of the value in a key-value pair
False, you have to use the keys of the key-value pairs because they are guaranteed to be unique