Data Structures Q & A Flashcards
Priority Queue vs. Heap?
A priority queue is an Abstract Data Type (ADT); a heap is the concrete data structure we use to implement a priority queue.
Describe the methods for a Priority Queue?
A priority queue is a data structure that consists of a collection of items and supports the following operations:
insert: insert an item with a key
delete_min/delete_max: remove the item with the smallest/largest key and return it
We typically implement it with an array, but we model that array as a complete binary tree, which is the heap.
Note that we only allow getting and deleting the element with the min/max key, NOT an arbitrary key.
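A minimal sketch of the min variant using Python's heapq as the underlying heap (the class name and sample items are just for illustration):

```python
import heapq

class MinPriorityQueue:
    """Minimal min-priority queue backed by a binary heap stored in a list."""

    def __init__(self):
        self._heap = []

    def insert(self, key, item):
        # Push a (key, item) pair; heapq orders tuples by the key first.
        heapq.heappush(self._heap, (key, item))

    def delete_min(self):
        # Remove and return the item with the smallest key.
        key, item = heapq.heappop(self._heap)
        return item

pq = MinPriorityQueue()
pq.insert(5, "low priority task")
pq.insert(1, "urgent task")
print(pq.delete_min())  # "urgent task" - the smallest key comes out first
```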
Describe the use cases for a Priority Queue? (Heap)
Give a real world case as well.
It's a (partially) sorted data structure, so it suits anything that needs order.
It's good for Top K problems and ranking: e.g., getting the top 10 user scores in a leaderboard, or the top 10 liked/upvoted links.
Alternatives for those problems are, of course, sorted sets (hash table + skip list).
A real-world case is hospital triage, which is a priority queue. Patients are sorted based on the severity of their condition. For example, a person with a cold comes in and is placed near the end of the queue. Then a person who had a car accident comes in, and he is placed before the person with the cold even though he arrived later, because his condition is more severe. Severity is the key in this case.
Coding Problem Examples
Consider a problem like Merge K sorted lists. We need to keep track of the min value among k elements (the smallest element in each sorted list), remove that min value, and insert new values at will while still having access to the min value at any point in time (see the sketch after this list). Here are some other problems where the priority queue comes in handy:
- Merge K sorted lists
- K closest points to origin
- Kth largest element
- Kth largest element in a stream
- Median of a data stream (multiple heaps)
- Uniform Cost Search
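For the first problem, Merge K sorted lists, here is a minimal heap-based sketch (the function name and sample lists are illustrative):

```python
import heapq

def merge_k_sorted(lists):
    """Merge k sorted Python lists into one sorted list using a min-heap."""
    heap = []
    # Seed the heap with the first element of each non-empty list.
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst[0], i, 0))  # (value, list index, element index)
    merged = []
    while heap:
        value, i, j = heapq.heappop(heap)  # smallest of the k current heads
        merged.append(value)
        if j + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return merged

print(merge_k_sorted([[1, 4, 7], [2, 5], [3, 6, 8]]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```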
Python heapq, how to create a heap?
You can take a list A and call heapq.heapify(A) to turn it into a heap in-place in O(n).
Reference on why: https://python.plainenglish.io/heapify-in-linear-time-114a15487ba1
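A quick usage sketch (the sample values are arbitrary):

```python
import heapq

A = [5, 3, 8, 1, 9, 2]
heapq.heapify(A)             # rearranges the list into a min-heap in O(n)
heapq.heappush(A, 0)         # O(log n) insert
smallest = heapq.heappop(A)  # O(log n) removal of the minimum
print(smallest, A[0])        # 0 was popped; A[0] is now the new minimum, 1
```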
Bucket Sort? Describe bucket sort?
An O(n) approach for an n-length array, where the index represents a "counter" (so there must be some bound on the index values) and the entries in the array hold the "keys" themselves, e.g., the "Top K Frequent Elements" Leetcode problem.
It's an array whose index is bounded by some constant, and each entry is a list of values in case multiple items share the same index.
https://en.wikipedia.org/wiki/Bucket_sort
The computational complexity depends on the algorithm used to sort each bucket, the number of buckets to use, and whether the input is uniformly distributed.
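One way to apply the bucket idea to the "Top K Frequent Elements" problem mentioned above; a sketch assuming hashable inputs (the function name is illustrative):

```python
from collections import Counter

def top_k_frequent(nums, k):
    """Top K frequent elements via buckets: O(n), no comparison sort."""
    counts = Counter(nums)
    # Bucket index = frequency; frequency is bounded by len(nums),
    # which gives the bound on the bucket array.
    buckets = [[] for _ in range(len(nums) + 1)]
    for value, freq in counts.items():
        buckets[freq].append(value)
    result = []
    # Walk buckets from highest frequency down until we have k values.
    for freq in range(len(buckets) - 1, 0, -1):
        for value in buckets[freq]:
            result.append(value)
            if len(result) == k:
                return result
    return result

print(top_k_frequent([1, 1, 1, 2, 2, 3], 2))  # [1, 2]
```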
What is constant time complexity?
What are examples of O(1) actions?
O(1) is constant time complexity
Looking up a hashmap entry, or evaluating a math formula. Searching over a variable bounded by a constant k is also technically constant time.
What is the time complexity of permutations?
O(N!)
This is the fastest-growing time complexity, and it often requires memoization to avoid repeated computations and reduce complexity.
Example: Combinatorial problems, backtracking - ordering matters like permutations
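A backtracking sketch that enumerates all N! orderings (function and variable names are illustrative):

```python
def permutations(nums):
    """Generate all orderings of nums via backtracking: O(N!) results."""
    result = []

    def backtrack(current, remaining):
        if not remaining:
            result.append(current[:])
            return
        for i in range(len(remaining)):
            current.append(remaining[i])
            backtrack(current, remaining[:i] + remaining[i + 1:])
            current.pop()

    backtrack([], nums)
    return result

print(permutations([1, 2, 3]))  # 3! = 6 orderings
```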
What is the time complexity of finding subsets?
O(2^N)
Order doesn't matter, unlike permutations.
Grows very rapidly. Often requires memoization to avoid repeated computations and reduce complexity.
Combinatorial problems, backtracking, e.g. subsets
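A backtracking sketch that enumerates all 2^N subsets with an include/skip decision per element (names are illustrative):

```python
def subsets(nums):
    """Generate all 2^N subsets by deciding, per element, include or skip."""
    result = []

    def backtrack(i, current):
        if i == len(nums):
            result.append(current[:])
            return
        backtrack(i + 1, current)   # skip nums[i]
        current.append(nums[i])
        backtrack(i + 1, current)   # include nums[i]
        current.pop()

    backtrack(0, [])
    return result

print(subsets([1, 2, 3]))  # 2^3 = 8 subsets
```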
What is quadratic time?
O(N^2)
The runtime grows with the square of the input size.
Nested loops, e.g., visiting each matrix entry
Many brute force solutions
For small N < 1000, this is actually not that bad for modern computers. In fact, you can probably pass most Leetcode tests with quadratic time for small Ns. However, in an interview, your solution usually has to do better than quadratic time to impress the interviewer if there is a better solution.
Two nested for-loops are the classic example.
What is the time complexity of efficient sorts?
O(n log n). In merge sort, for example, the divide step produces O(log n) levels and the merge at each level costs O(n).
Sorting. The default sorting algorithm's expected runtime in all mainstream languages is O(N log N). For example, Java uses a variant of merge sort for object sorting and a variant of quicksort for primitive-type sorting.
Divide and conquer with a linear-time merge operation. The divide step normally contributes log(N) levels, and if the merge is O(N), the overall runtime is O(N log(N)). An example problem is "count of smaller numbers to the right".
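A minimal merge sort sketch showing the log(N)-deep divide and the linear merge (names are illustrative):

```python
def merge_sort(nums):
    """Divide and conquer: O(log n) levels of splitting, O(n) merge per level."""
    if len(nums) <= 1:
        return nums
    mid = len(nums) // 2
    left = merge_sort(nums[:mid])
    right = merge_sort(nums[mid:])
    # Linear-time merge of the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```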
Time complexity of using a heap to find top k in a max or min heap?
O(k log n)
O(n) to build heap using heapify
k calls to pop with each pop costing log(n) => O(k log n)
Heap push/pop K times. When you encounter problems that look for the “top K elements”, you can usually solve them with pushing and popping to a heap, K times, resulting in O(K log(N)) runtime. e.g., K closest points, merge K sorted lists.
Binary search K times is another O(K log(N)) pattern.
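A sketch of the heapify-then-pop pattern for the K largest values, using Python's heapq (the negation trick simulates a max-heap; names are illustrative):

```python
import heapq

def k_largest(nums, k):
    """Heapify in O(n), then pop the max k times: O(n + k log n) overall."""
    heap = [-x for x in nums]  # negate values so the min-heap acts as a max-heap
    heapq.heapify(heap)        # O(n)
    return [-heapq.heappop(heap) for _ in range(k)]  # k pops, O(log n) each

print(k_largest([3, 1, 5, 12, 2, 11], 3))  # [12, 11, 5]
```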
Name linear time operations?
Linear time normally means looping through a linear data structure a constant number of times. Most commonly this means:
- Going through an array/linked list
- Two pointers
- Some types of greedy
- Tree/graph traversal
- Stack/Queue
Describe log time complexity
log(N) grows VERY slowly. log(1,000,000) is only about 20. In fact, lookup by primary key in a relational database is log(N) (many mainstream relational databases such as postgres use B-trees for indexing by default, and search in B-tree is log(N)).
In coding interviews, log(N) typically means
Binary search or variant
Balanced binary search tree lookup
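A classic binary search sketch (names are illustrative):

```python
def binary_search(sorted_nums, target):
    """Binary search on a sorted list: O(log N)."""
    lo, hi = 0, len(sorted_nums) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_nums[mid] == target:
            return mid
        if sorted_nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```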
Top K problems what algorithm to use?
Heaps, but some problems can be improved to a linear runtime with approaches like:
- binary search
- quickselect
- bucket sort
A heap will always be better than a regular sort for Top K, but in some cases one of the techniques above achieves an even better runtime than a heap.
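Of those, quickselect is the usual way to get expected linear time; here is a sketch for the kth largest element (the function name and three-way partitioning style are illustrative, not a definitive implementation):

```python
import random

def kth_largest(nums, k):
    """Quickselect: expected O(n) to find the kth largest element."""
    target = len(nums) - k  # index of the kth largest in sorted order

    def select(lo, hi):
        pivot = nums[random.randint(lo, hi)]
        # Three-way partition of nums[lo..hi] around the pivot.
        less = [x for x in nums[lo:hi + 1] if x < pivot]
        equal = [x for x in nums[lo:hi + 1] if x == pivot]
        greater = [x for x in nums[lo:hi + 1] if x > pivot]
        nums[lo:hi + 1] = less + equal + greater
        if target < lo + len(less):
            return select(lo, lo + len(less) - 1)
        if target < lo + len(less) + len(equal):
            return nums[target]
        return select(lo + len(less) + len(equal), hi)

    return select(0, len(nums) - 1)

print(kth_largest([3, 2, 1, 5, 6, 4], 2))  # 5
```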
“How many ways” can you do something?
Dynamic Programming or DFS (backtracking)
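For example, "how many ways can you climb n stairs taking 1 or 2 steps at a time" is a classic counting DP; a minimal sketch (names are illustrative):

```python
def count_ways_to_climb(n):
    """Count the ways to climb n stairs taking 1 or 2 steps at a time."""
    ways = [0] * (n + 1)
    ways[0] = 1                     # one way to stand at the bottom: do nothing
    for i in range(1, n + 1):
        ways[i] = ways[i - 1]       # last move was a 1-step
        if i >= 2:
            ways[i] += ways[i - 2]  # last move was a 2-step
    return ways[n]

print(count_ways_to_climb(5))  # 8
```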