Data Structures Q & A Flashcards

1
Q

Priority Queue vs. Heap?

A

Priority Queue is an Abstract Data Type, and Heap is the concrete data structure we use to implement a priority queue.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Describe the methods for a Priority Queue?

A

A priority queue is a data structure that consists of a collection of items and supports the following operations:

insert: insert an item with a key
delete_min/delete_max: remove the item with the smallest/largest key and returning it

We use an array as the implementation but model it using a complete binary tree which is a heap.

Note that - we only allow getting and deleting the element with min/max key and NOT arbitrary key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Describe the use cases for a Priority Queue? (Heap)

Give a real world case as well.

A

Sorted data structure -> so anything that needs order
It’s good for Top K problems so ranking, get top 10 user scores in a leaderboard, or top 10 liked/upvoted links

Alternatives to those problems of course are sorted sets (hash table + skip list)

A real world case is Hospital triage which is a priority queue. Patients are sorted based on the severity of their condition. For example, a person with the cold comes in and he is placed near the end of the queue. Then a person who had a car accident comes in, and he is placed before the person with the cold even though he came in later because his condition is more severe. Severity is the key in this case.

Coding Problem Examples

Consider a problem like Merge K sorted lists. We need to keep track of min value among k elements (smallest element in each sorted list) and remove the min value and insert new values at will while still having access to the min value at any point in time. Here are some other problems where the priority queue comes in handy.

Merge K sorted lists
k closest pointers to origin
kth largest element
kth largest element in a stream
Median of a data stream (multiple heaps)
Uniform Cost Search
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Python heapq, how to create a heap?

A

Can have a list and use heap.heapify(A) to sort in-place in O(n)

Reference on why: https://python.plainenglish.io/heapify-in-linear-time-114a15487ba1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Bucket Sort? Describe bucket sort?

A

O(n) way to use for n length array, where the index represents a “counter” and there needs to be some bound on the array, and the entries in the array themselves could be the “key” such as “Get K top frequent elements” leetcode.

Array where index is bounded by some constant, and the entries are a list of values incase you have multiple items.

https://en.wikipedia.org/wiki/Bucket_sort

The computational complexity depends on the algorithm used to sort each bucket, the number of buckets to use, and whether the input is uniformly distributed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is constant time complexity?

What are examples of actions

A

O(1) is constant time complexity

Looking up a hashmap entry, or calculating a math formula. Also could be searching a ‘k’ constrained variable is also constant technically.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the time complexity of permutations?

A

O(N!)

This is the largest growing time complexity, and often requires memoization to avoid repeated computations and reduce complexity.

Example: Combinatorial problems, backtracking - ordering matters like permutations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is the time complexity of finding subsets?

A

O(2^N)

Order doesn’t matter like permutations.

Grows very rapidly. Often requires memoization to avoid repeated computations and reduce complexity.

Combinatorial problems, backtracking, e.g. subsets

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is quadratic time?

A

O(N^2)

Also called quadratic time.

Nested loops, e.g., visiting each matrix entry

Many brute force solutions

For small N < 1000, this is actually not that bad for modern computers. In fact, you can probably pass most Leetcode tests with quadratic time for small Ns. However, in an interview, your solution usually has to do better than quadratic time to impress the interviewer if there is a better solution.

2 nested forloops

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the time complexity of efficient sorts?

A

O(n log n), where O(n) for merge, and O(log n) for the divide operation in merge sort


Sorting. The default sorting algorithm’s expected runtime in all mainstream languages is N log(N). For example, java uses a variant of merge sort for object sorting and a variant of quick sort for primitive type sorting.

Divide and conquer with a linear time merge operation. Divide is normally log(N), and if merge is O(N) then the overall runtime is O(N log(N)). An example problem is smaller numbers to the right.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Time complexity of using a heap to find top k in a max or min heap?

A

O(k log n)
O(n) to build heap using heapify
k calls to pop with each pop costing log(n) => O(k log n)

Heap push/pop K times. When you encounter problems that look for the “top K elements”, you can usually solve them with pushing and popping to a heap, K times, resulting in O(K log(N)) runtime. e.g., K closest points, merge K sorted lists.
Binary search K times.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Name linear time operations?

A

Linear time normally means looping through a linear data structure a constant number of times. Most commonly this means

Going through array/linked list
Two pointers
Some types of greedy
Tree/graph traversal
Stack/Queue
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Describe log time complexity

A

log(N) grows VERY slowly. log(1,000,000) is only about 20. In fact, lookup by primary key in a relational database is log(N) (many mainstream relational databases such as postgres use B-trees for indexing by default, and search in B-tree is log(N)).

In coding interviews, log(N) typically means

Binary search or variant
Balanced binary search tree lookup

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Top K problems what algorithm to use?

A

Heaps, but some problems have better ways to improve this to a linear run time like:

  • binary search
  • quickselect
  • bucket sort

But heaps will always be better than a regular sort, but there may be a good case to get a better run time than a heap.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

“How many ways” can you do something?

A

Dynamic Programming or DFS (backtracking)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What algorithm when it comes to substring?

A

Sliding window, dynamic or static.

Example: Longest substring without repeating characters (dynamic sliding window)

17
Q

What algorithms for tree structures?

A

They are DAGs (graphs)

So

BFS (level order)
DFS (preorder, inorder, postorder)
Height, LCA, Diameter (max height on both subtrees), and other common actions

18
Q

What algorithm/data structure for “Parentheses”?

A

Stack

19
Q

X Sum problems?

A

Two Pointer

20
Q

Kinds of Two Pointer problems?

A

(1) Fast and Slow Pointers (cycle detection in linked lists)
(2) Starting from both ends (binary search, reverse, palindrome) - called collision with subtype two sum and partition
(3) Forward, with subtype sliding window (dynamic/static) and fast/slow pointers
(4) Parallel - 2 arrays and on pointer for each

Others that can’t be categorized are also in the post

Reference: https://jojozhuang.github.io/algorithm/algorithm-two-pointers/

21
Q

“Subarray” problem algorithm patterns?

A

Prefix sum - subarray sum

Hashmap - continuous subarray sum

22
Q

Max / Longest Sequence algo pattern?

A

Dynamic programming, DFS: Longest increasing subsequence
mono deque: Sliding window maximum

23
Q

“Minimum/Shortest” pattern?

A

DP, DFS, BFS

Dynamic programming, DFS: Minimal path sum
BFS: Shortest path

24
Q

Definition of a BST?

API?

A

A “Binary Search Tree”, or BST, is a binary tree with the following properties:

The value of the left node is less than the value of its parent node.
The value of the right node is greater than the value of its parent node.
The above properties must apply for all nodes in the tree.

An empty tree is a BST since it satisfies the BST properties. The values of each node on a BST can be of any type, as long as they are comparable with each other. For example, you can have an integer BST, a string BST, and so on.

API

find(x) // search, O(log n) // O(h)
insert(x) - O(log n) // O(h)
delete(x)

K, V store basically, ordered by Keys, ordered dictionary or a sorted set is a balanced binary search tree

h = log n, worst case O(n) if not balanced

25
Q

Implement a BST in 30 lines of python
Node
and find(), and insert()

A
class Node:
    def \_\_init\_\_(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
def find(tree, val):
    if tree is None:
        return False
    if tree.val == val:
        return True
    elif tree.val < val:
        return find(tree.right, val)
    else:
        return find(tree.left, val)
def insert(tree, val):
    if tree is None:
        return Node(val)
    if tree.val < val:
        tree.right = insert(tree.right, val)
    elif tree.val > val:
        tree.left = insert(tree.left, val)
    return tree
26
Q

What is BST used for?

Advantages over a hash table?

A

Used to look up existence of certain objects

Better than sorted arrays as insertion is faster

If you don’t need to dynamically insert new items, then you can simply sort the collection first and use binary search to look up.

Advantages over a hash table include

  1. Hash tables are unsorted, while BSTs are. If you want to constantly maintain a sorted order while inserting, using a BST is more efficient than a hash table or a sorted list.
  2. It’s easy to look up the first element in the BST that is greater/smaller than a lookup value than a hash table.
  3. It’s easy to find the k-th largest/smallest element.
  4. Dynamic hash tables usually have a lot of unused memory in order to make the insertion/deletion time approach O(1), whereas BST uses all the memory they requested.
27
Q

DFS and thinking like a node, as a node, 2 concerns are?

Write DFS algo for Tree template

Two things we need to decide to define the function?

A
  1. Your value
  2. How to get your children
    - ———

function dfs(node, state):
if node is null:

return

left = dfs(node.left, state)
right = dfs(node.right, state)

    ...

return ...

Two things we need to decide to define the function:

  1. Return value (passing value up from child to parent)
    What do we want to return after visiting a node. For example, for the max depth problem this is the max depth for the current node’s subtree. If we are looking for a node in the tree, we’d want to return that node if found, else return null. Use the return value to pass information from children to parent.
  2. Identify state(s) (passing value down from parent to child)
    What states do we need to maintain to compute the return value for the current node. For example, to know if the current node’s value is larger than its parent we have to maintain the parent’s value as a state. State becomes DFS’s function arguments. Use states to pass information from parent to children.
28
Q

What type of problem is this?

“How many ways are there to arrange something”

A

Combinatorial, use DFS (backtracking)

29
Q

What type of problem is this?

“Find all possible combinations of”
“Find all solutions to a puzzle”

A

DFS/backtracking

30
Q

What algo to use on a tree to

Traverse and find/create/modify/delete node
Traverse with return value (finding max subtree, detect balanced tree)

A

DFS - pre order, in order, post order

Can use BFS to traverse tree too thought

31
Q

Algorithm to solve this?

Find a path from point A to B
Find connected components
Detect cycles

A

Find a path from point A to B - DFS, BFS, Shortest Path
Find connected components - DFS and BFS
Detect cycles - DFS, check if we’ve already visited a node we encounter again during traversal, only use BFS for undirected, but go with DFS for cycle detection