Data structure and algorithm Flashcards

1
Q

Name 4 enumeration use cases for backtracking

A
  • subset
  • permutation
  • combination sum
  • palindrome partitioning
2
Q

What algorithm does backtracking use and what’s a “must have” function for backtracking

A
  • uses recursion
  • uses DFS
  • need an isValid function to check whether the partial solution is valid so far
3
Q

What are the two implementations of a graph

A
  • adjacency matrix
    • 2D array of boolean where true indicates an edge exists between the two nodes
  • adjacency list
    • Map<T, List<T>>
    • map of a node and all of its neighbors
4
Q

Name 2 differences between graph and tree traversal

A
  • For graph, need to keep track of “visited”
  • For graph, need to consider a disjoint graph - use an outer while loop
5
Q

What data structures are used for BFS vs DFS

A

BFS uses a Queue (e.g. LinkedList in Java)
DFS uses a Stack (java.util.Stack, or a Deque)

6
Q

Describe BFS to visit all vertices for a graph (iterative approach)

A
  • prep (before the loop)
    • add starting node to the queue
  • while loop (process the queue)
    1. Look at the head of the queue
    2. if it's already visited, remove it from the queue and continue with the next iteration
    3. else, add it to “visited”
    4. for all its neighbors, check if they are visited; if not, add them to the queue
    5. remove it from the queue
  • stop condition: queue.isEmpty()
  • algorithm is iterative not recursive
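The steps above can be sketched in Java as follows (a minimal version, assuming the graph is given as a Map<Integer, List<Integer>> adjacency list; the class and method names are illustrative):

```java
import java.util.*;

class GraphBFS {
    // BFS over all vertices reachable from `start`, mirroring the steps above.
    static List<Integer> bfs(Map<Integer, List<Integer>> adj, int start) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new LinkedList<>();
        queue.add(start);                      // prep: add the starting node
        while (!queue.isEmpty()) {             // stop condition: queue.isEmpty()
            int node = queue.remove();         // look at the head of the queue
            if (!visited.add(node)) continue;  // already visited: skip it
            order.add(node);                   // "visit" the node
            for (int nb : adj.getOrDefault(node, List.of()))
                if (!visited.contains(nb)) queue.add(nb); // enqueue unvisited neighbors
        }
        return order;
    }
}
```

For a disjoint graph, wrap the call in an outer loop over all nodes, as noted in the earlier card.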
7
Q

Describe DFS for a graph (iterative approach)

A
  • push starting node into stack
  • while stack is not empty
    • pop a node from the stack
    • if it's already visited, skip it; otherwise mark it visited
    • for all neighbors of the node
      • check if visited; if not, push onto the stack
  • stop condition: stack is empty
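A minimal Java sketch of the steps above (assuming a Map<Integer, List<Integer>> adjacency list; names are illustrative):

```java
import java.util.*;

class GraphDFS {
    // Iterative DFS from `start` using an explicit stack.
    static List<Integer> dfs(Map<Integer, List<Integer>> adj, int start) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);                     // push starting node
        while (!stack.isEmpty()) {             // stop condition: stack is empty
            int node = stack.pop();
            if (!visited.add(node)) continue;  // a node can be pushed twice; skip repeats
            order.add(node);                   // mark the node visited
            for (int nb : adj.getOrDefault(node, List.of()))
                if (!visited.contains(nb)) stack.push(nb);
        }
        return order;
    }
}
```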
8
Q

Describe DFS for graph using recursion and some things to note

A
procedure dfs(vertex v) {
    visit(v);
    for each neighbor u of v
        if u is undiscovered
            call dfs(u);
}
  • Note this implementation does not handle disjoint graph
  • For disjoint graph, we need a separate DFS for each connected graph
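The note above can be sketched in Java: the outer loop starts a fresh DFS from every undiscovered node, so disjoint components are all covered (names are illustrative):

```java
import java.util.*;

class RecursiveDFS {
    // Traverse every vertex, including disconnected components:
    // the outer loop starts a separate DFS for each undiscovered node.
    static List<Integer> traverseAll(Map<Integer, List<Integer>> adj) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        for (int v : adj.keySet())
            if (!visited.contains(v)) dfs(adj, v, visited, order);
        return order;
    }

    private static void dfs(Map<Integer, List<Integer>> adj, int v,
                            Set<Integer> visited, List<Integer> order) {
        visited.add(v);
        order.add(v); // "visit" the node
        for (int u : adj.getOrDefault(v, List.of()))
            if (!visited.contains(u)) dfs(adj, u, visited, order); // u is undiscovered
    }
}
```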
9
Q

Name 4 DFS applications (1 of which was not studied, so it's ok to name only 3)

A
  • detect cycle in graph
  • path finding
  • finding strongly connected components of a graph
  • backtracking
10
Q

Tradeoffs between DFS and BFS (incomplete)

A

BFS
- can consume a lot more memory when the branching factor of a tree is huge

DFS
- can take a long time to visit other neighboring nodes if the depth of the tree is huge

11
Q

Advantage of using adjacency list over adjacency matrix

A
  • adjacency list uses less space; better for sparse graphs
12
Q

Implement permutation using backtrack

A

Highlights (compare to the subsets)
- add to solution only if the length matches
- no need to have a start pointer
- need to check if the current item is already in visited

    /**
     * This implementation does not handle input that contains duplicate values.
     * input: 1, 2, 3
     * output: [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
     *
     * https://leetcode.com/problems/permutations/
     */

    public List<List<Integer>> permute(int[] nums) {
        List<List<Integer>> solution = new ArrayList<>();
        permuteBacktrack(solution, new ArrayList<>(), nums);
        return solution;
    }

    private void permuteBacktrack(List<List<Integer>> solution, List<Integer> visited, int[] nums) {

        if (visited.size() == nums.length) {
            solution.add(new ArrayList<>(visited));
            return; // done with this branch
        }

        for (int i = 0; i < nums.length; i++) {
            if (visited.contains(nums[i])) continue; // element already used, skip
            visited.add(nums[i]);
            permuteBacktrack(solution, visited, nums);
            visited.remove(visited.size() - 1);
        }
    }
13
Q

Implement subsets using backtrack

A

Algorithm highlights:
- because we use recursion, any temp data structures are passed in as args instead of return values
- args are: solution, visited (list), input (array), start pointer
- have a for loop to iterate thru the input array
- visit the current item
- recurse with i + 1 as the new start
- remove the last item in visited
- essentially “visited” is used as a “stack”

    public List<List<Integer>> subsets(int[] nums) {
        List<List<Integer>> solution = new ArrayList<>();
        subsetsBacktrack(solution, new ArrayList<>(), nums, 0);
        return solution;
    }

    private void subsetsBacktrack(List<List<Integer>> solution, List<Integer> visited, int[] nums, int start) {

        solution.add(new ArrayList<>(visited));

        for(int i = start; i < nums.length; i++){
            visited.add(nums[i]);
            subsetsBacktrack(solution, visited, nums, i + 1);
            visited.remove(visited.size() - 1);
        }
    }
14
Q

Find substring problem - high level description of the approach

Problem: find a min substring in s that satisfies t

 * Input: s = "ADOBECODEBANC", t = "ABC"
 * Output: "BANC"
A
  • use a “two pointer” approach by using start/end pointer to iterate thru “s”
  • outer while loop to “expand” the window, move “end” pointer
  • outer loop ends when “end” moves past the last char in “s”
  • inner while loop to “shrink” the window, move the “start” pointer
  • “shrink” until the “window” no longer satisfies “t”
15
Q

Find substring problem - what are the data to keep track of

Problem: find a min substring in s that satisfies t

 * Input: s = "ADOBECODEBANC", t = "ABC"
 * Output: "BANC"
A
  • a char map - to keep track of whats matched
  • start/end pointers
  • minStart/minLength
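Putting the two cards together, a compact Java sketch of the min-window substring problem (LeetCode 76); here the “char map” is an int[128] count array (assuming ASCII input), and a missing counter tracks how many required chars are still unmatched:

```java
class MinWindow {
    // Sliding-window sketch: min substring of s containing all chars of t.
    static String minWindow(String s, String t) {
        int[] need = new int[128];
        for (char c : t.toCharArray()) need[c]++;
        int missing = t.length();             // chars still required in the window
        int start = 0, minStart = 0, minLen = Integer.MAX_VALUE;
        for (int end = 0; end < s.length(); end++) {        // expand with `end`
            if (need[s.charAt(end)]-- > 0) missing--;
            while (missing == 0) {            // shrink with `start` while still valid
                if (end - start + 1 < minLen) { minLen = end - start + 1; minStart = start; }
                if (++need[s.charAt(start++)] > 0) missing++;
            }
        }
        return minLen == Integer.MAX_VALUE ? "" : s.substring(minStart, minStart + minLen);
    }
}
```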
16
Q

Implement isPalindrome

A
public boolean isPalindrome(String s, int low, int high){
        while(low < high)
            if(s.charAt(low++) != s.charAt(high--)) return false;
        return true;
    }

Important points
- use s.charAt()
- while (low < high), not <= because no need to compare the middle char if length is odd

17
Q

Implement the template for binary search
- how to compare left and right (< or <=)
- mid rounds up or down, in java, what method to call
- how to set mid pointer
- how to set left and right after each iteration
- after exiting the loop, which pointer do we return

A
  • while loop condition is left < right
  • mid rounds down; in Java, use plain int division or Math.floorDiv(int x, int y)
  • mid is left + (right - left) / 2
  • left = mid + 1 when the condition fails, right = mid when it holds
  • after exiting the while loop, left == right; return left, and adjust depending on whether we need to find the min or the max
def binary_search(search_space) -> int:
    def condition(value) -> bool:
        pass

    left, right = min(search_space), max(search_space) # could be [0, n], [1, n] etc. Depends on problem
    while left < right:
        mid = left + (right - left) // 2            # means round down
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left
18
Q

Name 3 basic applications for binary search

A

Basic applications

  • find the first “bad” version (like git bisect)
  • compute square root
    • condition is k^2 > x; the first k where it holds, minus 1, is the answer
    • right = x+1 to deal with edge cases like x = 0 and x = 1
  • search insert positions in a sorted array
19
Q

When can we use binary search
Give a sample problem of an advanced application
In this sample problem, what are the min and max values for the search space

A

When we can identify a search space.

Sample problem - determine the min ship weight to ship all the shipments within x days.

Input: weights = [1,2,3,4,5,6,7,8,9,10], days = 5

Approach
- search space is left = max(weights), right = sum(weights)
- condition function returns true if it’s possible to ship everything within the given number of days with a given capacity
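This advanced application can be sketched with the earlier binary-search template (class and method names are illustrative):

```java
class ShipCapacity {
    // Binary search on capacity: smallest capacity that ships everything in `days`.
    static int shipWithinDays(int[] weights, int days) {
        int left = 0, right = 0;
        for (int w : weights) { left = Math.max(left, w); right += w; } // search space
        while (left < right) {
            int mid = left + (right - left) / 2;
            if (canShip(weights, days, mid)) right = mid;  // feasible: try smaller
            else left = mid + 1;                           // infeasible: need more
        }
        return left;
    }

    // Condition function: can we ship all weights within `days` at this capacity?
    private static boolean canShip(int[] weights, int days, int capacity) {
        int needed = 1, load = 0;
        for (int w : weights) {
            if (load + w > capacity) { needed++; load = 0; } // start a new day
            load += w;
        }
        return needed <= days;
    }
}
```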

20
Q

What are the variations of the sliding window problem

A

“max” window
- the initial condition is valid
- goal is to find the “max” window
- example: find max non repeated chars in a string

“min” window
- the initial condition is invalid
- goal is to find the “min” window where it’s valid
- example: find the substring problem

21
Q

What are the two implementations of the “max window” problem

A

“shrinkable” solution
- right pointer expands until the window is invalid
- then shrink the left pointer until it’s valid again
- keep track of the “max” result

“non shrinkable” solution
- based on the observation that once we have a “max” window so far, there is no need to shrink the window below that size
- we “shift” the window by incrementing “left” pointer when the window is invalid
- note there is no inner loop, it’s an “if” statement instead
- this essentially “shifts” the window (without increasing the size) when it’s invalid, thus keep track of the “max result” seen so far
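A sketch of the “shrinkable” variant on the classic max-window example (longest substring without repeating characters; int[128] assumes ASCII input):

```java
class MaxWindow {
    // "Shrinkable" max-window sketch: longest substring without repeating chars.
    static int lengthOfLongestSubstring(String s) {
        int[] count = new int[128];
        int left = 0, best = 0;
        for (int right = 0; right < s.length(); right++) { // expand every step
            count[s.charAt(right)]++;
            while (count[s.charAt(right)] > 1)             // shrink until valid again
                count[s.charAt(left++)]--;
            best = Math.max(best, right - left + 1);       // window is valid here
        }
        return best;
    }
}
```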

22
Q

Some highlights of the sliding window implementation for both the “max”and the “min” variations

A

common
- outer loop is a for loop, “right” ptr moves 1 step every time
- inner loop is a while loop, “left” ptr moves 1 step at a time while the window is invalid (“max” variation) or still valid (“min” variation)
- the variation between problems is how to keep the “state”

“max window”
- because the window starts as valid, the outer loop expands the window until the state becomes invalid
- the inner loop then shrinks the window until it’s valid again

“min window” is the opposite

23
Q

In union find, what’s the difference in the data stored in the “root” data structure between the quick find and quick union implementations

A
  • quick find: it stores the root of the node
  • quick union: it stores the parent of the node
24
Q

What’s union find and what are the 4 common operations

A
  • union find performs operation on disjoint data sets
  • “find” - returns the root node given a node
  • “union” - joins the subtree containing x and y nodes
  • “init” - inits the disjoint data sets by setting each node as a disjoint graph
  • “isConnected” - returns true if x and y are connected
25
Q

What are some variations of the implementation of union find

A

Quick find - optimizes on “find”
- the root array stores the “root” of each element

Quick union - optimizes on “union”
- the root array stores the “parent” of each element

Union by rank - an optimization for “quick union”
- all function implementations are the same as “quick union” except for the “union” function
- optimize the worst case scenario in “quick union” where the tree is a single line (height of tree == num of elements)
- uses an additional array “rank” to store the height of the tree
- optimizes by making the shorter of the two trees a subtree of the taller tree, thus not increasing the height of the new tree

26
Q

Union find - quick find - why is the union operation O(n).

A

Because when we join two disjoint graphs A and B, we need to update the root node for all nodes in B to be the root node of A.

Even if we figure out which subset has fewer nodes, in the case where A and B are about the same size the runtime complexity is O(n/2), which is still O(n).

27
Q

Union find, quick find
Runtime complexity for each operation

A

Quick find:
- find: O(1)
- union: O(n)

28
Q

What data do we need to store to implement union find

A
  • a root array storing either the root node or the parent node (quick find vs quick union)
  • in the case of non int values, can use a map
  • for “union by rank”, will also have a rank array to store the height of each element, init all elements to 1.
    • only the rank[root] are read and updated in the “union” function
29
Q

What’s the implementation of each union find function, for the quick union variation

Also point out the difference for union by rank

A
  • init
    • init root
    • for “union by rank”, init rank to be all 1
  • find(x) - returns the root node of x
    • follow the parent of x until a node is its own parent
  • union (with union by rank)
    • find root of x and y, if they are the same, skip
    • compare the rank of rootX and rootY
    • set the new root to be the root of the taller tree
    • if the ranks are equal, pick either root and increment its rank by 1
  • isConnected
    • return find(x) == find(y)
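The four functions above, sketched as a quick-union-with-rank class (assuming int node ids 0..n-1; the class name is illustrative):

```java
class UnionFind {
    private final int[] root, rank;

    UnionFind(int n) {                 // init: each node is its own root, rank 1
        root = new int[n];
        rank = new int[n];
        for (int i = 0; i < n; i++) { root[i] = i; rank[i] = 1; }
    }

    int find(int x) {                  // follow parents until a node is its own parent
        while (x != root[x]) x = root[x];
        return x;
    }

    void union(int x, int y) {         // union by rank: attach shorter tree to taller
        int rx = find(x), ry = find(y);
        if (rx == ry) return;          // already in the same set, skip
        if (rank[rx] < rank[ry]) root[rx] = ry;
        else if (rank[rx] > rank[ry]) root[ry] = rx;
        else { root[ry] = rx; rank[rx]++; }  // equal heights: new root grows by 1
    }

    boolean isConnected(int x, int y) { return find(x) == find(y); }
}
```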
30
Q

When using DFS for graph, when to use recursion and when to use iteration? Give some samples problems for each.

A

Iteration
- when pushing the next node onto the stack
- example, find one path between A and B

Recursion
- when push a path onto the stack
- example: find all possible paths between A and B
- back tracking

31
Q

What are the things to watch out for DFS in a 2D grid and how to find neighbors?

A
  • be sure to check if the node is an edge node to avoid index-out-of-bounds errors
  • i >= 0 && i < board.length
  • j >=0 && j < board[i].length
32
Q

In DFS recursion
For the “find all paths between A and B” problem, what’s special about the “visited” array and the data that we push onto the stack. What do we do after dfs returns from recursion and why.

A
  • The “visited” array keeps track of the path we are exploring at the moment
  • To decide whether we need to visit a node, check whether it’s already on the path
  • We push the “path” onto the stack, but since we are doing recursion (where the stack is the call stack), it just means we keep it on the argument list.
  • When we return from recursion, we always remove the last node from the “path”. This is backtracking. Why only remove one node, because with each dfs step, we move fwd one node, thus we remove one node when we are done.
33
Q

How do we break down a recursive function into 3 high level steps, use the “find all paths” problem as an example.

A

Before recursion
- process the current node, e.g. add to path, mark it visited
- check for stop condition and return if stop condition met

What to recurse on
- all unvisited neighbors of the current node
- set the “starting point” to each unvisited neighbors

After recursion
- remove the current node from the path
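The three steps above map onto code like this (a sketch for a directed graph, using a LinkedHashSet as the path to keep insertion order plus O(1) lookup; names are illustrative):

```java
import java.util.*;

class AllPaths {
    // Recursive DFS: all paths from a to b.
    static List<List<Integer>> allPaths(Map<Integer, List<Integer>> adj, int a, int b) {
        List<List<Integer>> solution = new ArrayList<>();
        dfs(adj, a, b, new LinkedHashSet<>(), solution);
        return solution;
    }

    private static void dfs(Map<Integer, List<Integer>> adj, int node, int target,
                            LinkedHashSet<Integer> path, List<List<Integer>> solution) {
        path.add(node);                              // before recursion: visit the node
        if (node == target) {                        // stop condition
            solution.add(new ArrayList<>(path));
        } else {
            for (int nb : adj.getOrDefault(node, List.of()))
                if (!path.contains(nb)) dfs(adj, nb, target, path, solution);
        }
        path.remove(node);                           // after recursion: backtrack
    }
}
```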

34
Q

When we use recursion, where is the stack?
How do we push data onto the stack.

Explain with the “find all paths between A and B” problem

A
  • When using recursion, we use the function call stack as the stack to push/pop data from.
  • We push data onto the stack by passing it as an argument to the function
  • Each time we call the recursive function, we push data onto the stack.
  • In the “find all paths between A and B” example, we add one more node onto the path. But because the data is shared (List<String> path is passed by reference), we need to remove the last item from the stack after recursion exits.

Note - Recursion is used on problems that normally need a stack (but not all stack problems should use recursion).

35
Q

What args to pass to dfs for the “find all paths” problem

A
  • the starting node
  • one path (the visited data structure; can use a LinkedHashSet to both keep insertion order and allow easy lookup)
  • all paths - the solution
36
Q

Practically speaking, does it make sense to use recursion for BFS

A

No, because using recursion essentially adds a stack (the call stack) to the solution. BFS uses a queue, not a stack, so it would be awkward to use recursion.

37
Q

Cycle detection:
1. What algorithm do we use
2. Explain the algorithm

A
  1. Use DFS recursion
  2. Algorithm
    - stop condition is the “next node” is already on the “path”
    - add the current node on to the path
    - calls dfs for all unvisited neighbors (where “visited” means on the “path”)
    - after recursion exits, remove the last node from the “path”

Note.
- We push node onto the stack. Not a path
- That means we can also use DFS iterative approach.
- Visited is also the path. No separate visited set is needed
- Mark the node visited when it’s added to the path
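A recursive sketch for a directed graph, following the steps above; the extra done set (not in the card) just memoizes fully explored nodes so they are not re-explored:

```java
import java.util.*;

class CycleDetect {
    // Recursive DFS cycle detection in a directed graph:
    // a cycle exists if we reach a node that is already on the current path.
    static boolean hasCycle(Map<Integer, List<Integer>> adj) {
        Set<Integer> onPath = new HashSet<>();
        Set<Integer> done = new HashSet<>();   // fully explored; can't start a new cycle
        for (int v : adj.keySet())
            if (dfs(adj, v, onPath, done)) return true;
        return false;
    }

    private static boolean dfs(Map<Integer, List<Integer>> adj, int v,
                               Set<Integer> onPath, Set<Integer> done) {
        if (onPath.contains(v)) return true;   // stop condition: back on the path
        if (done.contains(v)) return false;
        onPath.add(v);                         // add current node to the path
        for (int u : adj.getOrDefault(v, List.of()))
            if (dfs(adj, u, onPath, done)) return true;
        onPath.remove(v);                      // backtrack after recursion exits
        done.add(v);
        return false;
    }
}
```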

38
Q

What’s the name of the algorithm for topological sort

A

Kahn’s algorithm

39
Q

What’s the time complexity of Kahn’s algorithm

A

O(V + E)

40
Q

Is Kahn’s algorithm BFS or DFS

A

BFS and it uses a Queue

41
Q

What are edge conditions to consider for Topological sort and how to find out about each

A
  • There is no starting node - no node with in-degree of 0 - after all the in-degrees are calculated, there is nothing added to the Queue
  • There is a starting node but there is a cycle elsewhere in the graph - we are done processing all nodes in the Queue, but not all nodes were processed (visited.length != numCourses)
42
Q

Kahn’s algorithm is normally used for topological sort, but in that process, it can be used for something else (almost as a by-product), what is it.

A

To detect if the graph contains cycles, but it’s less efficient than DFS

43
Q

Describe Kahn’s algorithm

A
  • Calculate the in-degree for each node
  • If any node has in-degree of 0, add it to the Queue
  • edge condition check: if the Q is empty, there is no “starting node”
  • While queue is not empty,
    • remove a node from the queue
    • mark the node “visited”
    • for each of its outgoing neighbors, decrease their in-degree by 1
    • if a neighbor’s in-degree becomes 0, add it to the Queue
  • if visited.size() != numNodes, there is a cycle in the graph
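A Java sketch of Kahn’s algorithm (edges given as {from, to} pairs; returning an empty list signals a cycle; names are illustrative):

```java
import java.util.*;

class Kahn {
    // Topological sort via Kahn's algorithm; returns empty list if a cycle exists.
    static List<Integer> topoSort(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        int[] inDegree = new int[n];
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); inDegree[e[1]]++; }

        Queue<Integer> queue = new LinkedList<>();
        for (int i = 0; i < n; i++)
            if (inDegree[i] == 0) queue.add(i);   // starting nodes

        List<Integer> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            int node = queue.remove();
            order.add(node);                      // "visit" the node
            for (int nb : adj.get(node))
                if (--inDegree[nb] == 0) queue.add(nb);
        }
        return order.size() == n ? order : List.of();  // size < n means a cycle
    }
}
```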
44
Q

What are some applications for BFS vs DFS for graph traversal

A

Common
- find if path exists between A and B (iterative dfs, faster with BFS)
- find all paths between A and B (recursive dfs)
- traverse all vertices (iterative dfs)

DFS
- cycle detection (recursive dfs)
- backtracking (recursive dfs)

BFS
- topological sort (Kahn’s algorithm, can also use to find cycle)
- shortest path
- graph coloring/bipartition

45
Q

Describe the BFS algorithm to find if a path exist between A and B

A

Algorithm
- populate an adjacency list
- push A into the queue, mark it visited
- while queue is not empty
* remove from the queue
* if the node is B, then return true
* add all unvisited neighbors to the queue
- return false after the while loop exits

Highlights
- the queue saves a node (not a path)
- we need keep track of “visited”
- mark node as “visited” when they are added to the queue (not when they are removed from the queue)

46
Q

Describe the BFS algorithm to find all paths from A to B

A

Algorithm
- push A onto the queue, this is a path of size 1
- while queue is not empty
* pop the path from the queue
* observe the last node from the path
* for all its neighbors, push a new path with the neighbor
* if the neighbor is B, add the path to the result

Highlights
- the queue saves a path, not a node
- no need to keep track of “visited” because we already save the path

47
Q

What’s the time complexity of “find if path exists” for both BFS and DFS

A

They are both O(V+E)

48
Q

What’s the time complexity of “find all paths” for both BFS and DFS

A

BFS: O(2^V * V)
DFS: O((V-1)!)

BFS is much better
For v=10,
BFS ~ 10k
DFS ~ 362k

49
Q

Describe the shortest path algorithm using BFS

A

Highlight
- the queue saves the path
- we mark the node “visited” AFTER all its unvisited neighbors are added to the queue

Algorithm
- add a path of size 1 onto the queue, it contains just node A
- while queue is not empty
* pop the next path from the queue
* examine the last node in the path
* for each unvisited neighbor of the last node, create a new path, and add to the queue
* mark the last node “visited”
* if the last node is B, return the size of the new path, this is the shortest path
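A sketch of the steps above, with the queue holding whole paths; in this variant a node is marked visited when it is first expanded (names are illustrative):

```java
import java.util.*;

class ShortestPath {
    // BFS shortest path from a to b; the queue holds whole paths.
    static List<Integer> shortest(Map<Integer, List<Integer>> adj, int a, int b) {
        Set<Integer> visited = new HashSet<>();
        Queue<List<Integer>> queue = new LinkedList<>();
        queue.add(new ArrayList<>(List.of(a)));   // path of size 1
        while (!queue.isEmpty()) {
            List<Integer> path = queue.remove();
            int last = path.get(path.size() - 1); // examine the last node in the path
            if (last == b) return path;           // first path to reach b is shortest
            if (!visited.add(last)) continue;     // already expanded from this node
            for (int nb : adj.getOrDefault(last, List.of())) {
                if (visited.contains(nb)) continue;
                List<Integer> next = new ArrayList<>(path); // create a new path
                next.add(nb);
                queue.add(next);
            }
        }
        return List.of();                          // no path exists
    }
}
```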

50
Q

Using BFS, why is the first path to B discovered is the shortest path

A
  • Using BFS, we visit nodes in “layers” from A
  • We see all nodes at distance 1 from A before we see any node at distance 2
  • That’s why the first path discovered that reaches B is the shortest path
51
Q

Why do we need to keep track of “visited” for “shortest path” but not for “all paths” using BFS

A

For shortest path, we visit nodes in “layers” from A. If there is a path A C E D and a direct path A D, we don’t need to know about ACED because we get to D faster via AD. Therefore, we mark D visited from the AD path rather than from the ACED path.

For “all paths”, because we keep track of the path in the queue, we don’t need a separate “visited” set. We can find out if a node is already on the path. For “all paths”, ACED is a diff path from AD.

52
Q

What’s a bipartite graph

A

A graph is bipartite if its nodes can be divided into two distinct sets where there are no edges between nodes within the same set. All edges are between the “sets”.

53
Q

What are the 4 properties of a bipartite graph

A
  • never contains “odd cycles”, e.g. a cycle with an odd number of nodes
  • subgraphs of a bipartite graph are also bipartite
  • a bipartite graph is always 2-colorable, and vice versa
  • In an undirected bipartite graph, the sum of the degrees in each vertex partition set is always equal.
54
Q

Describe the algorithm to find if a graph is bipartite.

A
  • The graph could be disjoint, use an outer loop to iterate thru all nodes. If a node is “uncolored”, run it thru BFS.
  • BFS starts with an uncolored node, and returns a boolean to indicate if the graph is bipartite.
    • Mark the starting node with an arbitrary color, add to the queue.
    • while queue is not empty, pop the node
    • Check if any neighbor is of the same color, if so, return false (we found a conflict).
    • Otherwise, mark the neighbor the “opposite” color.
    • When the BFS exits, the entire connected graph is processed.
    • BFS uses “colored” to keep track of “visited”.
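The coloring BFS can be sketched as follows (0/1 are the two colors; the color map doubles as “visited”; names are illustrative):

```java
import java.util.*;

class Bipartite {
    // BFS 2-coloring; the outer loop handles disjoint components.
    static boolean isBipartite(Map<Integer, List<Integer>> adj) {
        Map<Integer, Integer> color = new HashMap<>(); // also serves as "visited"
        for (int start : adj.keySet()) {
            if (color.containsKey(start)) continue;
            color.put(start, 0);                       // arbitrary color
            Queue<Integer> queue = new LinkedList<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                int node = queue.remove();
                for (int nb : adj.getOrDefault(node, List.of())) {
                    if (!color.containsKey(nb)) {
                        color.put(nb, 1 - color.get(node)); // mark opposite color
                        queue.add(nb);
                    } else if (color.get(nb).equals(color.get(node))) {
                        return false;                   // conflict found
                    }
                }
            }
        }
        return true;
    }
}
```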
55
Q

Explain the pre-order, in-order, and post-order traversals for a tree.

A
  • pre order: node, left subtree, right subtree
  • in order: left subtree, node, right subtree. This is used to display values in non decreasing order for a BST.
  • post order: left subtree, right subtree, node
56
Q

Which of the pre-order, in-order, and post-order traversals use DFS and which uses BFS

A

All 3 use DFS

We can use either the recursive approach or the iterative approach for all 3.
- pre/post order iterative approach can be implemented with 1 stack
- in order iterative approach can also be done with a single stack (keep pushing left children, pop, then move to the right child)

57
Q

What happens to pre/in/post order traversals when the tree is not a binary tree

A
  • For pre-order and post-order, since L and R are traversed together, it means visit all the children before/after the node.
  • For in-order, we need to define which children should be visited before the node.
58
Q

Describe the recursive approach for pre/in/post order traversal

A

Recursive is easier compared to iterative.

For in-order:
- visit the left tree
- visit the node
- visit the right tree

For pre/post order, just change when to visit the node.

Stop condition: the node is null (i.e. we’ve walked past a leaf node).

59
Q

Describe the iterative approach for pre-order

A

Pre-order is the easier one because we “visit” the node first.
Because it’s DFS, we use a stack.

Steps
- add root to stack
- while stack not empty
- pop a node, mark it visited (add to output)
- push its children from R to L (so L is popped first)

To “visit” a node, just add it to the output.
We pop the node when it’s “visited”
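A one-stack Java sketch of the steps above (the nested TreeNode class is assumed for illustration):

```java
import java.util.*;

class PreOrder {
    static class TreeNode {
        int val; TreeNode left, right;
        TreeNode(int v) { val = v; }
    }

    // Iterative pre-order: pop a node, visit it, push right then left.
    static List<Integer> preorder(TreeNode root) {
        List<Integer> out = new ArrayList<>();
        if (root == null) return out;
        Deque<TreeNode> stack = new ArrayDeque<>();
        stack.push(root);                          // add root to stack
        while (!stack.isEmpty()) {
            TreeNode node = stack.pop();
            out.add(node.val);                     // "visit" = add to output
            if (node.right != null) stack.push(node.right); // R first so L pops first
            if (node.left != null) stack.push(node.left);
        }
        return out;
    }
}
```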

60
Q

Describe the iterative approach for post-order

A

Post order is slightly more complicated than pre-order because we can’t “visit” the node until both children are “visited”.
“Visit” means to pop the node from stack and add to the output

Steps
- add root to stack
- while stack is not empty
- peek the node (don’t pop yet)
- pop it only if both children are “visited”
- push R subtree to stack
- push L subtree to stack
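A sketch of the peek-then-pop approach above (a visited set records which nodes have been popped; the nested TreeNode class is illustrative):

```java
import java.util.*;

class PostOrder {
    static class TreeNode {
        int val; TreeNode left, right;
        TreeNode(int v) { val = v; }
    }

    // Iterative post-order: peek first, pop only once both children are visited.
    static List<Integer> postorder(TreeNode root) {
        List<Integer> out = new ArrayList<>();
        if (root == null) return out;
        Deque<TreeNode> stack = new ArrayDeque<>();
        Set<TreeNode> visited = new HashSet<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TreeNode node = stack.peek();          // don't pop yet
            boolean leftDone = node.left == null || visited.contains(node.left);
            boolean rightDone = node.right == null || visited.contains(node.right);
            if (leftDone && rightDone) {
                stack.pop();                       // "visit" = pop and add to output
                visited.add(node);
                out.add(node.val);
            } else {
                if (node.right != null) stack.push(node.right); // R below L
                if (node.left != null) stack.push(node.left);   // L on top, popped first
            }
        }
        return out;
    }
}
```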

61
Q

What’s the time complexity of in/pre/post order traversal

A

O(n) since we visit all nodes exactly once.

62
Q

What is a trie

A
  • tree data structure to store strings
  • all strings sharing a common prefix descend from a common node
63
Q

Define a TrieNode (including fields and constructor)

A

Fields:
- Map<Character, TrieNode> edges - where Char is the value of the edge
- boolean isEndNode

Constructor
- new HashMap()
- isEndNode = false

64
Q

How is a TrieNode different from a TreeNode or GraphNode

A
  • a TrieNode has no value, the values are on the edges
  • TrieNode needs to keep track whether it’s an end node
65
Q

Name 4 operations for Trie

A
  • insert
  • search
  • startsWith
  • delete - less common
66
Q

Advantages of using trie

A
  • Keys are searched by walking common prefixes, hence faster.
  • takes less space because the common part of the strings is only stored once
  • help with longest prefix matching
67
Q

How does a trie compare to a hash table

A
  • a trie is faster in the worst case compared to an imperfect hash table
  • a trie has no hash function, thus no collisions
  • a trie is not available in standard libraries, hence the implementation is always done from scratch
68
Q

Name two applications for trie

A
  • build dictionaries for english
  • spell check
69
Q

Describe the insert operation for trie

A
  • convert the word to lower case
  • use a var to keep track of the last TrieNode
  • walk down the trie; if the edges do not contain the char to be added, add the edge, otherwise just keep walking
  • set “isEndNode” to true on the node for the last char
70
Q

Describe the search operation for trie

A
  • convert the word to lower case
  • use a var to keep track of the last TrieNode
  • as we walk down the trie, if we cannot find an edge, return false
  • if we are able to walk all the way down to the last char AND if the last char isEndNode is true, return true.
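The insert/search/startsWith cards can be combined into one sketch, using the TrieNode defined earlier (an edges map plus an isEndNode flag):

```java
import java.util.*;

class Trie {
    private static class TrieNode {
        Map<Character, TrieNode> edges = new HashMap<>(); // char lives on the edge
        boolean isEndNode = false;
    }

    private final TrieNode root = new TrieNode();

    void insert(String word) {
        TrieNode node = root;
        for (char c : word.toLowerCase().toCharArray())
            node = node.edges.computeIfAbsent(c, k -> new TrieNode()); // add edge if missing
        node.isEndNode = true;                  // mark the last node as a word ending
    }

    boolean search(String word) {
        TrieNode node = walk(word);
        return node != null && node.isEndNode;  // must end exactly on an end node
    }

    boolean startsWith(String prefix) {
        return walk(prefix) != null;            // any reachable node is fine for a prefix
    }

    private TrieNode walk(String s) {
        TrieNode node = root;
        for (char c : s.toLowerCase().toCharArray()) {
            node = node.edges.get(c);
            if (node == null) return null;      // no such edge: not in the trie
        }
        return node;
    }
}
```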
71
Q

In union find, what’s the difference in the data stored in the “root” data structure between the quick find and quick union implementations

A
  • quick find: it stores the root of the node
  • quick union: it stores the parent of the node
72
Q

Describe the difference between startsWith and search for a trie

A

They are the same until the last step.
- for startsWith, no need to check if the last node’s “isEndNode” is set to true
- for search, we need to check

73
Q

Union find, quick union
Runtime complexity for each operation

A

Quick union:
- find: O(n)
- union: O(n)

74
Q

Union find, union by rank
Runtime complexity for each operation

A

Union by rank
- find: O(log n)
- union: O(log n)