CS6515_Exam2 Flashcards

1
Q

If you are given an UNDIRECTED graph you can only find ________ _________?

A

connected components

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

If you are given a DIRECTED graph, you can only find ________ ________ _______?

A

strongly connected components

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

When given an undirected graph, one of the best things to do first is look for strongly connected components? (T/F)

A

False. The concept of strongly connected components doesn’t exist for undirected graphs.

In the 06/15 Office Hours, Joves’ mentions big penalties will be applied for things like this.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How do you find sources and sinks on an undirected graph?

A

Trick question. Sources and sinks don’t exist on undirected graphs. They only make sense in the context of directed graphs.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are some important properties/characteristics of Directed Acyclic Graphs (DAGs)?

Hint: DPV textbook mentions there are 3 of them

A
  1. A directed graph has a cycle if and only if its DFS reveals a back edge
  2. In a DAG, every edge leads to a vertex with a lower post number
  3. Every DAG has at least once source and at least one sink
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is a source node?

A

A node with in-degree of 0, i.e. no incoming edges

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is a sink node?

A

A node with out-degree of 0, i.e. no outgoing edges

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

If your graph has no weights, then you can’t calculate a minimum spanning tree? (T/F) Why or why not?

A

This is really a trick question, the answer can be True or False, depending on perspective. In reality, MSTs only really make sense on a weighted graph. For an unweighted graph, the concept of “minimum” isn’t well-posed, so technically ANY spanning tree on an unweighted graph is minimal, but it’s essentially a useless definition (consider that it would also be a minimum spanning tree == maximal spanning tree in the unweighted case; a distinction without a difference).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the 9 algorithms (at minimum) that we are allowed to use as black-boxes for HW/Exams?

A
  • Dijkstra’s
  • BFS
  • DFS
    • Explore (subroutine of DFS, but allowed to use as standalone algo)
  • Strongly connected components (SCC) [Joves’ mentions this one as being especially important]
  • 2SAT Algorithm
  • Kruskal’s Algorithm
  • Prim’s MST Algorithm
  • Anything else mentioned in the Lectures and/or Textbook

Source: 06/15 Office Hours

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

We should NEVER modify the black-box graph algorithms for the HW/Exam? (T/F)

A

True. Joves’ emphasizes this multiple times in 06/15 Office Hours.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

The strongly connected components algorithm is analogous to what other algorithm?

A

SCC uses multiple runs of DFS, with some modifications to the input and output variables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

DFS/BFS do NOT consider edge weights? (T/F)

A

True.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the runtime of DFS/BFS?

A

O(n + m), i.e. one of the fastest options we have, but with the caveat that if we need to consider edge weights, then that adds additional time complexity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are the runtimes of all the key black-box graph algorithms?

A
  1. BFS: O(n + m)
  2. DFS: O(n + m)
  3. Dijkstra’s: O((n + m) * log n)
  4. Explore: O(m) if part of DFS; O(n + m) if standalone
  5. Topological Sort : O(n + m)
  6. SCC: O(n + m)
  7. 2-SAT: O(n + m)
  8. Kruskal’s MST Algo: O(m log n)
  9. Prim’s MST Algo: O(m log n)
  10. Ford-Fulkerson: O(mC)
  11. Edmonds-Karp: O(nm^2)
  12. Bellman-Ford: O(nm)
  13. Floyd-Warshall: O(n^3)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

We should ALWAYS use adjacency list representation for graphs in this class, and NEVER use adjacency matrices? (T/F)

A

True. Joves’ mentions in office hours that the gains from using the matrix form are minimal, and if implemented incorrectly can cause runtimes to massively explode.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is the runtime to index into a vertex in a graph?

A

For this class, we assume O(1)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is the runtime to find an edge in a graph? Why?

A

O(m). Because we have to scan over the edges in the adjacency list up to m edges.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

For exams, we need to explicitly define that n = |V| and/or that m = |E|? (T/F)

A

False. Joves’ says in 06/15 Office Hours we can just assume this as understood without explicitly writing down.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

To traverse a graph or find all edges, what is the runtime?

A

O(n + m)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Why is traversing a graph or finding all the edges O(n + m) and not O(nm)?

A

Because if it was O(nm), that would imply that EVERY vertex is connected to EVERY edge. This is obviously not the case. Also, the runtime for BFS/DFS is O(n + m), which allows us to traverse all the edges in that amount of time.

Joves’ mentions a lot of students make this mistake.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

As mentioned in lectures/OH/textbook, what is “Property 1” of trees?

A

Removing a cycle edge cannot disconnect a graph.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

As mentioned in lectures/OH/textbook, what is “Property 2” of trees?

A

A tree on n nodes has n - 1 edges (i.e. the definition of a tree)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

As mentioned in lectures/OH/textbook, what is “Property 3” of trees?

A

Any connected, undirected graph G = (V, E) with m = n - 1 edges is a tree. (This is the converse of Property 2.)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

As mentioned in lectures/OH/textbook, what is “Property 4” of trees?

A

An undirected graph is a tree if any only if there is a unique path between any pair of nodes.

If there is more than one path between any pair of nodes, then there is a cycle

If there is no path between any pair of nodes, then it is not connected.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What is the “Cut Property”?

A

The Key Idea: Any edge which is minimum weight across a cut is going to be part of SOME MST.

More Precisely:
Considers UNDIRECTED graph G=(V,E)

  1. Take a subset of edges X ⊂ E, where X ⊂ T, where T is a partial solution (i.e. it is part of some MST) that we assume is correct so far
  2. Take a subset of vertices S ⊂ V where no edge is in the cut (S, S’)
  3. Look at all edges of G in cut (S, S’)
  4. Let e* be min weight edge in cut (S, S’)
  5. Then: X ∪ e* ⊂ T’, where T’ is a new MST

See DPV 5.1.2, also lecture video GR3_9-10.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

TODO: Finish office hours for 06/15

A

TODO: Start watching around the 1hr 15min mark.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

What are some important abstract data types and their associated data structures?

A

Abstract Type —> Data Structure

  • Stack —> Array, Linked List
  • Queue —> Array, Linked List
  • Priority Queue —> Heap
  • Dictionary / Hashmap —> Array

Source: Udemy Algorithms and Data Structures in Python lectures

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What is the runtime of accessing the i’th element in an array/list?

A

O(1). This is because items are stored adjacent to one another in RAM.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What is a “Tree Edge”?

A

Anytime we visit a new vertex via an edge, we form a Tree Edge.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

What is a “Forward Edge”?

A

An edge that goes from a node to a DESCENDANT in the tree.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

What is a “Backward Edge”?

A

An edge that goes from a node to an ANCESTOR in the tree.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is a “Cross Edge”?

A

Any other edge that isn’t a Tree, Forward, or Backward edge, i.e. non-ancestry relationship.

(I think thinking of them as siblings or cousins is a helpful way of distinguishing, because neither of the nodes are descendants/ancestors of one another in a cross-edge relationship.)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

You have a directed graph G and see an edge e=(zw), and that w is a direct child of z. You run DFS and see that the post(z) > post(w); what type of edge is this?

A

Tree edge (or Forest Edge, more generally)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

You have a directed graph G and see an edge e=(zw). You run DFS and see that the post(z) < post(w), what type of edge is this?

A

Back edge

Note: This was actually one of our poll questions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

If the post order number is going DOWN, what type of edge(s) might this be?

A

Tree edge, forward edge, or a cross edge. If the post order number is going down and the relationship is a direct parent/child, then it’s a tree edge. Otherwise it must be skipping down generations to a descendant, in which case it is a forward edge. If there’s no ancestor relationship (i.e. the nodes are siblings or cousins), then it is a cross edge).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

If the post order number is going UP, what type of edge(s) might this be?

A

Back edge

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What is a tree?

A

A tree is an undirected graph that is connected and acyclic.

See DPV pg. 129 for good info on tree properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

If we have an undirected graph that is connected and acyclic, what is this called?

A

A tree.

See DPV pg. 129 for good info on tree properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

A tree on n nodes has how many edges?

A

n - 1

See DPV pg. 129 for good info on tree properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

If we have a connected, undirected graph G=(V,E) with |E| = |V| - 1, what type of graph is this?

A

A tree.

See DPV pg. 129 for good info on tree properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

An undirected graph is a tree i.f.f. there is a unique path between any pair of nodes? (T/F). Why or why not?

A

True. In a tree, any two nodes can only have one path between them; for if there were two paths, the union of these paths would contain a cycle.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

We can tell whether a connected, undirected graph is a tree or not just by counting how many edges it has? (T/F)

A

True. If it has n - 1 edges, then this must be the case.

See DPV pg. 129 for good info on tree properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

When doing topological sort, we order the vertices by ________ _________ order number

A

decreasing; post

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

What is the main idea of Dijkstra’s algorithm?

A

It’s a greedy algorithm. It attempts to find the global optimum by with the aid of the local optimum. Given an input node, Dijkstra’s algorithm initially assumes an infinite distance between other nodes, slowly iterates and updates each node with the local best path and the best path found so far.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

Dijkstra’s algorithm at start assumes an infinite distance between nodes? (True/False)

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

What is one of the principle advantages of the Bellman-Ford algorithm over Dijkstra’s algorithm?

A

Bellman-Ford can handle positive and negative edge weights, whereas Dijkstra’s can only handle positive edge weights.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What data structure is used for Dijkstra’s algorithm?

A

A heap (in this class we typically assume min-heap)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

Bellman-Ford algorithm is slower than Djikstra’s algorithm? (T/F)

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

What is one of the key ideas of the Bellman-Ford algorithm?

A

The idea of “Relaxation” of edge weights. It does this for all edges in the graph at the same time for V - 1 iterations (and then it does one more iteration to detect negative cycles). This gives us a runtime complexity of O(n*m).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

What is the runtime of the Bellman-Ford algorithm?

A

O(n*m), where n = |V| and m = |E|

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

When running the Bellman-Ford algorithm, if we’ve run V - 1 iterations, and then run a final V’th time and the cost DECREASES, what can we conclude?

A

There MUST be a negative weight cycle present.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

Dijkstra’s algorithm is a _______ approach, whereas Bellman-Ford algorithm is a _______ approach?

A

greedy; dynamic programming

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

How many iterations are there in the Bellman-Ford algorithm? How many are there if we include checking for negative weight cycles?

A

|V| - 1, i.e. m - 1 if we don’t check for negative weight cycles. If we do check, then we simply perform one more iteration, so a total of |V| iterations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

Bellman-Ford’s algorithm can only handle positive edge weights? (T/F)

A

False. It can also handle negative edge weights, which is one of it’s principle advantages over Dijkstra’s algorithm.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

Dijkstra’s algorithm can only handle positive edge weights? (T/F)

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
56
Q

What are some algorithms for finding a path between two vertices in a graph such that the same of the edge weights are minimized (i.e. shortest path algorithms)?

A
  1. Dijkstra’s algorithm
  2. Bellman-Ford algorithm
    3 A* Search
  3. Floyd-Warshall algorithm

Note: I don’t think A* is mentioned in the lectures or textbook, so I don’t think it is available as a black box to us.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
57
Q

What type of data structure does BFS use?

A

Queue, i.e. FIFO, like a line at Disney World

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
58
Q

What type of data structure does DFS use?

A

Stack, i.e. LIFO, like a stack of dishes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
59
Q

What is the SPACE complexity of BFS vs. DFS?

A

BFS: O(n), because we have to use a queue that ends up upper bounded by the number of leaf nodes.

DFS: O(log n), because we use a stack data structure that only has to maintain up to the HEIGHT of the tree

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
60
Q

In general, which type of graph search is preferred, DFS or BFS? Why? What are some exceptions?

A

Generally DFS, because it in general has better space complexity and has some other nice properties. This doesn’t mean we should avoid BFS; but I think for this class we’re likely to see more usage of DFS, whether on its own or under the hood of other algorithms (e.g. SCC, etc).

AI/robot motion is a good example of an exception to this where we might prefer BFS over DFS. This is because BFS goes layer-by-layer, so it can be better for discovering the LOCAL ENVIRONMENT.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
61
Q

What are the input(s) for BFS?

A
  1. Graph G = (V, E), directed or undirected

2. Source vertex s in V

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
62
Q

What are the output(s) for BFS?

A
  1. dist[u]: array containing distance from source s to u IF s can reach u; inf otherwise
  2. prev[z]: array containing the parent index of vertex z

Source: Joves’ notes, HW4 practice problems

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
63
Q

What is the runtime of BFS?

A

O(n + m), i.e. linear

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
64
Q

In BFS, the dist array contains the sum of the weights on the path from source s to destination u? (T/F)

A

False. dist[u] is the given as the NUMBER OF EDGES from s to u, NOT the sum of the weights. Very important to remember this.

Source: Joves’ notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
65
Q

What are the input(s) to Dijkstra’s Algorithm?

A
  1. Graph G=(V, E), directed or undirected, with POSITIVE EDGE WEIGHTS c_e in E
  2. Source vertex s in V
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
66
Q

What are the output(s) of Dijkstra’s Algorithm?

A
  1. dist[u]: array containing distance from source s to u IF s can reach u; inf otherwise
  2. prev[z]: array containing the parent index of vertex z (can be used to reconstruct the shortest path)

Source: Joves’ notes, HW4 practice problems

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
67
Q

What is the runtime of Dijkstra’s algorithm?

A

O( (n + m)*log n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
68
Q

BFS can be used to determine the weighted Single Source Shortest Path (SSSP)? (T/F). Why or why not?

A

False. Because the “shortest path” given by BFS is in terms of the NUMBER OF EDGES, and NOT the edge weights. We would need to use something like Dijkstra’s algorithm for weighted edges (assuming the edge weights are positive).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
69
Q

Dijkstra’s algorithm can be used for non-negative, weighted Single Source Shortest Path (SSSP)? (T/F)

A

True.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
70
Q

What is one connection between BFS and Dijkstra’s algorithm?

A

They can both be used for Single Source Shortest Path (SSSP) determination. However, BFS ONLY works for unweighted graphs, whereas Dijkstra’s can handle (positive) weighted edges.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
71
Q

If BFS and Dijkstra’s Algorithm can both be used for SSSP determination, why is the runtime for BFS O(n + m), but the runtime for Dijkstra’s Algorithm O( (n + m)*log n) ?

SSSP: Single Source Shortest Path

A

Because BFS can only handle SSSP for UNWEIGHTED graphs, whereas Dijkstra’s handles graphs with (positive) edge weights. If you’re doing extra work to process weights, then you can’t achieve linear O(n + m) runtime, hence the difference between the two.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
72
Q

What are the input(s) for DFS?

A
  1. Graph G=(V, E), directed or undirected

Source:

Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
73
Q

What are the output(s) of DFS?

Hint: there are 4, and they are all arrays

A
  1. array prev[z]: the parent index of vertex z in the DFS visitation
  2. array pre[z]: the pre-order number of vertex z in the DFS visitation
  3. array post[z]: the post-order number of vertex z in the DFS visitation
  4. array ccnum[z]: the connected components number of vertex z. Can also be the strongly connected component number (SCC) IF the vertices are passed in topologically sorted, i.e. as highest post number –> lowest post number AFTER running DFS on a REVERSED DIRECTED GRAPH. For a directed graph that is processed this way, the ccnum will also be the topological sort order in reverse.

Source:

Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
74
Q

What is the runtime of DFS?

A

O(n + m)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
75
Q

DFS can be used in an undirected graph to determine how many components there are? (T/F)

A

True. This would come from the max value for the ccnum

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
76
Q

DFS can be used in an directed graph to determine how many components there are? (T/F)

A

False. This only works for UNDIRECTED graphs. For a directed graph, you would need to use something like SCC to do this.

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
77
Q

How do pre/post order numbers behave on a DIRECTED graph?

A

They give information on how a graph COULD be explored. Some numbers might be interchangeable, others might not (see DPV 3.3(d) for a good example f this).

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
78
Q

How do pre/post order numbers behave on an UNDIRECTED graph?

A

They give information on how a graph WOULD be explored GIVEN a particular starting point. A different root node might give an entirely different pre/post order numbering.

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
79
Q

In DFS, the ccnum in a directed graph only works if you always explore from a sink vertex if all previous sink SCCs were removed? (T/F)

A

True. Keep in mind that a source vertex will be able to reach all connected vertices and will thus be more or less useless.

TODO: Need to review the SCC section again, as this still isn’t totally clear to me.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
80
Q

A DAG will only have one valid topological ordering? (T/F).

A

False. See DPV 3.3(d) for a good example of why this is the case. TL;DR version: post order numbering on a DIRECTED graph provides information on how a graph COULD be explored; it might admit multiple valid topological orderings.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
81
Q

A DAG will have at least one valid topological ordering? (T/F)

A

True.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
82
Q

A DAG will have at least one valid topological ordering? (T/F)

A

True.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
83
Q

What are the input(s) to the Topological Sort Algorithm?

A
  1. DAG

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
84
Q

What are the output(s) of the Topological Sort Algorithm?

A
  1. array topo[i]: the vertex number of the i’th vertex in topological order from left to right, source to sink, (i.e. in DESCENDING post order number)

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
85
Q

What is the runtime of the Topological Sort Algorithm?

A

O(n + m)

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
86
Q

How does the Topological Sort Algorithm work?

A

It works by running DFS on a DAG and using the post order numbers to sort the vertices from highest to lowest.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
87
Q

What order will post order numbers be returned in from the Topological Sort algorithm?

A

In terms of decreasing post order number, i.e. from Source to Sink, which is equivalent to having all edges from left to right in the DAG.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
88
Q

The topological ordering comes along as a free ride when using the DFS algorithm? Does it need to be manually sorted or not if doing it this way?

A

True. No, it doesn’t require manually sorting. Do an example by hand to see that this is true. (This is also part of why we can achieve an O(n + m) runtime, because if we had to sort that would be a minimum of O(nlogn) )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
89
Q

What are the input(s) to the Explore Algorithm?

A
  1. Graph G = (V, E), directed or undirected
  2. vertex v in V to explore from

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
90
Q

What are the outputs(s) of the Explore Algorithm?

A
  1. array visited[u] is set to “True” for all nodes u reachable from v
  2. Any other data structure that DFS has, if needed

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
91
Q

What is the runtime of the Explore Algorithm?

A
  1. O(m) if run as part of DFS
  2. O(n + m) if run by itself and visited[] array needs to be created

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
92
Q

The runtime of the Explore Algorithm is O(n + m)?

A

True, although not a comprehensive answer. This is True IF explore is run by itself which requires the visited[] array to be created. If run as part of DFS, then explore is O(m)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
93
Q

All of the outputs from DFS are available to the Explore algorithm as well? (T/F)

A

True. (The ccnum, prev, pre, and post, are all actually set in this subroutine; see DPV 3.2.2)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
94
Q

When might it be better to use Explore as our black-box algorithm rather than the full DFS algorithm?

A

If we only need information about a single source vertex, then it’s going to be better to use explore since it provides us with all the same outputs as DFS, but can isolate to only the source vertex of interest instead of running over all vertices)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
95
Q

The explore algorithm runs on all of the vertices and edges that are reachable from the input vertex v? (T/F)

A

True.

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
96
Q

What is the runtime of the Floyd-Warshall algorithm?

A

O(n^3)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
97
Q

Compare Dijkstra vs. Bellman Ford vs. Floyd-Warshall Algorithm. When might we want to use each one? What are some of the pros and cons?

A

All of these are examples of algorithms for FINDING SHORTEST PATH(s)

  1. Dijkstra: simple algorithm, and the fastest option at O( (n + m)*log(n)); however, can only use it IF all the edge weights are positive)
  2. Bellman-Ford: Good to use if we have negative edge weights and are only interested in the Single Source Shortest Path (SSSP), which gives us a runtime of O(nm).
  3. Floyd-Warshall: Good to use if we have negative edge weights and are interested in All Pair Shortest Path (APSP); the tradeoff is that this is the slowest order at O(n^3)

Joves’ notes also mention that while Bellman-Ford and Floyd-Warshall are both algorithms available to us as black-boxes, they are technically DP solutions, so think twice about whether using them is really warranted before you go off writing a solution using them).

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
98
Q

What are the input(s) to the Strongly Connected Components (SCC) algorithm?

A
  1. Graph G=(V, E), directed. (Crucial to remember that SCC is for DIRECTED graphs!!!)

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
99
Q

What are the output(s) of the Strongly Connected Components (SCC) algorithm?

A
  1. G_scc = (V_scc, E_scc), where G_scc is a meta graph (a DAG) with each SCC in G forming a vertex in V_scc and each edge between SCCs in E_scc
  2. ccnum[.] comes from DFS used underneath the hood of this algorithm
  3. V_scc will look like 1, 2, 3, 4… which are the ccnums
  4. E_scc will look like (1, 2), (2, 3), (3, 4), … which are edges between V_scc

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
100
Q

If we run SCC on some graph G, and we have two vertices u and v that we want to find which SCC u and v are in, how might we do this?

A

Look up ccnum[u] and ccnum[v]

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
101
Q

Your examining some code that uses the SCC algorithm, and you notice a line with the conditional statement “if ccnum[u] == ccnum[v]: “. What is this conditional statement checking for?

A

It is checking to see if u and v are part of the same strongly connected component.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
102
Q

What is the runtime of the SCC algorithm?

A

O(n + m)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
103
Q

What algorithm is used under the hood of SCC? How many times is it run? Why is it run multiple times?

A

DFS. Twice. Because we have to run on both G and the reverse of it G_r

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
104
Q

Every directed graph is a DAG of its strongly connected components? (T/F)

A

True.

See property on DPV 3.4.1, pg. 92.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
105
Q

What is the proper conception of the idea of connectivity on a DIRECTED graph?

A

Two nodes u and v of a directed graph are connected if there is a path from u to v AND a path from v to u.

See DPV 3.4.1, pg. 91.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
106
Q

Describe how the SCC algorithm works?

A
  1. Run DFS once with pre/post order numbering on reverse graph G_r
  2. Sort V in descending post order number (This gives sink to source)
  3. Run DFS again on G with V sorted
  4. Output will have ccnum representing SCC with higest = source, lowest = sink
  5. Use the ccnum to gather up vertices belonging to each SCC
  6. Check all the original edges and their endpoints. If the endpoints are in different SCCs, and a corresponding E_scc between those SCCs does not already exist, we add a E_scc to represent the edge from one SCC to another.

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
107
Q

In the SCC algorithm, the output with the highest ccnum is a ______ and the lowest is a ______?

A

Source; Sink

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
108
Q

What are the input(s) to the 2SAT algorithm?

A
  1. Conjunctive normal form (CNF) formula f, which contains n variables (x1, x2, … x_n), m clauses (x1 v x2) ^ (x3 v !x4)…

Each clause has at most two literals (and for this course, we can assume that it is EXACTLY two literals).

Source: Joves’ Notes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
109
Q

For the 2SAT algorithm, any clause with one literal can be satisfied and removed leaving us with a CNF f’ of only clauses with exactly two literals? (T/F)

A

True

110
Q

For the 2SAT algorithm, we can assume that we are working with clauses that contain EXACTLY two literals? (T/F)

A

True

111
Q

What are the output(s) of the 2SAT algorithm?

A
An assignment (T or F) for each variable in f if it can be satisfied.
Otherwise, 'NO' if it cannot be satisfied

Source: Joves’ Notes

112
Q

What are the input(s) to Kruskal’s algorithm?

A
  1. Graph G=(V, E), connected, undirected; edge weights w
113
Q

What are the output(s) of Kruskal’s algorithm?

A
  1. A minimum spanning tree (MST) defined BY THE EDGES E_mst
114
Q

What is the runtime of Kruskal’s algorithm?

A

O(m log m) or simplified to O(m log n)

115
Q

Describe how Kruskal’s algorithm works?

A
  1. Sort edges by weight
  2. Grab lightest available edge THAT WONT CREATE A CYCLE. (If multiple edges with the same weight are available, it doesn’t matter which one we pick, so long as it doesn’t create a cycle.)
  3. Keep doing this until all edges that will not create a cycle have been added (which will happen at exactly n - 1 edges)
116
Q

At what point will we stop adding edges in Kruskal’s algorithm

A

When we’ve added exactly n - 1 edges (which is the definition of a tree)

117
Q

When might we want to use Kruskal’s algorithm as one of our black-boxes?

A

It is one of the more common MST algorithms, so if we think we need MST, it’s a good one to start with. The output gives us a set of edges which is useful to construct the next input for a black box, or to compare to the original G. Unlike Prim’s, it does not give us any path information.

118
Q

What are the input(s) to Prim’s algorithm?

A
  1. Graph G=(V, E), connected, undirected; edge weights w
119
Q

What are the output(s) of Prim’s algorithm?

A
  1. A minimum spanning tree (MST) defined BY THE ARRAY prev
120
Q

What is the runtime of Prim’s algorithm?

A

O(m log m) or simplified to O(m log n)

121
Q

Describe how Prim’s algorithm works?

A
  1. Start with arbitrary vertex v and put it into a subtree S of included vertices
  2. In each iteration, grow S by adding the LIGHTEST EDGE between a vertex in S and a vertex outside of S
  3. Continue until all vertices are in S (this will happen at exactly n - 1 edges)
122
Q

What other black-box algorithm is Prim’s algorithm similar to? Why?

A

Dijkstra’s algorithm. It’s similar in that it can give us path information in the MST output.

123
Q

What is one reason we might pick the less commonly used Prim’s algorithm over Kruskal’s algorithm?

A

Prim’s algorithm gives us path information as part of the MST output, so if we need path information, that might be one reason to pick it over Kruskal’s.

In general, if a problem is asking about MSTs, it’s probably best to start by thinking about using Kruskal’s; ask yourself if you REALLY need path information. If so, Prim’s is a good option, but it’s definitely the less common scenario.

124
Q

Prim’s algorithm provides us with edge information as part of the MST output? (T/F). Why or why not?

A

False. Because the output of Prim’s algorithm is the array “prev” that contains the paths of the MST, whereas Kruskal’s algorithms gives us the edges of the MST in the form of E_mst.

125
Q

What is the difference in the output(s) between Kruskal’s and Prim’s algorithm for MSTs?

A

Kruskal’s outputs the MST EDGES E_mst, whereas Prim’s outputs the MST PATHS.

126
Q

What are the input(s) to the 2-SAT algorithm?

A
  1. Conjunctive Normal Form (CNF) formula “f” that contains:
    n variables [x1, x2, …, x_n]
    m clauses (x1 v x2) ^ (x3 v !x4) …
    Each clause has at most 2 literals
127
Q

For the 2-SAT algorithm, we can assume we will always be working with clauses with exactly two literals? (T/F) Why or why not?

A

True. Because any clause with one literal can be satisfied and removed leaving us with a CNF f’ of only clauses with EXACTLY two literals

128
Q

What are the output(s) of the 2-SAT algorithm?

A
An assignment (T or F) for each variable in f if it can be satisfied;
"NO" otherwise
129
Q

What is the runtime of the 2-SAT algorithm?

A

O(n + m)

130
Q

Each variable in the 2-SAT algorithm is made up of ____ literals: ____ and _____?

A

two; x and !x

131
Q

If we’re using the 2-SAT algorithm and we have some formula that contains 10 variables, how many literals will we have? Why?

A

There would be 20 literals, because if there are n boolean variables then there will be 2n literals, for example:
[x1, !x1, x2, !x2, … x_n, !x_n]

132
Q

In the 2-SAT algorithm, variables and literals are provided separately? (T/F) What is one important consequence of this?

A

False. They are part of the formula. The consequence is that finding all variables is O(m) because we have to look at all clauses and list them all.

133
Q

What does the “k” in K-SAT mean?

A

It denotes the maximum number of literals in each clause (if there is no limit, then we simply call it a SAT problem)

134
Q

FOR k-SAT problems, if k >= 3 then it is a(n) _________ problem?

A

NP-Complete

135
Q

FOR k-SAT problems, if k=2 then it is a(n) _________ problem?

A

P (i.e. Polynomial)

136
Q

At a high level, what is the 2-SAT algorithm doing?

A

It transforms a 2-SAT Conjunctive Normal Form(CNF) formula into a graph and solves it

137
Q

How does the 2-SAT algorithm work?

Hint: there are 7 steps

A
  1. Create a graph G from f by mapping the CNF to 2n vertices (one for each literal) and 2m edges (one for each implication)
  2. Run the SCC algorithm on G
  3. If for all x in X, x and !x are in DIFFERENT SCCs then the CNF is satisfiable, else return “NO”
  4. Set all SOURCE literals to “FALSE” and remove the SOURCE SCC
  5. Set all SINK literals to “TRUE” and remove the SINK SCC
  6. Repeat steps 4 and 5 until all literals are set
  7. Return the variable assignments
138
Q

When creating the graph for a 2-SAT problem from the CNF, we use ____ number of vertices and ____ number of edges? Why is it this number?

A

2n vertices (one for each literal) and 2m edges (one for each implication).

139
Q

What is the time complexity to ACCESS edge WEIGHTS in an (n x n) matrix?

A

O(1)

140
Q

What is the time complexity to WRITE edge WEIGHTS in an (n x n) matrix?

A

O(1)

141
Q

What is the time complexity to ITERATE over edge WEIGHTS in an (n x n) matrix? Why is this?

A

For this class, TAs have said we can assume O(m).

142
Q

What sort of data structure can we use to store edge WEIGHTS in (given we already have an adjacency list representation of the graph G)? How do we do this?

A

An (n x n) matrix)

We do this by:

  1. Initializing an (n x n) matrix # This is O(1) because cost is only incurred when actually writing values
  2. Set each edge weight in the table # This is O(1) * O(m) = O(m)

By only setting values where there is an edge, we avoid the typical O(n^2) cost of working with an (n x n) table

143
Q

What functions are required for the Union by Rank (Disjoint Sets) data structure? What are the runtimes of all of them?

A
  1. makeset(x): create a singleton set containing just x # O(1)
  2. find(x): to which set does x belong? # O(log n)
  3. union(x, y): merge the sets containing x and y # O(log n)
144
Q

What is the maximum height allowed of any tree in a Union by Rank (Disjoint Sets) data structure? What is one important consequence of this?

A

log(n). This fact means that it is also the upper bound on the running time of the “find” and “union” operations used by this data structure, i.e. O(log n)

145
Q

If an element in a disjoint set has a parent pointer that has a self-loop, what can we conclude?

A

That that element is the root of the tree

146
Q

How should we interpret the “rank” of a node in a disjoint data set?

A

It is the height of the subtree hanging from that node

147
Q

Per the DPV textbook, what is “Property 1” of the Union by Rank (Disjoint Set) data structure?

A

For any x, the rank(x) < rank(pi(x))

where pi here should be interpreted as a function that returns the parent pointer of x.

A better way of looking at this is that the height of the subtree hanging beneath x (which is the definition of “rank”) will always be strictly shorter than the height of the subtree hanging from the parent of x. (Which should be a pretty obvious statement - one rung higher up the ladder is, well… one rung higher up on the ladder.)

148
Q

Per the DPV textbook, what is “Property 2” of the Union by Rank (Disjoint Set) data structure?

A

Any root node of rank k has at least 2^k nodes in its tree

149
Q

For Union by Rank (Disjoint Set) data structures, if there is a root node with a rank of 3, then its tree will have AT LEAST how many nodes?

A

2^k = 2^3 = 8

This is from property 2 for this data structure (see DPV 5.1.4 and Joves’ notes)

150
Q

For Union by Rank (Disjoint Set) data structures, there is a root node with rank k, then what can we deduce?

Hint: use property 1

A

That it was created by the merger of two trees with roots of rank k - 1

151
Q

Per the DPV textbook, what is “Property 3” of the Union by Rank (Disjoint Set) data structure?

A

If there are n elements overall, there can be at most n / 2^k nodes of rank k

Crucially, this implies that the max rank is log(n), therefore all the trees have height <= log(n), which is the upper bound on the running time of the “find” and “union” operations.

152
Q

For Union by Rank (Disjoint Set) data structures, if we have n=16 elements in total, then there could be AT MOST how many nodes of rank 3?

Hint: use property 3

A

Since we’re interested in how many nodes there could be with rank 3, this means k = 3. Using property 3, we know that n_k <= n / 2^k, i.e.:

16 / 2^3 = 16 / 8 = 2, therefore final answer is there could be at most 2 nodes with rank 3.

153
Q

What is the runtime of the Bellman-Ford algorithm?

A

O(nm)

154
Q

What is the runtime of the Floyd-Warshall algorithm?

A

O(n^3)

155
Q

What are three shortest paths black box algorithms available to us? List some of the pros and cons of each.

A
  1. Dijkstra’s algorithm: If we need simple algorithm for non-negative, weighted SSSP
    Pros: Fast, simple
    Cons: Does NOT work for negative weights
  2. Bellman-Ford: if we need to get the Single Source Shortest Path (SSSP) on a graph that includes negative weights
    Pros: works with negative weights, faster than Floyd-Warshall [O(nm) vs O(n^3)]
    Cons: only provides answer for single source
  3. Floyd-Warshall: if we need to get the All Pair Shortest Path (APSP) on a graph that includes negative weights
    Pros: works with negative weights, gives info for all vertices
    Cons: slowest option of the three [O(n^3)]
156
Q

Dijkstra’s is an example of a ______ algorithm, whereas Bellman-Ford and Floyd-Warshall are examples of ______ algorithms?

A

Greedy; dynamic programming

157
Q

The explore algorithm gives us a list of vertices reachable from source vertex s? (T/F)

A

True. This is specific to explore without any additional details. We (the TAs) know this is an output and we simply read it as such.

Source: Joves’ notes

158
Q

The DFS algorithm gives us a list of vertices reachable from source vertex s? (T/F) Why or why not?

A

False. Without additional details, this is NOT something that DFS gives us. The visited array DFS returns is going to be all ‘True’ by the end of each run.

VERY IMPORTANT TO REMEMBER THIS.

159
Q

You can always rely on DFS to give you the vertices that are reachable from a source vertex s? (T/F) Why or why not?

A

False. Without additional details, this is NOT something that DFS gives us. The visited array DFS returns is going to be all ‘True’ by the end of each run.

VERY IMPORTANT TO REMEMBER THIS.

160
Q

Explore and DFS can both get us a path from s? (T/F) How do we do this, or if not possible, why can’t it be done?

A

True. For explore, we simply run ‘explore(s)’ For DFS, we “start” it from s by putting s at the head of the list, so we would run DFS(G, s)

161
Q

How can we use DFS to find all reachable vertices from s?

A

for v in V, find all v with the same ccnum as s. These are reachable by s.

This is a detail you need to give, because it is NOT a free output of DFS!

Source: Joves’ notes

162
Q

In a DAG, every vertex leads to a vertex with a _____ post number?

A

lower

See DPV 3.3.2 pg. 90

163
Q

What is property 1 of the SCC algorithm?

A

If the explore subroutine is started at node u, then it will terminate precisely when all nodes reachable from u have been visited

See DPV 3.4.2 pg. 92

164
Q

What is property 2 of the SCC algorithm?

A

The node that receives the HIGHEST post number in a DFS must lie in a SOURCE SCC

See DPV 3.4.2 pg. 92

165
Q

What is property 3 of the SCC algorithm?

A

If C and C’ are SCCs, and there is an edge from a node in C to a node in C’, then the HIGHEST post number in C is bigger than the HIGHEST post number in C’.

This can be restated as saying that the strongly connected components can be linearized by arranging them in DECREASING order of their highest post numbers.

See DPV 3.4.2 pg. 92

166
Q

In a DAG, each node is a singleton strongly connected component? (T/F) Why or why not?

A

True. This about it: in a DAG, you can ONLY go one direction. You can never go backwards to a previous vertex, so each vertex must be its own SCC.

167
Q

Back edges behave differently than all the other types of edges? (T/F) Why or why not?

A

True. With back edges, the post numbers go UP. For all other types of edges (tree, forward, cross), the post numbers go DOWN.

168
Q

For tree edges, the post order number goes ____?

A

down (and the edge is part of the DFS tree)

169
Q

For forward edges, the post order number goes ____?

A

down (and the edge points down to some non-child descendant in the tree)

170
Q

For back edges, the post order number goes ____?

A

up

171
Q

For cross edges, the post order number goes ____?

A

down (and the nodes are not descendants of each other, i.e. they are siblings or cousins)

172
Q

If a directed graph G has no cycles in it, what is one thing that we can infer?

A

That there are no back edges in the graph (This comes from the cycle property - see GR1_9)

173
Q

If a graph G has a back edge in it, what is one thing we can infer?

A

That a cycle is present in the graph.

174
Q

A DAG might have back edges present? (T/F) Why or why not?

A

False. Because of the cycle property. If a back edge is present, then there MUST be a cycle in the graph. Since DAGs are by definition acyclic, a DAG will never have a back edge.

175
Q

For topologically sorting a DAG, the time to sort the vertices in order of decreasing post number will take O(n log n) time? (T/F) Why or why not?

A

False. This is because when we run DFS for the topological sort algorithm, the clock starts at 1, so the minimum value the post number can take on will be one. The maximum value will be 2n. This means we just initialize an array of size 2n, and stick the vertices in the array as we see them, which just gives us the vertices in decreasing post number. This only takes linear time to do, i.e. O(n + m)

TODO: not 100% sure why the size is 2n and not just n

176
Q

A DAG always has at least one ______ and at least one _____?

A

source; sink

177
Q

What is one property that will always be true of the SOURCE vertex in a DAG?

A

It will always have the HIGHEST post number

178
Q

What is one property that will always be true of the SINK vertex in a DAG?

A

It will always have the LOWEST post number

179
Q

Define connectivity in a DIRECTED graph?

A

Two vertices v and w are connected if there is a path from v –> w AND a path from w –> v

180
Q

What is the definition of an SCC? How does it differ from a regular CC?

A

An SCC is the maximal set of strongly connected vertices in a directed graph.

This differs from a CC on an undirected graph because a CC in that graph is just the maximal set of connected vertices (i.e. the ‘strong’ part doesn’t make sense in the context of an undirected graph. In an undirected graph there is always a path from u –> v and v –> u (obviously by just traversing the same edge in forward/reverse)

181
Q

What is one important property of a metagraph drawn on strongly connected components? Why is this the case?

A

It is a DAG.

Proof by Contradiction: Consider the alternative. If there was a path between an SCC S and another SCC S’ and also a path from S’ → S, then we would have a cycle, and it would mean that we could always get from anywhere in S to anywhere in S’, and vice-versa. This means that union(S, S’) would be an SCC, i.e. they would be in the same SCC, which obviously contradicts that they are in different SCCs.

182
Q

Every directed graph is a DAG of it’s SCCs? (T/F)

A

True. This is because you could never have a cycle between two SCCs S and S’, because if there was, then they would actually be part of the same SCC.

183
Q

What is the main idea of the SCC algorithm?

Hint: There are four key steps

A
  1. Find a sink SCC S
  2. Output it,
  3. Remove it
  4. Repeat
184
Q

For the SCC algorithm, is it better to run it from source SCCs or sink SCCs? Why?

A

Sink SCCs. Because the explore algorithm when run from a sink SCC will ONLY explore the vertices in that SCC and NOTHING ELSE.

185
Q

In a general directed graph G, the vertex with the LOWEST post number always lies in a SINK SCC? (T/F)

A

False. This has important implications for our SCC algorithm, namely it is why we reverse the graph G to invert sources and sinks.

186
Q

In a general directed graph G, the vertex with the HIGHEST post number always lies in a SOURCE SCC? (T/F)

A

True. This has important implications for our SCC algorithm, namely it is why we reverse the graph G to invert sources and sinks.

187
Q

A source SCC in G == sink SCC in G_r? (T/F)

A

True. This is what allows our SCC algorithm to work by inverting sources/sinks.

188
Q

A sink SCC in G == source SCC in G_r? (T/F)

A

True. This is what allows our SCC algorithm to work by inverting sources/sinks.

189
Q

What is the runtime of the SCC algorithm and why is this the case?

A

O(n + m). Because under the hood SCC is simply running DFS twice, so O(n + m) + O(n + m) = 2*O(n + m) –> O(n + m)

190
Q

Why is the SCC algorithm correct?

A

The proof is based on the claim that the vertex with the highest post number always lies in a source SCC. We can prove this by making the simpler claim that for two SCCs S and S’ if there is a path from vertex v in S to vertex w in S’, then max(post(S)) > max(post(S’)), therefore if we topologically sort by max post number, the highest number is guaranteed to be in the source SCC.

It’s easier to think of this visually. If there is a path from SCC S to S’ and we start DFS from some vertex z in S’, then we would visit all of S’ and none of S. Conversely, if we start DFS from S, then the subtree of vertex z would contain the union(S, S’). This is what forces the condition that max(post(S)) > max(post(S’)),

191
Q

In some case, there might be a path from an SCC S to an SCC S’ and a path from S’ –> S? (T/F)

A

False. If there was such a bidirectional path, then both S and S’ would be in the same SCC, which is a contradiction! Remember: a metagraph drawn on strongly connected components will always be a DAG i.e. no cycles will exist.

192
Q

What is the output of DIjkstra’s algorithm?

A

for all v in V, it returns dist(v) = length of shortest path from s –> v

193
Q

Say you have some undirected, non-weighted graph G=(V,E), and you run BFS on it for some vertex s in V, i.e. BFS(G, s). You’re interested in the distance from s to some vertex u. Now a few questions:

(a) if you use a bit of pseudocode “dist(u)”, what will this return?
(b) say you run the pseudocode “dist(u)” and it returns “inf”. What does this mean?
(c) it would be valid to run the pseudocode “prev(u)”? (T/F). If True, what would it return?

A

(a) the min # of edges from s to u (or inf if a path doesn’t exist)
(b) that there isn’t a path from s to u
(c) True. It would return the previous vertex on the path from s to u.

194
Q

What is the runtime of Dijkstra’s algorithm (at least for this class) and what data structure do we use?

A

For this class we assume O((n + m)*log n) using the “min-heap” data structure

195
Q

What are the logical symbols we use for AND and OR?

A

^ == AND (remember that it looks kind of like an “A”

v == OR

196
Q

For what value of k is k-SAT NP-Complete for?

A

For all k >= 3

197
Q

There is no polynomial time algorithm for solving 2-SAT problems? (T/F)

A

False. There is indeed such a polynomial time algorithm. k-SAT problems are NP-Complete for all k >= 3

198
Q

In the context of SAT problems, what is a “unit clause”? What is important about these types of clauses.

A

A clause with only one literal in it, e.g. (!x1) ^ (x1 v x4), the (!x1) is an example of a unit clause.

The important thing about these clauses is that for a unit clause in a CNF, a unit clause MUST be satisfied in order for a solution to the formula to exist. This means that the unit clause imposes a constraint. So in the above example of the unit clause (!x1), this means that it MUST be that x1=False in order for the unit clause to evaluate to True.

199
Q

How can we simplify the inputs for a 2-SAT problem?

A

We can simplify by doing:

  1. Take a unit clause, say literal a_i
  2. Satisfy it (set a_i = T)
  3. Remove clauses containing a_i and drop !a_i
  4. Let f’ be the resulting formula
  5. Repeat the process on f’ until we get down to clauses of size=2
200
Q

For a 2-SAT problem represented as a graph, how many vertices are there?

A

2n vertices, which correspond to all the literals x1, !x1, x2, !x2, …, x_n, !x_n

201
Q

For a 2-SAT problem represented as a graph, how many edges are there?

A

2m edges, which correspond to 2 “implications” per clause

202
Q

For a 2-SAT problem, when representing it as a graph we use a directed or undirected graph? Why?

A

Directed. This is a somewhat hand-wavy answer, but we’re using the SCC algorithm on the graph, which only makes sense in the context of directed graphs.

203
Q

You have some CNF formula f that defines some 2-SAT problem. You draw the graph out, and discover there is a path from x1 –> !x1. What can you conclude from this? Why?

A

That f is not satisfiable. Because we end up with a contradiction where with have to set x1=True and x1=False, which obviously can’t be done.

(It’s easiest to see this by looking at one of the example problems from the lectures, like GR2_7)

204
Q

You have some CNF formula f that defines some 2-SAT problem. You draw the graph out, and discover x1 and !x1 lie in the same SCC. What can you conclude from this? Why?

A

That f is not satisfiable. Because we end up with a contradiction where with have to set x1=True and x1=False, which obviously can’t be done.

(It’s easiest to see this by looking at one of the example problems from the lectures, like GR2_7)

205
Q

You have some CNF formula f that contains the variables x1, x2, and x3. You draw the graph out and note that for all i in {1, 2, 3}, x_i and !x_i lie in DIFFERENT SCCs. A few questions:

(a) How many literals are there?
(b) Is f satisfiable? Why or why not?

A

(a) 2n = 2(3) = 6 total literals

(b) Yes, it is satisfiable because all x_i and !x_i lie in different SCCs.

206
Q

You have some CNF formula f that contains the variables x1, x2, and x3. You draw the graph out and note that for all i in {1, 2, 3} there is at least one value of i for which x_i and !x_i lie in the same SCC. Is f satisfiable? Why or why not?

A

No, it is not. In order for f to be satisfiable all the x_i and !x_i must lie in DIFFERENT SCCs.

207
Q

In the context of 2-SAT problems we set the sink SCCs !S’ = ______ and the source SCC S’ (i.e. the complement) = _____?

A

True; False

208
Q

In the context of 2-SAT problems we set the source SCC S’ = ______ and the sink SCC !S’ (i.e. the complement) = _____?

A

False; True

209
Q

if for all i, x_i and !x_i are in DIFFERENT SCCs then what can we conclude?

A

That S is a sink SCC and !S is a source SCC (and vice-versa)

210
Q

In the context of 2-SAT, if there is a path from alpha –> beta, then there is a path from !beta –> !alpha (and vice-versa)? (T/F)

A

True

211
Q

We run the SCC algorithm on some CNF formula f, and the output returns variable assignments for all the variables in the formula? What can we conclude about the formula?

A

That for all x in X, x and !x lie in DIFFERENT SCCs.

212
Q

We run the SCC algorithm on some CNF formula f, and the output simply returns ‘NO’. What can we conclude about the formula and why is this the case?

A

We can conclude that f is not satisfiable. This must be because there is AT LEAST one x in X for which x and !x lie in the SAME SCC.

213
Q

Say we have a graph that contains multiple subgraphs (i.e. a forest). How would the results of DFS differ from those of BFS on this graph?

A

DFS would restart the explore subroutine once it explored each of the different subgraphs. BFS, on the other hand, would simply ignore nodes that aren’t reachable from the start vertex s (in other words, the distance to nodes that aren’t reachable from s in the rest of the forest will all be inf).

214
Q

BFS only works to find shortest paths on graphs whose edges have unit lengths? (T/F)

A

True.

215
Q

What are the three important properties of trees?

A
  1. Tree on n vertices has n - 1 edges
  2. Exactly one path between every pair of vertices
  3. Any connected graph G=(V,E) with |E|=|V| - 1 is a tree. (This one is especially important for the cut property)
216
Q

What data structure is used in Kruskal’s algorithm?

A

Disjoint Set (Union-by-Rank)

217
Q

Kruskal’s algorithm is an example of a ______ approach?

A

greedy

218
Q

If a subgraph is connected and has n - 1 edges, what would we call that subgraph?

A

A tree

219
Q

What is key idea we should take away from the PROOF of the Cut Property?

A

We can take a tree T and if we add an edge into that tree, e.g. T’ = T ∪ e* - e’, by first adding the edge e* that creates a cycle (because a tree always has n - 1 edges, so if I add an edge into an existing tree, we are guaranteed to get a cycle.) If we then remove any edge from that cycle, we get a new tree T’

220
Q

What is the key idea we should take away from the STATEMENT of the Cut Property?

A

A minimum weight edge across the cut is part of a MST

221
Q

You have a directed graph G and see an edge e=(zw), and that w is a non-child descendant of z. You run DFS and see that the post(z) > post(w); what type of edge is this?

A

Forward edge

222
Q

You have a directed graph G and see an edge e=(zw), and you notice that z and w are not descendants or ancestors of each other. You run DFS and see that the post(z) > post(w); what type of edge is this?

A

Cross edge

223
Q

Per Joves’ comment on Ed, how should we view the output of the SCC algorithm?

A

“You should consider the output of SCC to be just another graph. But each vertex is a ccnum, and each edge uses ccnums.”

Source: https://edstem.org/us/courses/22308/discussion/1589332?comment=3603438

224
Q

For a max flow problem, the problem is not solvable if you have cycles present in your graph? (T/F)

A

False. This is in contrast to the shortest path problem, where having cycles were problematic (at least in the case of things like Dijkstra’s algorithm).

225
Q

When building the residual network for a max flow problem, it is important to remove anti-parallel edges first? (T/F)

A

True.

Source: MF1_13 at end of video

226
Q

In the Ford-Fulkerson algorithm, how do we check if there is a path from s –> t in the residual graph for a max flow problem? If there is no such path, what happens?

A

Use DFS or BFS. If the path doesn’t exist, then the algorithm just outputs the current flow f.

Source: MF1_14

227
Q

What is assumption does the Ford-Fulkerson algorithm make of its inputs that allows it to achieve its specified runtime?

A

It assumes all capacities are integers. More sophisticated max-flow algorithms like Edmunds-Karp eliminate this requirement.

228
Q

How does the assumption that all the capacities are integers allow the Ford-Fulkerson algorithm for max-flow allow it to achieve its running time?

A

If we assume that all capacities are integers, then that means flow increases by >= 1 unit per round. So if C = size of the max flow, then we know there will be <= C rounds of the algorithm run.

229
Q

What step in the Ford-Fulkerson algorithm dominates the runtime? What does the final runtime end up being and why?

A

Finding a path from s –> t using DFS or BFS in each round of the algorithm. Since for Ford-Fulkerson we worst case run <= C rounds, where C=size of the max flow, we end up with a final runtime of O(mC)

230
Q

The Ford-Fulkerson algorithm is a strictly polynomial time algorithm? (T/F) Why or why not?

A

False. It is pseudo polynomial. This is because the runtime is O(mC), where C=size of the max flow. Making the runtime dependent on the input value for C makes the problem pseudo-polynomial (like the knapsack problem).

231
Q

When does the Ford-Fulkerson algorithm terminate?

A

When it can no longer augment the path in the residual graph G_f*, i.e. when there is no longer a path from s –> t

232
Q

You are running the Ford-Fulkerson algorithm on a max-flow problem, and it suddenly terminates. What can we conclude? What do we know about the flow f* that it outputs?

A

That there is no longer an augmenting path in the residual graph G_f. Based on this we know that f is a max-flow.

Source: See MF2_2

233
Q

What are the input(s) to the Ford-Fulkerson max-flow algorithm?

A
  1. Flow network: directed graph G=(V,E), with start and end vertices s,t in V and with capacities c_e > 0

Note that for Ford-Fulkerson it requires that all capacities are integer values as well.

234
Q

What are the output(s) to the Ford-Fulkerson max-flow algorithm?

A
  1. Flow f* of max size.
235
Q

In the context of max-flow, how is the SIZE of a flow f defined?

A

It’s the amount of flow either out of start vertex s or in to end vertex t (which have to be equal because of conservation of flow, i.e. size(f) = f_out(s) = f_in(t) )

236
Q

In the context of the “min-cut problem”, what is an “st-cut”?

A

It is a cut that partitions the vertices V into two sets such that s is in L and t is in R

237
Q

How is the capacity of an st-cut defined?

A

The capacity(L,R) = sum of the edge capacities from L–> R

238
Q

When looking at the capacity of an st-cut, we sum the edge capacities from R –> L? (T/F)

A

False. We sum from L –> R

239
Q

Checking that there is no path from s–>t in the residual flow network G_f* to find the max flow only works for the Ford-Fulkerson algorithm? (T/F)

A

False. Checking for the path is actually the second part of the proof for the max-flow = min-cut theorem, so any algorithm that uses this property is guaranteed to be correct because of that general theorem.

240
Q

Compare and contrast the Ford-Fulkerson algorithm and Edmonds-Karp algorithm for max-flow.

A

Ford-Fullkerson:

  1. Find augmenting paths using DFS or BFS
  2. Runs in O(mC) time, where C is size of max flow
  3. Assumes integer capacities

Edmonds-Karp:

  1. Find augmenting paths using BFS
  2. Runs in O(n*m^2) time
  3. No assumptions on capacities
241
Q

What is the main difference under the hood between the Ford-Fulkerson and Edmonds-Karp algorithm for max-flow?

A

Edmonds-Karp uses BFS to search for an st-path in the residual flow network graph, whereas Ford-Fulkerson can use DFS or BFS.

242
Q

Say you’ve drawn a graph G’ for a max flow problem that includes all the extra edges required to incorporate demand into the problem. A few questions:

(a) What could you conclusively say about the vertices s’ and t’ that we “hide” the original graph inside of?
(b) say the size of the flow f

A

We would know for certain that at a minimum, the max flow would be upper bounded by the flow OUT of s’ or by the flow IN to t’.

243
Q

When is a flow “saturated”? Why are saturating flows so important?

A

If we’ve fully capacitated all the edges out of s’ and into t’, then the flow is saturated.

They are important because if G’ has a saturating flow, then G has a FEASIBLE flow, and VICE-VERSA.

244
Q

Say you are looking at some flow network graph G’. You find a flow f’ that fully capacitates the edges out of s’. What does this imply about the original flow network graph G?

A

What has happened is that you’ve found a saturating flow, which implies that G has a feasible flow.

245
Q

Say you are looking at a homework problem of some flow network graph G and someone tells you that they know that G has a feasible flow. What does this imply about G’ ?

A

That G’ has a SATURATING flow.

246
Q

What is the runtime of the Edmonds-Karp algorithm?

A

O(n*m^2) time

247
Q

Max-flow algorithms always use a _________ graph?

A

directed

248
Q

What are the input(s) to a max-flow problem?

A

G=(V,E), c, s, t

c: capacities such that c(u,v) > 0 for (u,v) in E
s: source vertex
t: sink vertex

249
Q

What are the output(s) of the max-flow algorithms?

A

f*

f is a function or vector such that f(u, v) gives a flow of (u,v). It gets updated as the algorithm runs

f* is the MAX-FLOW across G such that sum[f*(…, t)] is maximized

size(f) is the numerical value of flow f calculated as sum[f*(…, t)]

250
Q

You have the output f from some intermediate run of a max-flow algorithm. You lookup the flow across edge e=(A,B) that has a capacity c=6 by running f(A, B), and see that the flow across e at the current iteration is also 6. What can you say about the edge from A to B? Why is this the case?

Hint: Remember that f is a function or vector that returns the flow for some edge e=(u,v)

A

That is is saturated. Because the current flow across the line is equal to the max capacity of the line.

251
Q

What do we mean (precisely speaking) when we say we are maximizing flow?

A

f* is the max flow across G such that ∑f(., t) is maximized; f is what gets returned from our max-flow algorithms.

Key Idea:
All this is saying is that this function of all flow into the sink vertex is what gets maximized.

252
Q

For a max-flow problem, what does size(f) return?

A

tl;dr: size(f) = max-flow into the sink t, which is the overall capacity C of the flow network.

size(f) is numerical value of flow f calculated as ∑f*(., t);
equivalently, size(f) = C, i.e. we define the max-flow into the sink t as the overall capacity C of the flow network

253
Q

Ford-Fulkerson algorithm for max-flow must always use DFS for checking for a path from s –> t? (T/F).

If DFS is the only valid choice, why is that? If not, what are the alternatives?

A

False. Ford-Fulkerson can be thought of as more of a template algorithm for solving max-flow problems, so how we check for that path is left undefined.

Options for path checking then could be: DFS, Explore, or BFS.

254
Q

We’re running a max-flow algorithm, and at the end of one of the iterations the algorithm finds that there is still a path from s –> t in the residual flow network G_f. What can we conclude about the residual network, and also about the original flow network G?

A

It means that there must still be SPARE CAPACITY along the path from s –> t. This means that our algorithm won’t terminate yet, because as long as there is a path from s –> t, then we haven’t yet found a max-flow for the original flow network G.

255
Q

A graph may have multiple min st-cuts? What is one important result of this? (T/F)

A

True. As a result, there can be multiple valid solutions to a max-flow problem.

256
Q

If you tried to run the Ford-Fulkerson algorithm on a flow network with non-integer capacities, what might happen?

A

The algorithm might not terminate.

257
Q

For the Ford-Fulkerson algorithm, there might be rounds during the iterative runs of the algorithm where the flow does not increase? (T/F) What is the implication of the answer?

A

False. The flow is guaranteed to increase by >= 1 per round.

The implication then is there are at most C rounds, where C is the size of the max flow.

258
Q

Why is the Ford-Fulkerson algorithm only pseudo-polynomial?

A

Because it runs in O(mC) time, but C (the size of the max flow) has NOTHING to do with the size of the input.

Think about it. The value for C is arbitrary; specifically it is a magnitude. But it doesn’t have anything to do with the size of the input flow network G. Since the runtime isn’t strictly a function of the input size then, we say it is pseudo-polynomial.

Source: OH Week #6, 1:03:30

259
Q

Is the Ford-Fulkerson algorithm strictly slower than the Edmonds-Karp algorithm? (T/F) Why or why not?

A

False, because Ford-Fulkerson is pseudo-polynomial. One situation where it could be faster is if C is bounded by n or m. In that case you could argue that since C is at most n or m, then C=n=m, so instead of the usual runtime of O(mC), it would be O(mn) or O(m^2), as opposed to O(nm^2) for Edmonds-Karp.

This is one of those really subtle things that I suspect could be tested on an exam, so remember this one!

260
Q

Between Ford-Fulkerson and Edmonds-Karp algorithm for max flow, which one is faster? Why?

A

It depends. Edmonds-Karp is generally considered faster, FOR UNBOUNDED CAPACITIES. This is because of the pseudo-polynomial nature of Ford-Fulkerson. In the case where C is bounded to be at most n or m, then Ford-Fulkerson could be faster.

261
Q

You’re given a max-flow problem, and some oracle tells you that they know for certain that Ford-Fulkerson will be faster on this problem than Edmonds-Karp. What can you deduce from this about the problem? Why?

A

That the size of the max flow C is bounded by n or m.

Because of the pseudo-polynomial nature of Ford-Fulkerson. In the case where C is bounded by n or m, we’ve fixed C so that it is no longer a magnitude and is instead directly bounded by the size of the input.

262
Q

What is a “bridge” in a graph? What type of graphs do they apply to?

A

Any edge that disconnects the graph. They apply to undirected graphs.

263
Q

What would be an equivalent way of running the topological sort algorithm on a DAG?

A

Run DFS on the graph and then order the post nums in DESCENDING order (i.e. largest to smallest).

Note: In practice, just say “Run topological sort on G”. I really just created this question to reinforce the conceptual connection between topological sorting and post numbers from DFS runs.

264
Q

The vertex with the highest post order number and lowest post order number are guaranteed to be the ONLY source and sink in a DAG? (T/F)

A

False. There may be multiple sources and sinks.

265
Q

The 2SAT algorithm is O(n + m) for any number of literals? (T/F)

A

False. It is only O(n + m) if the problem can be reduced to where each clause has at most 2 literals.

266
Q

You have a directed graph G and run the SCC algorithm. You find that the G_scc has the same number of vertices as the original graph G. What can we conclude about the graph?

A

If the number of SCCs = n in the original graph, then there must not have been any cycles present in the graph. Hence every vertex in the original graph has now become a singleton SCC in G_scc.

267
Q

You have a directed graph G and run the SCC algorithm. You find that the G_scc has fewer vertices than the original graph G. What can we conclude about the graph?

A

There must have been at least one cycle present such that at least or more of the nodes were compressed into the same SCC.

268
Q

After running the SCC algorithm, what would the runtime be to count how many SCCs you have? Why?

A

O(n). Because in the worst case if no cycles were present in the original graph then every vertex in G would become a singleton SCC in G_scc, so in worst case you would have to count n vertices.

269
Q

You can use BFS to check if some vertex s can reach another vertex v? (T/F). If you can, how would you do this?

A

True. You would run, BFS(G, s) and then check if dist[v] = inf. If dist[v] != inf, then a path exists, otherwise False. (We check for inf because this is how the distances are initialized in BFS.)

270
Q

At a high level, what is the “cycle property” of an MST?

A

The heaviest edge of any cycle can not be part of any MST.