CS6515_Exam2 Flashcards
If you are given an UNDIRECTED graph you can only find ________ _________?
connected components
If you are given a DIRECTED graph, you can only find ________ ________ _______?
strongly connected components
When given an undirected graph, one of the best things to do first is look for strongly connected components? (T/F)
False. The concept of strongly connected components doesn’t exist for undirected graphs.
In the 06/15 Office Hours, Joves’ mentions big penalties will be applied for things like this.
How do you find sources and sinks on an undirected graph?
Trick question. Sources and sinks don’t exist on undirected graphs. They only make sense in the context of directed graphs.
What are some important properties/characteristics of Directed Acyclic Graphs (DAGs)?
Hint: DPV textbook mentions there are 3 of them
- A directed graph has a cycle if and only if its DFS reveals a back edge
- In a DAG, every edge leads to a vertex with a lower post number
- Every DAG has at least once source and at least one sink
What is a source node?
A node with in-degree of 0, i.e. no incoming edges
What is a sink node?
A node with out-degree of 0, i.e. no outgoing edges
If your graph has no weights, then you can’t calculate a minimum spanning tree? (T/F) Why or why not?
This is really a trick question, the answer can be True or False, depending on perspective. In reality, MSTs only really make sense on a weighted graph. For an unweighted graph, the concept of “minimum” isn’t well-posed, so technically ANY spanning tree on an unweighted graph is minimal, but it’s essentially a useless definition (consider that it would also be a minimum spanning tree == maximal spanning tree in the unweighted case; a distinction without a difference).
What are the 9 algorithms (at minimum) that we are allowed to use as black-boxes for HW/Exams?
- Dijkstra’s
- BFS
- DFS
- Explore (subroutine of DFS, but allowed to use as standalone algo)
- Strongly connected components (SCC) [Joves’ mentions this one as being especially important]
- 2SAT Algorithm
- Kruskal’s Algorithm
- Prim’s MST Algorithm
- Anything else mentioned in the Lectures and/or Textbook
Source: 06/15 Office Hours
We should NEVER modify the black-box graph algorithms for the HW/Exam? (T/F)
True. Joves’ emphasizes this multiple times in 06/15 Office Hours.
The strongly connected components algorithm is analogous to what other algorithm?
SCC uses multiple runs of DFS, with some modifications to the input and output variables.
DFS/BFS do NOT consider edge weights? (T/F)
True.
What is the runtime of DFS/BFS?
O(n + m), i.e. one of the fastest options we have, but with the caveat that if we need to consider edge weights, then that adds additional time complexity.
What are the runtimes of all the key black-box graph algorithms?
- BFS: O(n + m)
- DFS: O(n + m)
- Dijkstra’s: O((n + m) * log n)
- Explore: O(m) if part of DFS; O(n + m) if standalone
- Topological Sort : O(n + m)
- SCC: O(n + m)
- 2-SAT: O(n + m)
- Kruskal’s MST Algo: O(m log n)
- Prim’s MST Algo: O(m log n)
- Ford-Fulkerson: O(mC)
- Edmonds-Karp: O(nm^2)
- Bellman-Ford: O(nm)
- Floyd-Warshall: O(n^3)
We should ALWAYS use adjacency list representation for graphs in this class, and NEVER use adjacency matrices? (T/F)
True. Joves’ mentions in office hours that the gains from using the matrix form are minimal, and if implemented incorrectly can cause runtimes to massively explode.
What is the runtime to index into a vertex in a graph?
For this class, we assume O(1)
What is the runtime to find an edge in a graph? Why?
O(m). Because we have to scan over the edges in the adjacency list up to m edges.
For exams, we need to explicitly define that n = |V| and/or that m = |E|? (T/F)
False. Joves’ says in 06/15 Office Hours we can just assume this as understood without explicitly writing down.
To traverse a graph or find all edges, what is the runtime?
O(n + m)
Why is traversing a graph or finding all the edges O(n + m) and not O(nm)?
Because if it was O(nm), that would imply that EVERY vertex is connected to EVERY edge. This is obviously not the case. Also, the runtime for BFS/DFS is O(n + m), which allows us to traverse all the edges in that amount of time.
Joves’ mentions a lot of students make this mistake.
As mentioned in lectures/OH/textbook, what is “Property 1” of trees?
Removing a cycle edge cannot disconnect a graph.
As mentioned in lectures/OH/textbook, what is “Property 2” of trees?
A tree on n nodes has n - 1 edges (i.e. the definition of a tree)
As mentioned in lectures/OH/textbook, what is “Property 3” of trees?
Any connected, undirected graph G = (V, E) with m = n - 1 edges is a tree. (This is the converse of Property 2.)
As mentioned in lectures/OH/textbook, what is “Property 4” of trees?
An undirected graph is a tree if any only if there is a unique path between any pair of nodes.
If there is more than one path between any pair of nodes, then there is a cycle
If there is no path between any pair of nodes, then it is not connected.
What is the “Cut Property”?
The Key Idea: Any edge which is minimum weight across a cut is going to be part of SOME MST.
More Precisely:
Considers UNDIRECTED graph G=(V,E)
- Take a subset of edges X ⊂ E, where X ⊂ T, where T is a partial solution (i.e. it is part of some MST) that we assume is correct so far
- Take a subset of vertices S ⊂ V where no edge is in the cut (S, S’)
- Look at all edges of G in cut (S, S’)
- Let e* be min weight edge in cut (S, S’)
- Then: X ∪ e* ⊂ T’, where T’ is a new MST
See DPV 5.1.2, also lecture video GR3_9-10.
TODO: Finish office hours for 06/15
TODO: Start watching around the 1hr 15min mark.
What are some important abstract data types and their associated data structures?
Abstract Type —> Data Structure
- Stack —> Array, Linked List
- Queue —> Array, Linked List
- Priority Queue —> Heap
- Dictionary / Hashmap —> Array
Source: Udemy Algorithms and Data Structures in Python lectures
What is the runtime of accessing the i’th element in an array/list?
O(1). This is because items are stored adjacent to one another in RAM.
What is a “Tree Edge”?
Anytime we visit a new vertex via an edge, we form a Tree Edge.
What is a “Forward Edge”?
An edge that goes from a node to a DESCENDANT in the tree.
What is a “Backward Edge”?
An edge that goes from a node to an ANCESTOR in the tree.
What is a “Cross Edge”?
Any other edge that isn’t a Tree, Forward, or Backward edge, i.e. non-ancestry relationship.
(I think thinking of them as siblings or cousins is a helpful way of distinguishing, because neither of the nodes are descendants/ancestors of one another in a cross-edge relationship.)
You have a directed graph G and see an edge e=(zw), and that w is a direct child of z. You run DFS and see that the post(z) > post(w); what type of edge is this?
Tree edge (or Forest Edge, more generally)
You have a directed graph G and see an edge e=(zw). You run DFS and see that the post(z) < post(w), what type of edge is this?
Back edge
Note: This was actually one of our poll questions.
If the post order number is going DOWN, what type of edge(s) might this be?
Tree edge, forward edge, or a cross edge. If the post order number is going down and the relationship is a direct parent/child, then it’s a tree edge. Otherwise it must be skipping down generations to a descendant, in which case it is a forward edge. If there’s no ancestor relationship (i.e. the nodes are siblings or cousins), then it is a cross edge).
If the post order number is going UP, what type of edge(s) might this be?
Back edge
What is a tree?
A tree is an undirected graph that is connected and acyclic.
See DPV pg. 129 for good info on tree properties.
If we have an undirected graph that is connected and acyclic, what is this called?
A tree.
See DPV pg. 129 for good info on tree properties.
A tree on n nodes has how many edges?
n - 1
See DPV pg. 129 for good info on tree properties.
If we have a connected, undirected graph G=(V,E) with |E| = |V| - 1, what type of graph is this?
A tree.
See DPV pg. 129 for good info on tree properties.
An undirected graph is a tree i.f.f. there is a unique path between any pair of nodes? (T/F). Why or why not?
True. In a tree, any two nodes can only have one path between them; for if there were two paths, the union of these paths would contain a cycle.
We can tell whether a connected, undirected graph is a tree or not just by counting how many edges it has? (T/F)
True. If it has n - 1 edges, then this must be the case.
See DPV pg. 129 for good info on tree properties.
When doing topological sort, we order the vertices by ________ _________ order number
decreasing; post
What is the main idea of Dijkstra’s algorithm?
It’s a greedy algorithm. It attempts to find the global optimum by with the aid of the local optimum. Given an input node, Dijkstra’s algorithm initially assumes an infinite distance between other nodes, slowly iterates and updates each node with the local best path and the best path found so far.
Dijkstra’s algorithm at start assumes an infinite distance between nodes? (True/False)
True
What is one of the principle advantages of the Bellman-Ford algorithm over Dijkstra’s algorithm?
Bellman-Ford can handle positive and negative edge weights, whereas Dijkstra’s can only handle positive edge weights.
What data structure is used for Dijkstra’s algorithm?
A heap (in this class we typically assume min-heap)
Bellman-Ford algorithm is slower than Djikstra’s algorithm? (T/F)
True
What is one of the key ideas of the Bellman-Ford algorithm?
The idea of “Relaxation” of edge weights. It does this for all edges in the graph at the same time for V - 1 iterations (and then it does one more iteration to detect negative cycles). This gives us a runtime complexity of O(n*m).
What is the runtime of the Bellman-Ford algorithm?
O(n*m), where n = |V| and m = |E|
When running the Bellman-Ford algorithm, if we’ve run V - 1 iterations, and then run a final V’th time and the cost DECREASES, what can we conclude?
There MUST be a negative weight cycle present.
Dijkstra’s algorithm is a _______ approach, whereas Bellman-Ford algorithm is a _______ approach?
greedy; dynamic programming
How many iterations are there in the Bellman-Ford algorithm? How many are there if we include checking for negative weight cycles?
|V| - 1, i.e. m - 1 if we don’t check for negative weight cycles. If we do check, then we simply perform one more iteration, so a total of |V| iterations.
Bellman-Ford’s algorithm can only handle positive edge weights? (T/F)
False. It can also handle negative edge weights, which is one of it’s principle advantages over Dijkstra’s algorithm.
Dijkstra’s algorithm can only handle positive edge weights? (T/F)
True
What are some algorithms for finding a path between two vertices in a graph such that the same of the edge weights are minimized (i.e. shortest path algorithms)?
- Dijkstra’s algorithm
- Bellman-Ford algorithm
3 A* Search - Floyd-Warshall algorithm
Note: I don’t think A* is mentioned in the lectures or textbook, so I don’t think it is available as a black box to us.
What type of data structure does BFS use?
Queue, i.e. FIFO, like a line at Disney World
What type of data structure does DFS use?
Stack, i.e. LIFO, like a stack of dishes
What is the SPACE complexity of BFS vs. DFS?
BFS: O(n), because we have to use a queue that ends up upper bounded by the number of leaf nodes.
DFS: O(log n), because we use a stack data structure that only has to maintain up to the HEIGHT of the tree
In general, which type of graph search is preferred, DFS or BFS? Why? What are some exceptions?
Generally DFS, because it in general has better space complexity and has some other nice properties. This doesn’t mean we should avoid BFS; but I think for this class we’re likely to see more usage of DFS, whether on its own or under the hood of other algorithms (e.g. SCC, etc).
AI/robot motion is a good example of an exception to this where we might prefer BFS over DFS. This is because BFS goes layer-by-layer, so it can be better for discovering the LOCAL ENVIRONMENT.
What are the input(s) for BFS?
- Graph G = (V, E), directed or undirected
2. Source vertex s in V
What are the output(s) for BFS?
- dist[u]: array containing distance from source s to u IF s can reach u; inf otherwise
- prev[z]: array containing the parent index of vertex z
Source: Joves’ notes, HW4 practice problems
What is the runtime of BFS?
O(n + m), i.e. linear
In BFS, the dist array contains the sum of the weights on the path from source s to destination u? (T/F)
False. dist[u] is the given as the NUMBER OF EDGES from s to u, NOT the sum of the weights. Very important to remember this.
Source: Joves’ notes
What are the input(s) to Dijkstra’s Algorithm?
- Graph G=(V, E), directed or undirected, with POSITIVE EDGE WEIGHTS c_e in E
- Source vertex s in V
What are the output(s) of Dijkstra’s Algorithm?
- dist[u]: array containing distance from source s to u IF s can reach u; inf otherwise
- prev[z]: array containing the parent index of vertex z (can be used to reconstruct the shortest path)
Source: Joves’ notes, HW4 practice problems
What is the runtime of Dijkstra’s algorithm?
O( (n + m)*log n)
BFS can be used to determine the weighted Single Source Shortest Path (SSSP)? (T/F). Why or why not?
False. Because the “shortest path” given by BFS is in terms of the NUMBER OF EDGES, and NOT the edge weights. We would need to use something like Dijkstra’s algorithm for weighted edges (assuming the edge weights are positive).
Dijkstra’s algorithm can be used for non-negative, weighted Single Source Shortest Path (SSSP)? (T/F)
True.
What is one connection between BFS and Dijkstra’s algorithm?
They can both be used for Single Source Shortest Path (SSSP) determination. However, BFS ONLY works for unweighted graphs, whereas Dijkstra’s can handle (positive) weighted edges.
If BFS and Dijkstra’s Algorithm can both be used for SSSP determination, why is the runtime for BFS O(n + m), but the runtime for Dijkstra’s Algorithm O( (n + m)*log n) ?
SSSP: Single Source Shortest Path
Because BFS can only handle SSSP for UNWEIGHTED graphs, whereas Dijkstra’s handles graphs with (positive) edge weights. If you’re doing extra work to process weights, then you can’t achieve linear O(n + m) runtime, hence the difference between the two.
What are the input(s) for DFS?
- Graph G=(V, E), directed or undirected
Source:
Joves’ Notes
What are the output(s) of DFS?
Hint: there are 4, and they are all arrays
- array prev[z]: the parent index of vertex z in the DFS visitation
- array pre[z]: the pre-order number of vertex z in the DFS visitation
- array post[z]: the post-order number of vertex z in the DFS visitation
- array ccnum[z]: the connected components number of vertex z. Can also be the strongly connected component number (SCC) IF the vertices are passed in topologically sorted, i.e. as highest post number –> lowest post number AFTER running DFS on a REVERSED DIRECTED GRAPH. For a directed graph that is processed this way, the ccnum will also be the topological sort order in reverse.
Source:
Joves’ Notes
What is the runtime of DFS?
O(n + m)
DFS can be used in an undirected graph to determine how many components there are? (T/F)
True. This would come from the max value for the ccnum
Source: Joves’ Notes
DFS can be used in an directed graph to determine how many components there are? (T/F)
False. This only works for UNDIRECTED graphs. For a directed graph, you would need to use something like SCC to do this.
Source: Joves’ Notes
How do pre/post order numbers behave on a DIRECTED graph?
They give information on how a graph COULD be explored. Some numbers might be interchangeable, others might not (see DPV 3.3(d) for a good example f this).
Source: Joves’ Notes
How do pre/post order numbers behave on an UNDIRECTED graph?
They give information on how a graph WOULD be explored GIVEN a particular starting point. A different root node might give an entirely different pre/post order numbering.
Source: Joves’ Notes
In DFS, the ccnum in a directed graph only works if you always explore from a sink vertex if all previous sink SCCs were removed? (T/F)
True. Keep in mind that a source vertex will be able to reach all connected vertices and will thus be more or less useless.
TODO: Need to review the SCC section again, as this still isn’t totally clear to me.
A DAG will only have one valid topological ordering? (T/F).
False. See DPV 3.3(d) for a good example of why this is the case. TL;DR version: post order numbering on a DIRECTED graph provides information on how a graph COULD be explored; it might admit multiple valid topological orderings.
A DAG will have at least one valid topological ordering? (T/F)
True.
A DAG will have at least one valid topological ordering? (T/F)
True.
What are the input(s) to the Topological Sort Algorithm?
- DAG
Source: Joves’ Notes
What are the output(s) of the Topological Sort Algorithm?
- array topo[i]: the vertex number of the i’th vertex in topological order from left to right, source to sink, (i.e. in DESCENDING post order number)
Source: Joves’ Notes
What is the runtime of the Topological Sort Algorithm?
O(n + m)
Source: Joves’ Notes
How does the Topological Sort Algorithm work?
It works by running DFS on a DAG and using the post order numbers to sort the vertices from highest to lowest.
What order will post order numbers be returned in from the Topological Sort algorithm?
In terms of decreasing post order number, i.e. from Source to Sink, which is equivalent to having all edges from left to right in the DAG.
The topological ordering comes along as a free ride when using the DFS algorithm? Does it need to be manually sorted or not if doing it this way?
True. No, it doesn’t require manually sorting. Do an example by hand to see that this is true. (This is also part of why we can achieve an O(n + m) runtime, because if we had to sort that would be a minimum of O(nlogn) )
What are the input(s) to the Explore Algorithm?
- Graph G = (V, E), directed or undirected
- vertex v in V to explore from
Source: Joves’ Notes
What are the outputs(s) of the Explore Algorithm?
- array visited[u] is set to “True” for all nodes u reachable from v
- Any other data structure that DFS has, if needed
Source: Joves’ Notes
What is the runtime of the Explore Algorithm?
- O(m) if run as part of DFS
- O(n + m) if run by itself and visited[] array needs to be created
Source: Joves’ Notes
The runtime of the Explore Algorithm is O(n + m)?
True, although not a comprehensive answer. This is True IF explore is run by itself which requires the visited[] array to be created. If run as part of DFS, then explore is O(m)
All of the outputs from DFS are available to the Explore algorithm as well? (T/F)
True. (The ccnum, prev, pre, and post, are all actually set in this subroutine; see DPV 3.2.2)
When might it be better to use Explore as our black-box algorithm rather than the full DFS algorithm?
If we only need information about a single source vertex, then it’s going to be better to use explore since it provides us with all the same outputs as DFS, but can isolate to only the source vertex of interest instead of running over all vertices)
The explore algorithm runs on all of the vertices and edges that are reachable from the input vertex v? (T/F)
True.
Source: Joves’ Notes
What is the runtime of the Floyd-Warshall algorithm?
O(n^3)
Compare Dijkstra vs. Bellman Ford vs. Floyd-Warshall Algorithm. When might we want to use each one? What are some of the pros and cons?
All of these are examples of algorithms for FINDING SHORTEST PATH(s)
- Dijkstra: simple algorithm, and the fastest option at O( (n + m)*log(n)); however, can only use it IF all the edge weights are positive)
- Bellman-Ford: Good to use if we have negative edge weights and are only interested in the Single Source Shortest Path (SSSP), which gives us a runtime of O(nm).
- Floyd-Warshall: Good to use if we have negative edge weights and are interested in All Pair Shortest Path (APSP); the tradeoff is that this is the slowest order at O(n^3)
Joves’ notes also mention that while Bellman-Ford and Floyd-Warshall are both algorithms available to us as black-boxes, they are technically DP solutions, so think twice about whether using them is really warranted before you go off writing a solution using them).
Source: Joves’ Notes
What are the input(s) to the Strongly Connected Components (SCC) algorithm?
- Graph G=(V, E), directed. (Crucial to remember that SCC is for DIRECTED graphs!!!)
Source: Joves’ Notes
What are the output(s) of the Strongly Connected Components (SCC) algorithm?
- G_scc = (V_scc, E_scc), where G_scc is a meta graph (a DAG) with each SCC in G forming a vertex in V_scc and each edge between SCCs in E_scc
- ccnum[.] comes from DFS used underneath the hood of this algorithm
- V_scc will look like 1, 2, 3, 4… which are the ccnums
- E_scc will look like (1, 2), (2, 3), (3, 4), … which are edges between V_scc
Source: Joves’ Notes
If we run SCC on some graph G, and we have two vertices u and v that we want to find which SCC u and v are in, how might we do this?
Look up ccnum[u] and ccnum[v]
Source: Joves’ Notes
Your examining some code that uses the SCC algorithm, and you notice a line with the conditional statement “if ccnum[u] == ccnum[v]: “. What is this conditional statement checking for?
It is checking to see if u and v are part of the same strongly connected component.
What is the runtime of the SCC algorithm?
O(n + m)
What algorithm is used under the hood of SCC? How many times is it run? Why is it run multiple times?
DFS. Twice. Because we have to run on both G and the reverse of it G_r
Every directed graph is a DAG of its strongly connected components? (T/F)
True.
See property on DPV 3.4.1, pg. 92.
What is the proper conception of the idea of connectivity on a DIRECTED graph?
Two nodes u and v of a directed graph are connected if there is a path from u to v AND a path from v to u.
See DPV 3.4.1, pg. 91.
Describe how the SCC algorithm works?
- Run DFS once with pre/post order numbering on reverse graph G_r
- Sort V in descending post order number (This gives sink to source)
- Run DFS again on G with V sorted
- Output will have ccnum representing SCC with higest = source, lowest = sink
- Use the ccnum to gather up vertices belonging to each SCC
- Check all the original edges and their endpoints. If the endpoints are in different SCCs, and a corresponding E_scc between those SCCs does not already exist, we add a E_scc to represent the edge from one SCC to another.
Source: Joves’ Notes
In the SCC algorithm, the output with the highest ccnum is a ______ and the lowest is a ______?
Source; Sink
What are the input(s) to the 2SAT algorithm?
- Conjunctive normal form (CNF) formula f, which contains n variables (x1, x2, … x_n), m clauses (x1 v x2) ^ (x3 v !x4)…
Each clause has at most two literals (and for this course, we can assume that it is EXACTLY two literals).
Source: Joves’ Notes