Graphs Flashcards
Weakly Connected
- Term applies to directed graphs
- A graph that isn’t strongly connected, but would be would be connected if it were undirected
Greedy Algorithm
- An algorithm that involves making the “optimal decision” at each step
- The “optimal decision” depends on the context of problem
- Shortest distance
- Earliest time, etc.
- Ex: Dijkstra’s algorithm
- Relaxing edges is greedy
- For each edge, you compare current distance, to distance through vertex v, and choose minimum
Simple Path
- A path with no cycles
- Also, a path with no repeated vertices
Strongly Connected Graph
- Term applies to directed graphs
- There exists a path from any vertex to any other vertex
Adjacency Matrix
- An implementation of a graph using a 2D matrix edges
- Width and height of matrix is |V|
- Cells represent edges, and the indices of the cells (the rows/cols) represent vertices
- Running time is good, but space complexity is poor
- Especially useful for dense graphs or when |V|2 (size of matrix) is really small
- Cannot be used to represent a graph with parallel edges
Complexity
Time
Checking for an edge: O( 1 )
Iterating over the graph: O( v2 )
Getting all the neighbors: O( v )
Space
O(V2)
Shortest Path Problems
- Problems involving finding the shortest paths between vertices in a graph
- Refers to weighted graphs (directed or undirected)
- Variations
- Single Source Shortest Path
- All Pairs Shortest Path
- Single Destination Shortest Paths
Tree
- Connected acyclic graph
Spanning Tree
- Applies to undirected graphs
- A tree within graph that contains all vertices
- A graph can have multiple spanning trees
- The set of vertices in both graph and tree are the same
Paranthesis Theorum
- When performing DFS on a graph, the discovery and finishing times never overlap
- Consider ud, uf, and vd, vf
- Either disjoint [ud, uf], [vf, vf]
- Neither is a descendent of the other
- Or u nested in v [ud, vd, vf, uf]
- V is a descendent of U
- Or v nested in u [vd, ud, uf, vf]
- U is a descendent of V
- Either disjoint [ud, uf], [vf, vf]
Simple Graph
- a graph that does not contain loops or parallel edges
Strongly Connected Component
SCC
- strongly connected subgraph in a directed graph
Minimum Spanning Tree
- A spanning tree of a graph that has a total weight no higher than any other spanning tree in the graph
- A graph can have multiple minimum spanning trees
- Applies to weighted, undirected graphs
- A MST of an unweighted graph is just any spanning tree
Adjacency List
- An implementation of a graph using an array of linked lists of vertices
- Each element in array represents a vertex, and stores a list all edges adjacent to that vertex
- In weighted graph, nodes of list at each index also store weights (from that index to vertex)
- The list stored at each index can also (optionally) be an array, a binary tree or some other data structure
Complexity
Time
Checking for an edge: O( V ) / O ( deg(v) ) / O( lg(V) )
Iterating over all edges: O( V + E )
Getting all the neighbors: O( V )
Space
O( V + E )
* O( deg(v) ) basically means “it depends”
* O( lg(V) ) is achievable is elements are sorted, perhaps in a BST, and you use binary search
Transpose
- Reversing the direction of every edge in a directed graph
Kosaraju’s Algorithm
- Used to find strongly connected components in a directed graph
- Runs DFS twice
Steps
- Run DFS over all vertices
- Use stack to organize nodes by decreasing finish time (similar to topological sort)
- Transpose graph
- Pop vertices, run DFS on each
- Vertices reachable from this second run comprise individual scc’s (strongly connected components)
Complexity
Time
Average ⇔ Worst: O(V + E)
Space
O(V + E)
*Linear in the size of the graph
Reachable
- If a path exists from one vertex to another, it is reachable
Relaxing an edge
- Term used in shortest path algorithms
- The process of continually improving the distance estimation to some vertex through different edges
Adjacency Matrix
Directed vs Undirected Graph
- In an undirected graph, matrix will be symmetrical across diagonal
- Therefore, you should only keep one half (usually the top half)
Single Source Shortest Path
- Finding the shortest weighted path between one vertex and evey other vertex
- Or finding the shortest weighted path between one vertex to another vertex
Algorithms
- Dijkstra’s
- A*
All Pairs Shortest Paths
- Finding the shortest weighted path from every vertex to every other vertex
Algorithms
- Floyd-Warshall
Dijkstra’s Algorithm
- Single source shortest path
- Used on weighted graphs (directed or undirected)
- Graph cannot contain negative edges
Steps
- Assign every node a tentative distance-to value of infinity
- Assign every node a parent of None
- Give source distance of 0
- Add source to the priority queue
- While queue is not empty
- Remove min-distance vertex v from priority queue
- Add source to seen set
- For each neighbor u of vertex v
- If u not in seen set (if u is still in pq)
- Relax edge from v to u
- If the distance-to v + edge(v, u) is less than current distance-to u
- Update distance-to u
- Change u’s parent to v
- Add u with priority to pq (or update priority of u in pq with new distance)
- Stop upon finding destination, or when queue is empty
Complexity
Time
Average ⇔ Worst: O( |E| + |V|*log |V| )
Average ⇔ Worst: O( (|E| + |V|) * log |V| )
Space
O(V)
*First time complexity assumes fibonacci heap (constant decrease-key)
*Second time complexity assumes binary heap (log delete-min and descrease-key)
Prim’s Algorithm
- Used to find Minimum Spanning Tree
- Very similar to Dijkstra’s
- * Dijkstra PRIME: for every vertex, Priority Is My Edge
- key of priority queue is a vertex’s edge, rather than distance from source
- Basically add smallest edges one by one until tree is formed
Steps
- Initialize key (associated edge) of every vertex to infinity
- Initialize parent of every vertex to None
- Initialize key of source as 0
- Add source to pq
- While size of tree is < |V|
- Remove min element
- Add vertex and edge to MST
- For each neighbor u of v
- If u not in MST
- If edge(v, u) < key(u)
- Make v parent of u
- Update key(u) in PQ
- Keep going until PQ is empty, or until size(MST) = |V|
Complexity
Time
Average ⇔ Worst: O( |E| + |V|*log |V| )
Average ⇔ Worst: O( (|E| + |V|) * log |V| )
Space
O(V)
*First time complexity assumes fibonacci heap (constant decrease-key)
*Second time complexity assumes binary heap (log delete-min and descrease-key)
Loop
- An edge from a vertex to itself
Adjacency List
Directed vs Undirected Graphs
Undirected Graphs
- Must store edges twice
Directed graph
- When representing a digraph with an adjacency list, the lists stored at each index can be implemented in different ways
3 Options
- Store list of outgoing edges
- Store list of incoming edges
- Store both in two separate lists
Depth First Search
- Assumes an adjacency list
- Cannot be used to find the minimum path
- Applications
- Topological sort
- Finding connected components in an undirected graph
- Finding SCC in a directed graph
Steps
- Initialize all edges as undiscovered, and remove predecessors
- Loop through all of nodes, starting depth first search on each node
- For each node
- Mark that node as discovered
- Mark “time” of discovery
- Process node
- Loop throughh vertex’s adjacent nodes
- If not discovered
- Update predecessor
- Perform dfs on that node (recursively)
- After loop
- Increment time again
- Update finish time
Complexity
Time
Average ⇔ Worst: O( |V| + |E| )
Space
O(V)
*Linear in the size of the graph
In-degree / Out-degree
In-degree
- The number of incoming edges a vertex has in a directed graph
Out-degree
- The number of outgoing edges a vertex has in a directed graph
Number of Edges/Vertices (Notation)
|V| = number of vertices
|E| = number of edges
- The number of vertices in set V and number of edges in set E
A* Pathfinding
- A modification of Dijkstra’s that considers an additional heuristic
- Only used to find paths to one location
- Heuristic should be ‘admissible’ (should never overestimate the true distance) in order to return a guaranteed shortest distance
- Instead of “distance_to”, we use different terminology
- f(n) = g(n) + h(n)
- f(n) => total estimated cost
- g(n) => equivalent to distance_to n
- h(n) => heuristic from n to target
Steps
- Create visited set
- Initialize g(n) of all nodes as infinity
- Initialize f(n) of all nodes to infinity
- Initialize parents of all nodes to None
- Update g(source) to be 0
- Updated f(source) of source to be its heuristic
- Add source to pq
- While queue is not empty
- Remove vertex v from pq with lowest f(n)
- Mark as visited
- If vertex is target, return
- For neighbors u of v
- If u not in visited
- Relax edge v to u
- If g(v) + edge(v, u) < g(u)
- Change parent of u to v
- Add/update priority of u with f(u) (new cost plus heuristic)
- Keep going until goal found (or queue empty)
Complexity
Time
Average ⇔ Worst: O( |E| + |V|*log |V| )
Average ⇔ Worst: O( (|E| + |V|) * log |V| )
Space
O(V)
*First time complexity assumes fibonacci heap (constant decrease-key)
*Second time complexity assumes binary heap (log delete-min and descrease-key)
*Real complexity depends on the heuristic used, so just used Dijkastra’s as an upper bound
Adjacency Matrix
Weighted vs Unweighted Graph
Unweighted Graph
- Cells should store boolean values to indicate the presence of an edge from i to j
Weighted Graph
- Cells should store number values (integer or floats) to represent the weight
- You can store None or a dummy weight to represent a missing edge
- Common dummy weights are -inf/inf or 0
- Optionally, you can store edge objects, which can contain information beyond a simple weight
Directed Graph
- Also called a digraph
- edges are ordered pairs, (x, y)
- edges (x, y) != edge (y, x)
- Bidirectional relationships require two edges
- Ex: the internet, instragram following
Multigraph
- A graph where nodes can be connected with more than one edge
- Put another way, multiple edges can run parallel between the same vertices
Degree
- The number of edges a vertex has
- lkl
- lklk
- klkl
- klkl
- lkl
- klkl
- klkl
- klkl
- klklj
Neighbors
- Neighbors of vertex v are all of its adjacent vertices
Maximum Number of Edges
In a directed graph with loops
- v2
In a directed graph without loops
- v(v - 1)
In an undirected graph with loops
- v(v + 1) / 2
In an undirected graph without loops
- v(v - 1) / 2
Cycle Detection in an Undirected Graph
- In an undirected graph, a cycle occurs when
- There is a neighbor of v that has been visited already, but isn’t a parent of v
- V is its own neighbor (loop)
- Can use BFS or DFS, but DFS makes it easier to describe the cycle
Steps
- Use DFS
- When exploring neighbors u of v
- If u has been visited, but isn’t a parent of v
- Or u is v (loop)
- Then a cycle exists, return True
Bellman Ford
- Single source shortest paths algorithm
- Used on weighted, directed graphs
- Indicates if negative weight cycle is found (returns boolean, raises error)
- Negative weights in a undirected graph will appear as cycles
- Effective but slow
- Does not require vertices to be marked as visited
- * The Bellman Drinks VEEVV+VEEVA:
- Visit Every Edge or Vertex V times, and Visit Every Edge or Vertex Again
Steps
- Initalize distance_to all vertices to inf
- Initalize parent of all vertices to null
- Initialize distance_to source to 0
- Create a loop of length |V|-1
- In each iteration, loop through and relax edges
- Or loop through vertices and relax their adjacent edges
- You can stop outer loop if relaxation doesn’t change tree
- To detect negative weight cycles, loop through graph again
- If edge weigts change, cycle must exist
- Return False or raise error
Complexity
Time
Best: O(|V| + |E|) ♦ Average ⇔ Worst: O(|V| * |E|)
Space
O(1)
*Best case: |V| to initialize distances/parents; |E| if you stop early and only relax edges twice (stop after second iteration because no changes were made)
Cut Vertex
- A vertex whose removal would create more connected components in the graph than there were before
Handshaking Lemma
- In any graph, the sum of the degree (or the neighbors) of all the vertices is:
- 2*|E| in an undirected graph
- |E| in a directed graph
Adjacent
- Vertices are adjacent if they share an edge
Floyd–Warshall
- All-pairs shortest paths
- Works with negative and positive edges, but no negative edge weight cycles
- Returns a table of weights, cannot get back paths
- * beKause It’s June, mIJ lIKs the KaJe
Steps
- Initialize 2-d dist array, size |V|*|V|, to inf in every cell
- Loop through edges, populating dist matrix with distances from u-v
- Loop through vertices, make distance to itself 0 (along the diagonal)
- Create triple nested loop, k, i, j to V
- If dist[i][j] > dist[i][k] + dist[k][j]
- Then dist[i][j] = dist[i][k] + dist[k][j]
Complexity
Time
Average ⇔ Worst: O( |V|3 )
Space
O( |V|2 )
*Running time dominated by triple nested loop
Connected Graph
- Term applies to undirected graphs
- There a exists a path from any vertex to any other vertex
Weighted Graph
- A graph in which edges have a “weight” associated with them
- Weights can be used to represent many things, included capacity of a wire, the traffic along a route, etc.
- Meaning of weights is based on what you’re modeling
Cycle Detection in a Directed Graph
- In directed graph, a cycle occurs when
- A neighbor has been visited and is on the path
- V is its own neighbor (loop)
Steps
- Use DFS while recording path
- When exploring neighbors u of vertex v
- If u is in the path
- Or u is v (loop)
- Or u has a discovery time but doesn’t have a finish time
- Then cycle exists, return True
Directed Acyclic Graphs
- Also known as a DAG
- It’s a directed graph with no cycles
Topological Sort
- Expects a DAG
- Provides a linear ordering of vertices
- Produces a partially ordered set, elements are ordered w.r.t. some, but not all other elements
- Ordering is not necessarily unique
Steps:
- Loop through all vertices, running DFS
- For each frame of DFS
- Record start time
- After visiting all of that vertex’s neighbors (after completing for loop)
- Push to stack
- Record finishing time
- At very end, pop and print
- Or sort by finishing time in decreasing order (latest comes first)
Complexity
TimeV
Best⇔ Average ⇔ Worst: O(|V| + |E|)
Space
O(|V|)
*Linear in the size of the graph
Edges
- Represent some relationship between vertices
- Ex: Facebook edges represent friendship
Shortest Paths in a DAG
- Applies to shortest weighted path
- Special procedure that can be performed in linear time (normal graphs might require Dijkstra’s or Bellman Ford, etc.)
- Negative edge weights allowed (no cycles tho)
- Similar to edge relaxation in pathfinding algorithms
Steps
- Initialize dist_to source to 0
- Initialize dist_to all other vertices to inf
- Perform topological sort
- Loop through vertices in topological order
- Use edge relaxation at each vertex, along incoming edges
- If dist[v] + edge(v, u) < dist[u], etc.
- or distance to from s to v is minimum of distances from s to u1 - un (incoming edges to v) and minimum of edges to v
Unweighted Graph
- A graph without weights
- Unweighted graphs can be thought of a special case of weighted graph in which all edges have same weight
- Normally assume weight is “1 unit”
Incident
Term has 2 meanings
- If two edges share a vertex, they are incident
- If an edge is connected to a vertex, then they are incident
Kruskal’s Algorithm
- Used to find Minimum Spanning Tree (or forest)
- Can get individual trees from forest by finding connected components before or after algo
- Requires a Union Find data structure
Steps
- Create empty MST set
- Use make-set on each vertex
- Create list of edges
- Sort list of edges by weight
- Iterate through edges (in increasing order)
- For each edge
- If find(u) != find(v) (if they aren’t in the same set)
- Add edge to MST set
- Union u and v
- Keep going until for loop ends
Complexity
Time
Average ⇔ Worst: O(E log V)
Space
O(|V| +|E|)
*Sorting dominated running time
*O(E log E) ⇒ O(E log V) because V2 = O(E) (because you have at most V2 edges
Single Destination Shortest Paths
- Finding the shortest path from every other vertex to one vertex
- Not done very often
- Basically the same as single source
- Reverse edge directions and treat destination as source
Breadth First Search
- Usually done on a adjacency list representation
- Start with a vertex, and explore its neighbors
- Algorithm can be tweaked used to find the minimum path
- Applications
- Guaranteed shortest path (path with fewest edges between vertices)
- Spanning tree
- In unweigted graphs, all spanning trees are “minimum spanning trees”
Steps
- Create an empty set of “visited” nodes
- Create an empty queue
- Add starting vertex to queue
- Add vertex to visited set
- While queue isn’t empty
- Dequeue and process vertex v
- Loop through v’s adjacent vertices
- If adjacent vertex is not in “visited” set
- Add to visited set
- And enqueue
Complexity
Time
Average ⇔ Worst: O( |V| + |E| )
Space
O(V)
*Linear in the size of the graph
Graph:
Definition
- A set of vertices and edges
- Usually implemented with an adjacency matrix or an adjacency list
- G = (V, E)
Hypergraph
- A graph where one edge can connect more than two nodes at once
Simple Cycle
- A cycle with no smaller cycles
Undirected Graph
- A graph where edges don’t have orientation
- edges are unordered pairs, {x, y}
- edge (x, y) is the same as edge (y, x)
- Ex: facebook friendships