L18-19 (Prim's, Set Cover, Dynamic Programming) Flashcards

1
Q

Define Prim’s algorithm

A

Prim’s algorithm is a greedy algorithm for finding the minimum spanning tree (MST) of an undirected, connected, and weighted graph.

The algorithm starts with an arbitrary vertex and iteratively adds the lightest edge that connects the MST to the remaining vertices, ensuring that no cycles are formed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What makes Prim’s algorithm a greedy algorithm?

A

At each step, prim’s algorithm selects the lightest edge that connects the MST to the remaining vertices.

This choice is always part of an optimal solution (i.e., the MST) because it adheres to the cut property, ensuring that the algorithm converges to the global minimum.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What’s the cut property in MSTs? Why does it make the greedy MST algorithms correct?

A

The cut property states that given a cut (a partition of the graph’s vertices into two non-empty sets), the lightest edge crossing the cut (connecting the two sets) must be part of the MST.

This property ensures that the greedy choice of selecting the lightest edge at each step is always part of an optimal solution. The cut property is the foundation for the correctness of greedy MST algorithms, such as Kruskal’s and Prim’s algorithms.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How does Prim’s algorithm use a priority queue? How is it updated?

A

Prim’s algorithm uses a priority queue to efficiently select the lightest edge connecting the MST to the remaining vertices.

The priority queue is initialized with all vertices (except the starting vertex) with their keys set to infinity and the starting vertex key set to 0.

During the algorithm, the priority queue stores the vertices not yet in the MST, with their keys being the minimum edge weight connecting them to the MST.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Give detailed algorithm steps on how prim’s algorithm runs?

A
  1. Initialize an empty set for the MST, a set for visited vertices, and a priority queue with all vertices (except the starting vertex) with their keys set to infinity and the starting vertex key set to 0.
  2. While the priority queue is not empty, perform the following steps:
    a. Select the vertex (u) with the smallest key that has not been visited, mark it as visited, and add the corresponding edge (if any) to the MST.
    b. For each neighbor (v) of u, if v is not visited AND the weight of the edge (u, v) is less than the current key of v, update the key of v in the priority queue.
  3. Repeat step 2 until all vertices are visited and the MST contains exactly |V| - 1 edges, where |V| is the number of vertices in the graph.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Give time complexity analysis for Prim’s

A

The time complexity of Prim’s algorithm depends on the data structure used for the priority queue:

Array: O(V^2)
Binary Heap: O((V+E) log V)
Fibonacci Heap: O(V Log V + E)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are Horn formulas

A

Formally: Horn formulas are conjunctions of horn clauses

A Horn formula is a propositional formula in conjunctive normal form (CNF), where each clause contains at most one positive literal. Horn formulas are significant in the context of logic programming and automated theorem proving, as they can be efficiently checked for satisfiability using linear-time algorithms.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is the satisfiability problem for Horn formulas?

A

Given a horn formula, can I assign true/false to each variable such that all clauses resolve to true?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

In Horn formulas, there are two types of clauses. What are they?

A
  1. Implications (A => B)
  2. Pure negative clauses which is an “or” of any number of negative literals
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

For Horn formulas, the Unit propagation algorithm is used to solve the logical satisfiability problem in O(n) time.

Describe this step by step

A
  1. Set all variables to false
  2. While there is an implication clause that is not satisfied:
    a. set the right-hand variable of the implication to true (only if the formula forces it to be true)
  3. If all pure negative clauses are satisfied, then return the satisfying assignment. Otherwise return “formula unsatisfiable”
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How do we know that the unit propagation algorithm works?

A

The algorithm works because it follows the basic rules of Boolean algebra and propositional logic. By iteratively applying the rules, it simplifies the formula and helps identify whether it is satisfiable or not.

If a variable is set to true by the greedy algorithm, then it must be true in any satisfying assignment

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is the Set Cover problem?

A

The Set Cover problem involves selecting a minimum number of sets from a collection of sets, such that the union of these selected sets contains all the elements in the universal set.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Give the formal definition of the Set Cover problem

A

Given a universe U of elements and a collection S of subsets of U (S is a set of sets), find the smallest sub-collection C (subset of S) such that the union of the sets in C contains all elements in U.

In other words, whats the minimum number of sets within S that you can choose such that all elements in U are included

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Give the greedy algorithm for the set cover problem. What’s the runtime? Does this give the optimal solution?

A
  1. Initialize an empty set C for the chosen sets.
  2. While there are uncovered elements in U:
    a. Select the set S in the collection with the maximum number of uncovered elements.
    b. Add S to C and remove the covered elements from U.
  3. Return the set C as the approximate set cover.

The runtime is O(k ln n) where k is the number of sets in the solution

This gives an approximation algorithm, which is ln(n) away from the optimal (n is the number of points)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the framework for Dynamic Programming in three steps?

A
  1. Define the structure of the subproblems:
    Identify the subproblems that make up the original problem. Break the problem down into smaller, overlapping subproblems that can be solved independently. A table, matrix, array, etc can be used to store solutions to the subproblems
  2. Find the recurrence relation:
    Determine the relationship between the subproblems and how the solution to the larger problem depends on the solutions to the smaller subproblems.
  3. Determine the order of computation and solve the problem:
    This is typically done by either a bottom-up (iterative) approach or a top-down (memoization) approach. In the bottom-up approach, you start by solving the smallest subproblems and iteratively build up the solution to the larger problem. In the top-down approach, you start with the original problem and recursively break it down into smaller subproblems, storing the solutions to these subproblems to avoid redundant computation.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Give a Dynamic Programming Algorithm to solve the single-source shortest paths algorithm on a weighted DAG.

What’s the time complexity of this algorithm?

A
  1. Perform a topological sort on the DAG to order the vertices in a linear sequence such that for every directed edge (u, v), vertex u comes before vertex v in the sequence.
  2. Initialize a mapping “dist” for each vertex, of size |V| (number of vertices). Set all vertices’ value to infinity, except for the source vertex, which is initialized with 0. The dist mapping will store the shortest distances from the source vertex to all other vertices.
  3. For each vertex u in the topologically sorted sequence (starts from source node), perform the following steps:
    a. For each adjacent vertex v of u (i.e., for each edge (u, v)):
    i. Calculate the distance from the source vertex to v through u: new_dist = dist[u] + weight(u, v), where weight(u, v) is the weight of the edge (u, v).
    ii. Update the shortest distance for vertex v if new_dist is shorter than the current shortest distance: dist[v] = min(dist[v], new_dist).
  4. The mapping dist now contains the shortest path distances from the source vertex to all other vertices in the DAG.

Time complexity is O(V + E)

17
Q

For shortest paths in a weighted DAG, what’s the difference between the Greedy algorithm and the DP algorithm?

Do they both produce optimal solutions?

A

Both algorithms begin by topologically sorting the vertices based on the edges.

For each iteration in which the distance is updated:
- Greedy algorithm updates the shortest path by choosing the adjacent vertex with the minimum current distance
- DP algorithm considers all adjacent vertices in the iteration

The greedy algorithm is not guaranteed to produce the correct shortest path in a DAG whereas the DP algorithm is.

18
Q

The Longest Increasing Subsequence (LIS) problem involves finding the length of the longest subsequence of a given sequence, such that all elements of the subsequence are sorted in increasing order.

What’s the DP algorithm for this? Runtime?

A
  1. Initialize the dp array with all elements set to 1, as each element itself is an increasing subsequence of length 1.
  2. For each element seq[i] in the input sequence (from index 1 to n-1), perform the following steps:
    a. For each previous element seq[j] (from index 0 to i-1):
    i. If seq[i] > seq[j], it means that seq[i] can be appended to the increasing subsequence ending at seq[j]. Update the dp array: dp[i] = max(dp[i], dp[j] + 1).
  3. The length of the longest increasing subsequence is the maximum value in the dp array.

Runtime is O(n^2)

19
Q

The Longest Increasing Subsequence (LIS) problem involves finding the length of the longest subsequence of a given sequence, such that all elements of the subsequence are sorted in increasing order.

What’s the Topological-sort algorithm for this? Runtime?

A
  1. Create a DAG with vertices representing the elements of the input sequence. Add directed edges between the vertices (i, j) if seq[i] < seq[j] and i < j.
  2. Perform a topological sort on the DAG to obtain a linear ordering of the vertices. This will give an array of the elements ordered by their dependencies (comes before, and is less than)
  3. Initialize an array dp of the same length as the input sequence. dp[i] represents the length of the longest increasing subsequence that ends with the element at index i.
  4. For each vertex u in the topologically sorted sequence, perform the following steps:
    a. For each adjacent vertex v of u (i.e., for each edge (u, v)):
    i. Update the dp array: dp[v] = max(dp[v], dp[u] + 1).
  5. The length of the longest increasing subsequence is the maximum value in the dp array.

Runtime is O(n^2)

20
Q

The Edit Distance problem involves finding the minimum number of edit operations required to transform one string into another. In other words, given strings A and B, whats the minimum number of edit operations to get from A to B. The edit operations considered are insertion, deletion, and substitution of characters.

Give a DP algorithm which solves this problem. Whats the runtime of this algorithm?

A
  1. Initialize the dp table with dimensions (m+1) x (n+1), where m and n are the lengths of the input strings str1 and str2, respectively.
  2. Set dp[0][j] = j for all j (0 <= j <= n), as transforming an empty string into a string of length j requires j insertions.
  3. Set dp[i][0] = i for all i (0 <= i <= m), as transforming a string of length i into an empty string requires i deletions.
  4. For each character str1[i] (1 <= i <= m) and str2[j] (1 <= j <= n), perform the following steps:
    a. If str1[i-1] == str2[j-1], the characters match, so no edit operation is required: dp[i][j] = dp[i-1][j-1].
    b. If str1[i-1] != str2[j-1], choose the minimum of the three possible edit operations:
    i. Insertion: dp[i][j-1] + 1.
    ii. Deletion: dp[i-1][j] + 1.
    iii. Substitution: dp[i-1][j-1] + 1.
    Set dp[i][j] to the minimum value.
  5. The edit distance between the two input strings is the value in the bottom-right cell of the dp table: dp[m][n].

The runtime of this algorithm is O(m*n) where m and n are the lengths of the input strings.

21
Q

What’s the Knapsack problem?

A

The Knapsack problem is a classic optimization problem in combinatorial optimization. Given a set of items, each with a weight and a value, the objective is to determine the maximum value that can be obtained by selecting a subset of items such that their total weight does not exceed a given weight limit (the capacity of the knapsack).

22
Q

Give the algorithm to solve the knapsack problem?

What’s the runtime?

A
  1. Initialize the dp table with dimensions (n+1) x (W+1), where n is the number of items and W is the weight limit of the knapsack.
  2. Set dp[i][0] = 0 for all i (0 <= i <= n), as no value can be obtained with a knapsack of capacity 0.
  3. Set dp[0][w] = 0 for all w (0 <= w <= W), as no value can be obtained without any items.
  4. For each item i (1 <= i <= n) and each capacity w (1 <= w <= W), perform the following steps:
    a. If the weight of the item weights[i-1] is less than or equal to the capacity w, calculate the value obtained by including the item in the knapsack: dp[i-1][w-weights[i-1]] + values[i-1].
    b. Calculate the value obtained by not including the item in the knapsack: dp[i-1][w].
    c. Set dp[i][w] to the maximum of the two possible values: dp[i][w] = max(include_item, exclude_item).
  5. The maximum value obtainable with the given weight limit is the value in the bottom-right cell of the dp table: dp[n][W].

Runtime is O(n*W) where n is the number of items and W is the weight limit of the knapsack