Lecture 15-17 - Dynamic Programming Flashcards
Name three key differences between Greedy and Dynamic Programming paradigms
Greedy:
o Build up a solution incrementally.
o Iteratively decompose and reduce the size of the problem.
o Top-down approach.
Dynamic programming:
o Solve all possible sub-problems.
o Assemble them to build up solutions to larger problems.
o Bottom-up approach.
Define the optimal sub-structure mathematically
Let Sij = subset of activities in S that start after ai finishes and finish before aj starts.
Sij = { ak ∈ S : fi ≤ sk < fk ≤ sj }
• Aij = optimal solution to Sij
• If ak ∈ Aij, then Aij = Aik ∪ { ak } ∪ Akj
How many sub-problems, and choices to consider, are there in the activity selection problem before and after Greedy choice?
Before theorem
Pick the best m such that Aij = Aim U { am } U Amj
Subproblems: 2
Choices: j-i-1
After theorem:
Choose am∈Sij with the earliest finish time (greedy choice)
Sub-problems: 1
Choices to consider: 1
Define the Greedy choice theorem:
Theorem:
Let Sij ≠ ∅, and let am be the activity in Sij with the
earliest finish time: fm = min{ fk : ak ∈Sij}. Then:
1. am is used in some maximum-size subset of
mutually compatible activities of Sij.
2. Sim = ∅, so that choosing am leaves Smj as the only
nonempty subproblem.
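The greedy choice above can be sketched in a few lines of Python. This is only an illustration: the function name `select_activities` and the (start, finish) tuple layout are assumptions, not part of the lecture. Scanning in order of finish time means the first compatible activity found is always the one with the earliest finish time.

```python
def select_activities(activities):
    """Greedy activity selection on a list of (start, finish) pairs.

    Repeatedly takes the compatible activity with the earliest finish time.
    """
    chosen = []
    last_finish = float("-inf")
    # Sorting by finish time turns "pick the earliest-finishing compatible
    # activity" into a single left-to-right scan.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:          # compatible: f_prev <= s_k
            chosen.append((start, finish))
            last_finish = finish
    return chosen


example = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
           (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(select_activities(example))
```

Each step leaves a single remaining sub-problem (the activities that start after the one just chosen), which is why one scan is enough.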
What is the input and output in weighted interval scheduling?
Is the greedy choice always effective in this problem?
Input: Set S of n activities, a1, a2, …, an.
– si = start time of activity i.
– fi = finish time of activity i.
– wi = weight of activity i
• Output: find maximum weight subset of mutually compatible activities.
Greedy choice isn’t always effective.
Define Binary choice mathematically in terms of Opt(j) and p(j).
OPT(j) = value of the optimal solution to the problem restricted to jobs 1, 2, …, j (sorted by finish time)
p(j) = largest index i < j such that activity/job i is
compatible with activity/job j.
OPT(j) =
0 if j = 0
max {wj + OPT(p(j)), OPT(j-1)} otherwise
Define memoization.
Memoization: Cache results of each subproblem; lookup as needed.
for j = 1 to n
    M[j] ← empty.
M[0] ← 0.

M-Compute-Opt(j)
    if M[j] is empty
        M[j] ← max(v[j] + M-Compute-Opt(p[j]), M-Compute-Opt(j–1)).
    return M[j].
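A possible Python version of the memoized recursion, as a sketch only. The function name, the (start, finish, value) tuple layout and the use of `bisect` to compute p(j) are assumptions made for this example; the flashcard itself only specifies the M-Compute-Opt recursion.

```python
import bisect
import sys


def weighted_interval_memo(jobs):
    """jobs: list of (start, finish, value).

    Returns the maximum total value of mutually compatible jobs,
    using the memoized binary-choice recursion.
    """
    jobs = sorted(jobs, key=lambda job: job[1])       # sort by finish time
    n = len(jobs)
    finishes = [finish for _, finish, _ in jobs]

    # p[j] = largest 1-based index i < j with f_i <= s_j (0 if none).
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        p[j] = bisect.bisect_right(finishes, jobs[j - 1][0], 0, j - 1)

    memo = {0: 0}                                     # M[0] = 0

    def compute_opt(j):
        if j not in memo:                             # "M[j] is empty"
            value = jobs[j - 1][2]
            memo[j] = max(value + compute_opt(p[j]), compute_opt(j - 1))
        return memo[j]

    # The recursion can nest about n deep, so raise the limit if needed.
    sys.setrecursionlimit(max(sys.getrecursionlimit(), 2 * n + 100))
    return compute_opt(n)


print(weighted_interval_memo([(0, 3, 3), (1, 5, 4), (4, 6, 2), (5, 8, 5)]))
```

In the sample call, the best choice is the jobs (1, 5) and (5, 8) with total value 4 + 5.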
Prove that the memoized version of Binary choice takes O(nlogn) time.
Sort by finish time: O(nlogn)
Computing p(): O(nlogn) via sorting by start time
M-Compute-opt(j): O(n)
each invocation takes O(1) time and either:
1. returns existing M[j]
2. fills in one new entry M[j] and makes two recursive calls (at most 2n recursive calls)
Remark: O(n) if jobs are presorted by start and finish times.
What’s the main idea of dynamic programming (in words)? How is this used in the Bottom-up algorithm?
Solve the sub-problems in an order that makes sure when you need an answer, it’s already been computed.
When we compute M[j], we only need values M[k] for k < j
BOTTOM-UP (n;s1,…,sn;f1,…,fn;v1,…,vn)
Sort jobs by finish time so that f1≤f2≤…≤fn.
Compute p(1), p(2), …, p(n).
M[0]←0
for j = 1 TO n
M[j] ← max { vj + M[p(j)], M[j–1] }
How many recursive calls are there in the Find-Solution algorithm?
# of recursive calls ≤ n ⇒ O(n).
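The bottom-up table and the Find-Solution walk can be sketched together as follows. The names and the tuple layout are assumptions for illustration, and Find-Solution is written as a loop rather than a recursion; it visits each index at most once, which matches the O(n) bound.

```python
import bisect


def weighted_interval_schedule(jobs):
    """jobs: list of (start, finish, value).

    Returns (best_value, chosen_jobs) using the bottom-up table M[0..n]
    followed by a Find-Solution style walk back through the table.
    """
    jobs = sorted(jobs, key=lambda job: job[1])       # f1 <= f2 <= ... <= fn
    n = len(jobs)
    finishes = [finish for _, finish, _ in jobs]

    p = [0] * (n + 1)
    for j in range(1, n + 1):
        p[j] = bisect.bisect_right(finishes, jobs[j - 1][0], 0, j - 1)

    M = [0] * (n + 1)                                 # M[0] = 0
    for j in range(1, n + 1):
        # Only M[k] with k < j is needed, and it is already filled in.
        M[j] = max(jobs[j - 1][2] + M[p[j]], M[j - 1])

    chosen = []
    j = n
    while j > 0:
        if jobs[j - 1][2] + M[p[j]] > M[j - 1]:       # job j is in the solution
            chosen.append(jobs[j - 1])
            j = p[j]
        else:                                         # job j is skipped
            j -= 1
    chosen.reverse()
    return M[n], chosen


best, picked = weighted_interval_schedule(
    [(0, 3, 3), (1, 5, 4), (4, 6, 2), (5, 8, 5)])
print(best, picked)
```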
Do you remember how the reconstruction works (table example Lec 15)
Yes
Define the shortest path u to v in terms of weight w().
δ(u, v) = min { w(p) : p is a path from u to v } if a path from u to v exists
inf otherwise
What type of queues does Dijkstra’s algorithm use?
Are negative-weight edges allowed?
What type of keys does each node hold?
Is it dp or greedy choice?
Why is re-insertion in queue not a good idea to deal with negative weight edges? Why is adding a constant also not a good idea? give an example.
priority queue.
No negative weighted edges.
Keys are shortest-path weights (d[v])
Greedy.
Reinsertion -> exponential running time
Constant -> doesn’t always work, see lect 16
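A small sketch of why adding a constant fails. The graph below is a made-up example (not the one from lecture 16), and `dijkstra` is a minimal heap-based version written for this card. Adding a constant c to every edge penalises a path with k edges by k·c, so paths with more edges become relatively more expensive.

```python
import heapq


def dijkstra(graph, source):
    """graph: dict node -> list of (neighbour, weight), all weights >= 0.

    Returns a dict of shortest-path weights d[v] from source.
    """
    dist = {source: 0}
    queue = [(0, source)]                 # priority queue keyed by d[v]
    done = set()
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue                      # stale queue entry
        done.add(u)
        for v, w in graph.get(u, []):
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(queue, (dist[v], v))
    return dist


# Hypothetical graph with one negative edge.
original = {"s": [("a", 1), ("t", 1)], "a": [("b", -2)], "b": [("t", 1)]}
three_edge_path = 1 + (-2) + 1            # s -> a -> b -> t, the true shortest
direct_edge = 1                           # s -> t

# Shift every weight by +2 so that all edges become non-negative.
shifted = {u: [(v, w + 2) for v, w in edges] for u, edges in original.items()}
shifted_dist = dijkstra(shifted, "s")
print(three_edge_path, direct_edge, shifted_dist["t"])
```

In the shifted graph the direct edge costs 3 while the three-edge path costs 6, so Dijkstra prefers the direct edge, even though the three-edge path is shorter in the original graph.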
How is the Bellman-Ford algorithm different from Dijkstra’s?
It allows negative-weight edges.
How does Bellman-Ford detect negative weight cycles?
If Bellman-Ford has not converged after |V(G)| - 1
iterations, then there cannot be a shortest path tree,
so there must be a negative weight cycle.
Returns TRUE if no negative-weight cycles
reachable from s, FALSE otherwise.
What is the time complexity of Bellman-Ford? Is it larger than Dijkstra’s?
O(VE)
Yes, because we relax edges much more often than in Dijkstra’s.
Express Bellman-Ford in terms of dynamic programming where d(i, j) = cost of the shortest path from s to i that uses at most j hops.
d(i,j) =
0 if (i = s) and (j = 0)
inf if (i ≠ s) and (j = 0)
min( min{ d(k, j–1) + w(k, i) : i ∈ Adj(k) }, d(i, j–1) ) if j > 0
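The recurrence can be turned into a table-filling sketch as below. The function name and the (k, i, w) edge-list layout are assumptions for this example. Row j of the table holds d(i, j) for every node i, and one extra pass over the edges checks for a reachable negative-weight cycle.

```python
def bellman_ford_table(nodes, edges, source):
    """edges: list of (k, i, w), meaning an edge k -> i of weight w.

    Returns (table, ok): table[j][i] = d(i, j) for j = 0 .. |V|-1, and
    ok is False if a negative-weight cycle is reachable from source.
    """
    inf = float("inf")
    prev = {v: inf for v in nodes}
    prev[source] = 0                       # d(s, 0) = 0, d(i, 0) = inf
    table = [dict(prev)]
    for _ in range(len(nodes) - 1):
        cur = dict(prev)                   # the d(i, j-1) option
        for k, i, w in edges:
            if prev[k] + w < cur[i]:       # the d(k, j-1) + w(k, i) option
                cur[i] = prev[k] + w
        table.append(cur)
        prev = cur
    # If any edge can still be relaxed, the values have not converged.
    ok = all(prev[k] + w >= prev[i] for k, i, w in edges)
    return table, ok


table, ok = bellman_ford_table(
    ["s", "a", "b"], [("s", "a", 4), ("s", "b", 5), ("b", "a", -3)], "s")
print(table, ok)
```

In the sample graph, node a costs 4 with one hop and 2 with two hops (via b), which is the kind of improvement each additional row can record.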
Why do we loop |V(G)| - 1 times in Bellman-Ford?
To explore all potential paths of all potential lengths: a shortest path without cycles uses at most |V(G)| - 1 edges.
Is any greedy algorithm optimal for the knapsack problem?
No, none of them are.
What variable did we introduce in the knapsack problem to make it work?
The weight limit w (the remaining capacity), which is updated every time an item is selected.
Define the knapsack problem mathematically using OPT(i,w).
OPT(i,w) =
0 if i = 0
OPT (i-1, w) if wi > w
max { OPT(i-1, w), vi + OPT (i-1, w - wi) } otherwise
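A minimal Python sketch of this recurrence, assuming items are given as (value, weight) pairs with positive integer weights. The function name and the sample items are illustrative only.

```python
def knapsack(items, capacity):
    """items: list of (value, weight) with positive integer weights.

    Returns OPT(n, W), filling the table one item (row) at a time.
    """
    n = len(items)
    opt = [[0] * (capacity + 1) for _ in range(n + 1)]    # OPT(0, w) = 0
    for i in range(1, n + 1):
        value, weight = items[i - 1]
        for w in range(capacity + 1):
            if weight > w:                                # item i does not fit
                opt[i][w] = opt[i - 1][w]
            else:                                         # skip it or take it
                opt[i][w] = max(opt[i - 1][w],
                                value + opt[i - 1][w - weight])
    return opt[n][capacity]


sample = [(1, 1), (6, 2), (18, 5), (22, 6), (28, 7)]
print(knapsack(sample, 11))
```

With these sample items and a limit of 11, the optimum takes the items of weight 5 and 6 (value 18 + 22), whereas picking by value/weight ratio would start with the item of weight 7 and end up with less.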
What’s the time complexity of the knapsack problem with dp?
Θ(n * W).
For each item i = 1 to n, we iterate through w = 1 to W.
Do you remember how to do the bellman ford table?
Yes, lec 16 p 27.
Rows = iterations
Columns = node weights
every iteration, we relax all edges and enter the shortest path cost in cell.
Do you remember how to do the knapsack problem table?
Yes, lec 16, p42.
Rows = bag with increasing items
Column = weight limit 1 -> W
Define Pairwise Sequence alignment.
Let a=a1…am and b=b1…bn be two sequences over an alphabet Σ (i.e. a, b ∈Σ*). A pairwise alignment is a mapping f of the letters of a to b, such that if f(ai,bj) and f(ak,bl) then i < k implies j < l (the mapping preserves the order of the letters).
In pairwise sequence alignment, define: Match, Substitution, Insertion, Deletion.
Match: letters are identical
Substitution: letters are different
Insertion: a letter of b is mapped to the empty character
Deletion: a letter of a is mapped to the empty character
Let a=a1…am and b=b1…bn be two sequences over an alphabet Σ.
What are the three possibilities for counting alignment of:
(a1 … am)
(b1 … bn)
How can we rewrite this in terms of c(m,n)?
- (a1 … am-1) (am)
  (b1 … bn  ) ( _ )
- (a1 … am-1) (am)
  (b1 … bn-1) (bn)
- (a1 … am  ) ( _ )
  (b1 … bn-1) (bn)
c(m,n) = c(m-1,n) + c(m-1,n-1) + c(m,n-1)
Given c(m,n) = c(m-1,n) + c(m-1,n-1) + c(m,n-1) and initialization c(0,n)=c(m,0)=c(0,0)=1, what’s the recursive evaluation value of c(2,2)
13
Give a dp algorithm to compute all indices of c(m,n) = c(m-1,n) + c(m-1,n-1) + c(m,n-1).
What’s the complexity of this approach?
for i=0 to m do c(i,0) = 1
for j=0 to n do c(0,j) = 1
for i=1 to m do{
for j=1 to n do
c(i,j) = c(i-1,j)+c(i-1,j-1)+c(i,j-1)
}
Complexity: O(mn) time and O(mn) space.
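The same table can be filled in Python as a quick check of the recursive evaluation. The function name `count_alignments` is made up for this sketch.

```python
def count_alignments(m, n):
    """Number of alignments c(m, n) of sequences of length m and n."""
    # Border cells c(i, 0) = c(0, j) = 1 are set by the initial fill.
    c = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c[i][j] = c[i - 1][j] + c[i - 1][j - 1] + c[i][j - 1]
    return c[m][n]


print(count_alignments(2, 2))
```

Each of the (m+1)(n+1) cells is computed once in constant time, which is where the O(mn) bound comes from.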
What are the three remarks we learned about the knapsack problem?
- The DP running time is pseudo-polynomial: it is polynomial in n and W, but not in the input size.
- Decision version of knapsack problem is NP-Complete.
- There exists a poly-time algorithm that produces a feasible solution that has value within 1% of optimum.
Define Levenshtein distance.
Calculate it for the following:
ABB_CEE
_BBCCDE
The Levenshtein Distance between two words/sequences is the
minimal number of substitutions, insertions and deletions to transform one into the other
1 deletion + 1 insertion + 1 substitution ⟹ d=3
Define Edit Cost and Edit Distance.
What is the application of these?
Edit Cost
Let δ(x,y) be a cost function for each edit operation (match,
substitution, insertion, deletion). The edit cost of two
words/sequences is the sum of the cost of each edit operation used to transform one into the other.
Edit Distance
The edit distance between two words/sequences is the minimal cost to transform one string into another. Generalization of the
Levenshtein distance.
Applications: Maximize the edit cost if higher values represent a similarity, minimize if you use an edit distance.
Calculate the edit distance of
ABB_CEE
_BBCCDE
for d(x,y) = 0 if x = y, 1 otherwise
Do it again for the score s(x,y) =
1 if x = y
-1 if x ≠ y and x ≠ _ and y ≠ _
-2 if x = _ or y = _
4 match+1 deletion+1 insertion+1 substitution
⟹ d = 4 * (0) + 1 * (+1) + 1 * (+1) + 1 * (+1) = 3
4 match+1 deletion+1 insertion+1 substitution
⟹ s = 4 * (1) + 1 * (-2) + 1 * (-2) + 1 * (-1) = -1
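Both calculations can be reproduced by summing a cost function over the columns of the alignment. This is a sketch: the function names `unit_cost`, `similarity` and `alignment_cost` are invented for this card, and `_` is assumed to be the gap symbol as in the example.

```python
GAP = "_"


def unit_cost(x, y):
    """d(x, y) = 0 if x = y, 1 otherwise (gaps count as mismatches)."""
    return 0 if x == y else 1


def similarity(x, y):
    """Score: +1 for a match, -1 for a substitution, -2 for a gap."""
    if x == GAP or y == GAP:
        return -2
    return 1 if x == y else -1


def alignment_cost(top, bottom, delta):
    """Sum delta over the columns of an alignment given as two gapped rows."""
    if len(top) != len(bottom):
        raise ValueError("both rows of an alignment must have equal length")
    return sum(delta(x, y) for x, y in zip(top, bottom))


print(alignment_cost("ABB_CEE", "_BBCCDE", unit_cost))
print(alignment_cost("ABB_CEE", "_BBCCDE", similarity))
```

Note that this only evaluates one given alignment; finding the best alignment is what the dynamic program is for.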
What assumptions are we holding for Edit distance to be a metric?
• Every edit operation has positive cost
• for every operation, there is an inverse operation
with equal cost
Prove the following:
“A sub-alignment of an optimal alignment w.r.t. the edit cost/distance is also optimal”
Proof: cut-and-paste argument & contradiction
• Let A be an optimal alignment
• Let A = A1A2A3 be a decomposition of A such that A2 is not optimal.
• Let A’2 be an optimal alignment of the substrings in A2
• Substitute A2 by A’2 to build a new alignment A’
• δ(A’) = δ(A1A’2A3) = δ(A1)+δ(A’2)+δ(A3) < δ(A1)+δ(A2)+δ(A3) = δ(A1A2A3) = δ(A)
• contradiction with A optimal
What are the three cases for the Problem Structure of the following:
d(i,j) = minimal cost of aligning prefix strings a1…ai and b1…bj.
Case 1 (ai matches bj) cost of matching ai with bj + min cost of aligning a1…ai-1 and b1…bj-1. d(i,j) = delta(ai,bj) + d(i-1, j-1)
Case 2a (deletion of ai) cost of deletion of ai + min cost of aligning a1…ai-1 and b1…bj. d(i,j) = delta(ai,_) + d(i-1, j)
Case 2b (insertion of bj) cost of insertion of bj + min cost of aligning a1…ai and b1…bj-1. d(i,j) = delta(_,bj) + d(i, j-1)
Define the Recursion solution to string alignment mathematically in terms of d(i,j)
d(i,j) =
j * delta(_, *) if i = 0
i * delta(*, _) if j = 0
min of the following otherwise:
  delta(ai,bj) + d(i-1, j-1)
  delta(ai,_) + d(i-1, j)
  delta(_,bj) + d(i, j-1)
What’s the strategy to solve the string alignment problem using dp?
What does a table representing this look like?
What’s the name of the algorithm that solves this?
You only need to know the solutions of smaller (sub-)
alignments to compute a new one. Fill the dynamic array
following this partial order.
We define a partial order on the (sub-)alignments Amn such that
Amn <= Am’n’ if m<=m’ and n<=n’.
See slide 20 lec 17 for this table.
Needleman-Wunsch Algorithm
for i=0 to m do
    d(i,0) = i*δ(*,-)
for j=0 to n do
    d(0,j) = j*δ(-,*)
for i=1 to m do
    for j=1 to n do
        d(i,j) = min(d(i-1,j)+δ(ai,-),
                     d(i-1,j-1)+δ(ai,bj),
                     d(i,j-1)+δ(-,bj))
return d(m,n)
Apply the Needleman-Wunsch algorithm on a = ATTG, b = CT with delta(x,y) = 0 if x = y, 1 otherwise.
Show the corresponding table and final edit cost.
How do we backtrack?
d  -  A  T  T  G
-  0  1  2  3  4
C  1  1  2  3  4
T  2  2  1  2  3
Final edit cost: 3 (bottom-right cell).
Backtracking
• Each move is associated to one edit operation
• Vertical = insertion
• Diagonal = match/substitution
• Horizontal = deletion
• We use one of these 3 moves to fill a cell of the array
• From the bottom-right corner (i.e. d(m,n)), find the move that has been used to determine the value of this cell.
• Apply this principle recursively.
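A sketch of table filling plus backtracking in Python. The function name, the `gap` parameter and the `unit` cost function are assumptions for this example. Here the table is indexed d[i][j] with i running over a and j over b, so it is the transpose of the table drawn above: a step that decreases i consumes a letter of a (deletion) and a step that decreases j consumes a letter of b (insertion).

```python
def needleman_wunsch(a, b, delta, gap="-"):
    """Returns (cost, aligned_a, aligned_b) for strings a and b.

    delta(x, y) is the cost of aligning x with y; either may be the gap.
    """
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):                    # first column: deletions only
        d[i][0] = d[i - 1][0] + delta(a[i - 1], gap)
    for j in range(1, n + 1):                    # first row: insertions only
        d[0][j] = d[0][j - 1] + delta(gap, b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + delta(a[i - 1], gap),
                          d[i - 1][j - 1] + delta(a[i - 1], b[j - 1]),
                          d[i][j - 1] + delta(gap, b[j - 1]))

    # Backtrack from d(m, n): at each cell, find a move that produced it.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and d[i][j] == d[i - 1][j - 1] + delta(a[i - 1], b[j - 1])):
            top.append(a[i - 1]); bottom.append(b[j - 1])     # match/substitution
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + delta(a[i - 1], gap):
            top.append(a[i - 1]); bottom.append(gap)          # deletion
            i -= 1
        else:
            top.append(gap); bottom.append(b[j - 1])          # insertion
            j -= 1
    return d[m][n], "".join(reversed(top)), "".join(reversed(bottom))


def unit(x, y):
    return 0 if x == y else 1


cost, row_a, row_b = needleman_wunsch("ATTG", "CT", unit)
print(cost)
print(row_a)
print(row_b)
```

When several moves tie, this sketch prefers the diagonal move, so it returns one optimal alignment among possibly many.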
Prove the following theorem:
The dynamic programming algorithm computes the edit distance (and optimal alignment) of two strings of length m and n in Θ(mn) time and Θ(mn) space.
Proof:
• The algorithm computes the edit distance by filling an (m+1) × (n+1) table with O(1) work per cell.
• We can trace back through the table to extract an optimal alignment.
Can we avoid using quadratic space to compute Needleman-Wunsch?
What do we sacrifice in order to do this?
Easy to compute the optimal value in O(mn) time and O(m+n) space.
• Compute OPT(i,•) from OPT(i-1,•).
• But it is no longer easy to recover the optimal alignment itself, only its value.
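A sketch of the space-saving idea, keeping only the previous row of the table. The function name and the `gap` parameter are assumptions for this example. Because earlier rows are discarded, there is nothing left to backtrack through, so only the value is returned.

```python
def edit_distance_value(a, b, delta, gap="-"):
    """Edit distance of a and b using two rows instead of the full table."""
    prev = [0] * (len(b) + 1)                    # row OPT(0, .)
    for j in range(1, len(b) + 1):
        prev[j] = prev[j - 1] + delta(gap, b[j - 1])
    for i in range(1, len(a) + 1):
        # Row OPT(i, .) depends only on row OPT(i-1, .) and on itself.
        cur = [prev[0] + delta(a[i - 1], gap)] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = min(prev[j] + delta(a[i - 1], gap),
                         prev[j - 1] + delta(a[i - 1], b[j - 1]),
                         cur[j - 1] + delta(gap, b[j - 1]))
        prev = cur                               # drop the older row
    return prev[len(b)]


def unit(x, y):
    return 0 if x == y else 1


print(edit_distance_value("ATTG", "CT", unit))
```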