Lecture 15-17 - Dynamic Programming Flashcards

Question

Define Pairwise Sequence alignment.

Answer 1

Let a=a1…am and b=b1…bn be two sequences over an alphabet Σ (i.e. a, b ∈Σ*). A pairwise alignment is a mapping f of the letters of a to b, such that if f(ai,bj) and f(ak,bl) then i

Answer 2

• Match: letters are identical
• Substitution: letters are different
• Insertion: a letter of b is mapped to the empty character
• Deletion: a letter of a is mapped to the empty character

Answer 3

The last column of an alignment of a1 ... am and b1 ... bn takes one of three forms:
1. (a1 ... am-1) (am)
   (b1 ... bn  ) ( _ )
2. (a1 ... am-1) (am)
   (b1 ... bn-1) (bn)
3. (a1 ... am  ) ( _ )
   (b1 ... bn-1) (bn)
c(m,n) = c(m-1,n) + c(m-1,n-1) + c(m,n-1)

Answer 4

for i=0 to m do c(i,0) = 1
for j=0 to n do c(0,j) = 1
for i=1 to m do{ for j=1 to n do c(i,j) = c(i-1,j)+c(i-1,j-1)+c(i,j-1) }
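The counting recurrence can be sketched in Python as follows. This is a minimal illustration, not the lecture's code: the function name `count_alignments` is hypothetical, and it assumes the base case c(i,0) = c(0,j) = 1, since a sequence can be aligned against an empty sequence in exactly one way (all gaps).

```python
def count_alignments(m: int, n: int) -> int:
    """Count the alignments of two sequences of lengths m and n."""
    # Assumed base case: one alignment whenever either sequence is empty.
    c = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # The last column is (ai, _), (ai, bj) or (_, bj).
            c[i][j] = c[i - 1][j] + c[i - 1][j - 1] + c[i][j - 1]
    return c[m][n]


print(count_alignments(2, 2))  # 13
```

The count grows quickly with m and n, which is why enumerating all alignments is impractical and a dynamic program over the table is used instead.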

Answer 5

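1. The dynamic-programming algorithm is pseudo-polynomial: its running time depends on the numeric value of the capacity, so it is not polynomial in the input size.
2. The decision version of the knapsack problem is NP-complete.
3. There exists a poly-time algorithm that produces a feasible solution with value within 1% of the optimum.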
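To see where the pseudo-polynomial behaviour comes from, here is a minimal sketch of a standard 0/1 knapsack dynamic program. It assumes integer weights and an integer capacity; the function name and the sample items are hypothetical and not taken from the card. The table has one entry per capacity value, so the work is proportional to (number of items) × (capacity), and the capacity can be exponential in the number of bits used to write it down.

```python
def knapsack_max_value(weights, values, capacity):
    """Best total value of a subset of items whose total weight is <= capacity."""
    # best[w] = best value achievable with total weight at most w.
    best = [0] * (capacity + 1)
    for wt, val in zip(weights, values):
        # Go downwards so that each item is used at most once.
        for w in range(capacity, wt - 1, -1):
            best[w] = max(best[w], best[w - wt] + val)
    return best[capacity]


# Hypothetical instance: five items, capacity 11.
print(knapsack_max_value([1, 2, 5, 6, 7], [1, 6, 18, 22, 28], 11))  # 40
```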

Answer 6

The Levenshtein distance between two words/sequences is the minimal number of substitutions, insertions and deletions needed to transform one into the other.
Example: 1 deletion + 1 insertion + 1 substitution ⟹ d = 3
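The definition translates almost directly into a memoized recursion. The sketch below is illustrative only: the function name and the example words are assumptions, not from the card, and the recursive form is only suitable for short strings because of Python's recursion limit.

```python
from functools import lru_cache


def levenshtein(a: str, b: str) -> int:
    """Minimal number of substitutions, insertions and deletions turning a into b."""

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        # dist(i, j) is the distance between the prefixes a[:i] and b[:j].
        if i == 0:
            return j  # insert the first j letters of b
        if j == 0:
            return i  # delete the first i letters of a
        substitution = 0 if a[i - 1] == b[j - 1] else 1
        return min(
            dist(i - 1, j) + 1,                 # deletion
            dist(i, j - 1) + 1,                 # insertion
            dist(i - 1, j - 1) + substitution,  # match or substitution
        )

    return dist(len(a), len(b))


print(levenshtein("kitten", "sitting"))  # 3
```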

Answer 7

Edit Cost: Let δ(x,y) be a cost function for each edit operation (match, substitution, insertion, deletion). The edit cost of two words/sequences is the sum of the costs of the edit operations used to transform one into the other.
Edit Distance: The edit distance between two words/sequences is the minimal cost to transform one into the other. It generalizes the Levenshtein distance.
Applications: Maximize the edit cost if higher values represent similarity; minimize it if you use an edit distance.

Answer 8

4 matches + 1 deletion + 1 insertion + 1 substitution ⟹ d = 4 * (0) + 1 * (+1) + 1 * (+1) + 1 * (+1) = 3
4 matches + 1 deletion + 1 insertion + 1 substitution ⟹ s = 4 * (1) + 1 * (-2) + 1 * (-2) + 1 * (-1) = -1
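The two sums above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the dictionary and function names are hypothetical, and the cost values are read off the two lines of the card in the order match, deletion, insertion, substitution.

```python
from collections import Counter

# Operation counts from the card.
operations = Counter(match=4, deletion=1, insertion=1, substitution=1)

# Costs read off the first line (a distance, to be minimized).
distance_costs = {"match": 0, "deletion": 1, "insertion": 1, "substitution": 1}
# Scores read off the second line (a similarity, to be maximized).
similarity_scores = {"match": 1, "deletion": -2, "insertion": -2, "substitution": -1}


def edit_cost(ops, costs):
    """Sum the cost of every edit operation used."""
    return sum(count * costs[op] for op, count in ops.items())


d = edit_cost(operations, distance_costs)
s = edit_cost(operations, similarity_scores)
print(d, s)  # 3 -1
```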

Answer 9

• Every edit operation has positive cost
• For every operation, there is an inverse operation with equal cost

Answer 10

Proof: cut-and-paste argument & contradiction
• Let A be an optimal alignment.
• Suppose, for contradiction, that A = A1A2A3 is a decomposition of A such that A2 is not an optimal alignment of its substrings.
• Let A’2 be an optimal alignment of the substrings in A2.
• Substitute A2 by A’2 to build a new alignment A’.
• δ(A’) = δ(A1A’2A3) = δ(A1)+δ(A’2)+δ(A3) < δ(A1)+δ(A2)+δ(A3) = δ(A1A2A3) = δ(A)
• This contradicts the optimality of A.

Answer 11

```
Case 1 (ai matches bj)
cost of matching ai with bj + min cost of aligning a1…ai-1 and b1…bj-1.
d(i,j) = delta(i,j) + d(i-1, j-1)
```
```
Case 2a (deletion of ai)
cost of deletion of ai + min cost of aligning a1…ai-1 and b1…bj.
d(i,j) = delta(i,_) + d(i-1, j)
```
```
Case 2b (insertion of bj)
cost of insertion of bj + min cost of aligning a1…ai and b1…bj-1.
d(i,j) = delta(_,j) + d(i, j-1)
```

Answer 12

```
d(i,j) = j * delta(_, *)                 if i = 0
         i * delta(*, _)                 if j = 0
         min of:                         otherwise
             delta(i,j) + d(i-1, j-1)
             delta(i,_) + d(i-1, j)
             delta(_,j) + d(i, j-1)
```

Answer 13

We define a partial order on the (sub-)alignments Amn such that Amn <= Am'n' if m<=m' and n<=n'. You only need to know the solutions of smaller (sub-)alignments to compute a new one, so fill the dynamic array following this partial order. See slide 20 lec 17 for this table.
```
Needleman-Wunsch Algorithm
for i=0 to m do
    d(i,0) = i*δ(*,-)
for j=0 to n do
    d(0,j) = j*δ(-,*)
for i=1 to m do
    for j=1 to n do
        d(i,j) = min(d(i-1,j)+δ(ai,-),
                     d(i-1,j-1)+δ(ai,bj),
                     d(i,j-1)+δ(-,bj))
return d(m,n)
```
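A runnable version of the table-filling pseudocode might look like the sketch below. Several things are assumptions for illustration rather than part of the lecture: the names `GAP`, `unit_cost` and `edit_distance_table`, the use of "-" as the gap symbol (so the input strings must not contain it), and the unit cost function used as the default δ. The first row and column accumulate per-letter gap costs, which equals i*δ or j*δ whenever the gap cost is uniform.

```python
from typing import Callable, List

GAP = "-"  # assumed gap symbol, as in the pseudocode


def unit_cost(x: str, y: str) -> int:
    """Hypothetical δ: 0 for a match, 1 for any other edit operation."""
    return 0 if x == y else 1


def edit_distance_table(
    a: str, b: str, delta: Callable[[str, str], int] = unit_cost
) -> List[List[int]]:
    """Fill d(i,j) for all prefixes of a (rows) and b (columns)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):  # first column: delete a1..ai
        d[i][0] = d[i - 1][0] + delta(a[i - 1], GAP)
    for j in range(1, n + 1):  # first row: insert b1..bj
        d[0][j] = d[0][j - 1] + delta(GAP, b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + delta(a[i - 1], GAP),           # deletion of ai
                d[i - 1][j - 1] + delta(a[i - 1], b[j - 1]),  # match/substitution
                d[i][j - 1] + delta(GAP, b[j - 1]),           # insertion of bj
            )
    return d


table = edit_distance_table("CT", "ATTG")
print(table[-1][-1])  # 3
```

Passing a different `delta` changes the scoring scheme without touching the loop structure; for a similarity score, `min` would need to become `max`.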

Answer 14

d   -  A  T  T  G
-   0  1  2  3  4
C   1  1  2  3  4
T   2  2  1  2  3
Backtracking
• Each move is associated with one edit operation
• Vertical = insertion
• Diagonal = match/substitution
• Horizontal = deletion
• We use one of these 3 moves to fill a cell of the array
• From the bottom-right corner (i.e. d(m,n)), find the move that has been used to determine the value of this cell.
• Apply this principle recursively.
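The backtracking procedure can be sketched as below, assuming unit costs and a hypothetical function name `align`. In this sketch the first string indexes the rows and the second the columns; which of the vertical and horizontal moves counts as an insertion or a deletion depends on which sequence is laid out along the rows, so the comments describe the moves by the sequence they consume. When several moves explain a cell, the diagonal is preferred, which is an arbitrary tie-breaking choice.

```python
def align(a: str, b: str):
    """Return (distance, aligned_a, aligned_b) under unit costs (an assumption)."""
    m, n = len(a), len(b)
    # Fill the table: rows follow a, columns follow b.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i - 1][j - 1] + sub, d[i][j - 1] + 1)

    # Walk back from the bottom-right corner, finding the move behind each cell.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        sub = 0 if i > 0 and j > 0 and a[i - 1] == b[j - 1] else 1
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + sub:
            top.append(a[i - 1])  # diagonal: match or substitution
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            top.append(a[i - 1])  # consumes a letter of a only
            bottom.append("-")
            i -= 1
        else:
            top.append("-")       # consumes a letter of b only
            bottom.append(b[j - 1])
            j -= 1
    return d[m][n], "".join(reversed(top)), "".join(reversed(bottom))


distance, row_a, row_b = align("CT", "ATTG")
print(distance)
print(row_a)
print(row_b)
```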

Answer 15

Proof:
• The algorithm computes the edit distance.
• We can trace back to extract an optimal alignment.

Answer 16

Easy to compute the optimal value in O(mn) time and O(m+n) space.
• Compute OPT(i,⦁) from OPT(i-1,⦁).
• But it is no longer easy to recover the optimal alignment itself, only its value.
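The space saving comes from keeping only two rows of the table at a time. The sketch below assumes unit costs and uses a hypothetical function name; it returns the optimal value only, since the discarded rows are exactly what a traceback would need.

```python
def edit_distance_linear_space(a: str, b: str) -> int:
    """Edit distance under unit costs, storing only two rows of the table."""
    prev = list(range(len(b) + 1))  # row OPT(0, .)
    for i in range(1, len(a) + 1):
        curr = [i] + [0] * len(b)   # OPT(i, 0) = i
        for j in range(1, len(b) + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,        # from OPT(i-1, j)
                prev[j - 1] + sub,  # from OPT(i-1, j-1)
                curr[j - 1] + 1,    # from OPT(i, j-1)
            )
        prev = curr                 # row i-1 is no longer needed
    return prev[-1]


print(edit_distance_linear_space("CT", "ATTG"))  # 3
```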

(41 cards)