Lecture 4 - String Difference Flashcards

1
Q

Applications of string distance?

A
  • biology (DNA and protein sequences)
  • file comparison (diff on Unix)
  • spelling correction, speech recognition
2
Q

What is string distance?

A

The smallest number of basic operations needed to transform s into t.

3
Q

What are the 3 basic operations in transforming strings?

A

insert
delete
substitute

4
Q

Is it possible to have a model with costs allocated to operations?

A

Yes. Each operation can be given its own cost; the example in the lectures focuses on the UNIT MODEL.

5
Q

What is the unit model for string distance?

A

A model in which each of the 3 basic operations costs 1 unit.

6
Q

What do string comparison algorithms use?

A

dynamic programming

7
Q

How is dynamic programming used in string transformations?

A

We build up a table of solutions to subproblems that get bigger and bigger. This is called the TABULAR METHOD. Eventually one of the values in the table is the optimal answer.
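
The table-building idea can be sketched in Python as follows. This is a minimal illustration under the unit model, not the lecture's own code. The function name `edit_distance_table` is hypothetical, and it assumes rows are indexed by s and columns by t, so that d[i][j] is the distance between the first i characters of s and the first j characters of t.

```python
def edit_distance_table(s, t):
    """Build the full table of unit-cost edit distances between prefixes of s and t."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]

    # Base cases: turning a prefix into the empty string (or back) costs its length.
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j

    # Each cell depends only on smaller subproblems that are already filled in.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                d[i][j] = d[i - 1][j - 1]          # match: no cost
            else:
                d[i][j] = 1 + min(
                    d[i - 1][j - 1],               # replace
                    d[i - 1][j],                   # delete
                    d[i][j - 1],                   # insert
                )
    return d


table = edit_distance_table("kitten", "sitting")
print(table[-1][-1])  # bottom-right cell holds the distance for the full strings
```

The bottom-right cell is the value that ends up being the optimal answer for the whole strings.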

8
Q

What is string distance?

A

The minimum number of operations needed to transform string t into string s.

9
Q

What is string transformation?

A

The actual sequence of operations that changes string t into string s.

10
Q

Our dynamic programming method for string transformation and distance is what …

A

edit distance

11
Q

What are the 4 cases for string transformations?

A
  • match
  • mismatch replace
  • mismatch delete
  • mismatch insert
12
Q

In the tabular method what is insert?

A

left element i.e. d(i, j-1)

13
Q

In the tabular method what is delete?

A

element on top i.e. d(i-1 , j)

14
Q

In the tabular method what is replace?

A

diagonal element i.e. d(i-1 , j-1)
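
Putting the left, top and diagonal elements together, a single cell of the table could be computed as in the sketch below. The helper `cell_value` and its `same` flag are hypothetical names used only for illustration, and unit costs are assumed.

```python
def cell_value(d, i, j, same):
    """Compute d[i][j] from its three neighbours under the unit model.

    `same` is True when the characters being compared at this cell match.
    """
    if same:
        return d[i - 1][j - 1]        # match: copy the diagonal element
    insert = d[i][j - 1] + 1          # left element plus one
    delete = d[i - 1][j] + 1          # element on top plus one
    replace = d[i - 1][j - 1] + 1     # diagonal element plus one
    return min(insert, delete, replace)


# Tiny 2x2 table with row 0 and column 0 already filled in.
d = [[0, 1],
     [1, None]]
d[1][1] = cell_value(d, 1, 1, same=False)
print(d[1][1])
```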

15
Q

Why do we use the tabular method?

A

It stores the solution to each overlapping subproblem, so each one is computed only once rather than repeatedly. This makes the program more efficient.

16
Q

What is the space complexity of string distance / transformation?

A

O(mn)

17
Q

How can we reduce the space complexity of the tabular method?

A

Store only 2 rows, each the length of string t: the row above the current row and the current row itself. This is O(n), and it is enough when only the distance is needed.
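
A minimal sketch of the two-row idea, assuming unit costs and that only the distance (not the traceback) is wanted. The function name `edit_distance_two_rows` is hypothetical.

```python
def edit_distance_two_rows(s, t):
    """Unit-cost edit distance using only the previous row and the current row."""
    prev = list(range(len(t) + 1))          # row 0: d(0, j) = j
    for i in range(1, len(s) + 1):
        curr = [i] + [0] * len(t)           # column 0: d(i, 0) = i
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                curr[j] = prev[j - 1]       # match
            else:
                curr[j] = 1 + min(
                    prev[j - 1],            # replace (diagonal)
                    prev[j],                # delete (top)
                    curr[j - 1],            # insert (left)
                )
        prev = curr                         # the current row becomes the row above
    return prev[-1]


print(edit_distance_two_rows("kitten", "sitting"))
```

Because earlier rows are thrown away, this version cannot be used for traceback on its own.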

18
Q

What is the time complexity of the edit distance algorithm with the tabular method?

A

O(mn)

19
Q

What row / column can be filled in even before running the algorithm in the tabular method?

A

Row 0 and column 0. Each cell is simply filled with its own index, because transforming to or from the empty string costs one operation per character:

d(i , 0) = i
d(0 , j) = j

20
Q

What is a diagonal step in traceback?

A

substitution or match

21
Q

What is a horizontal step in traceback?

A

insert

22
Q

What is a vertical step in traceback?

A

delete

23
Q

Is traceback unique?

A

No, there can be many optimal alignments.
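
A traceback could be sketched as below. This is an illustration under the unit model, not the lecture's own code. The function name `traceback` and the tuple format for operations are hypothetical. The order in which the three directions are tried is an arbitrary choice, and trying them in a different order can produce a different but equally optimal alignment.

```python
def traceback(s, t):
    """Return one optimal list of operations turning s into t (unit costs)."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = 1 + min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])

    # Walk back from the bottom-right cell to the top-left cell.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i - 1] == t[j - 1] and d[i][j] == d[i - 1][j - 1]:
            ops.append(("match", s[i - 1], t[j - 1]))      # diagonal step
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            ops.append(("replace", s[i - 1], t[j - 1]))    # diagonal step
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("delete", s[i - 1], None))         # vertical step
            i -= 1
        else:
            ops.append(("insert", None, t[j - 1]))         # horizontal step
            j -= 1
    ops.reverse()
    return ops


for op in traceback("kitten", "sitting"):
    print(op)
```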