H2022 Algo Flashcards
abcd
Prefixes of length 4:
A prefix is a continuous segment starting from the beginning of the string.
There is only 1 prefix of length 4.
Substrings of length 4:
A substring is any continuous segment of the string.
There are 10−4+1=7 substrings of length 4.
Example: For “abcdefghij”, the substrings of length 4 are:
“abcd”, “bcde”, “cdef”, “defg”, “efgh”, “fghi”, “ghij”.
Subsequences of length 4:
A subsequence is any selection of characters in order without needing to be continuous.
The number of subsequences of length 4 from a string of length 10 is given by the combination formula:
n n!
= ———-
k k!(n-k)!
say we have:
n = 10
(a,5), (b,1), (c,1), (d,1), (e,1), (f,1)
we get:
(b,1) + (c,1) -> (bc,2)
(d,1) + (e,1) -> (de,2)
We can choose to pair either with (f,1)
(bc,2) + (f,1) -> (bcf,3)
now combine the two lowest
(bcf,3) + (de,2) -> (bcfde,5)
now we combine (a,5) with the tree
(a,5) + (bcfde,5) -> (abcdef,10)
a would have exactly 1 in length of the encoding!
What is the running time in the worst case for the Knuth-Morris-Pratt algorithm when n is the length of the text and m is the length of the pattern?
A: O (m + n)
B: O (mn)
C: O (n^2)
A: O (m + n)
The smallest ratio is 1/3 =0.33, S3
B: a & c
Remember to check both paths!!!!! Cuz it might just be violated on one side….. #Cry
Given the following set of strings S = {aaba, aac, abaa, abc}.
If we add the string “aab” to S, we will have a problem. What is the problem and how can we solve it?
The problem is that the string “aab” is a prefix of a string already in the trie (“aaba”). We can solve it by adding a special characher not used in the strings (for example $) at the end of the string.
What is the main idea behind Compressed tries? Redraw the trie as a compressed trie.
Standard tries are space inefficient. This is solved by ensuring that each internal node has at least two children. We use substrings to label nodes in stead of single characters.
Give an example of a problem where tries are useful.
Examples are wordmatching and dictionary. (In the pattern matching problem, we are looking for substrings in stead of whole words).
What is a suffix trie?
Construct a small example.
Explain why and how it can be used in pattern matching.
Example 1:
Pair 1: (t1 = 5, w1 = 10)
Pair 2: (t2 = 6, w2 = 8)
t1 <= t2 (5 <=6) and
w1 >= w2 (10 >= 8)
Thus (5,10) dominates (6,8) because for a smaller size (t1 ), it provides a higher value (w1).
What is the maximum number of elements in A (i) expressed in terms of B?
B = 11
B+1