Lecture 5 - BM Flashcards
What does BM stand for?
Boyer-Moore (algorithm)
How does BM perform compared to KMP and brute force?
It is almost always faster than both.
Are all characters in the text checked in BM?
No, most are skipped.
In what direction is the string scanned?
Right to left (the reverse of the other two methods).
What is used to decide the next comparison?
the text character involved in the mismatch
Does BM include any preprocessing?
Yes, we need to record the position of the last occurrence in the string of each character c in the alphabet.
While preprocessing, what happens if a certain alphabet character doesn’t exist in the string?
-1 is entered
How big is our BM array?
Big enough to hold every letter of the alphabet plus all other characters: the array is indexed by ASCII value, so it has 128 entries.
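The preprocessing step described in the cards above can be sketched as follows. This is a minimal illustration, not the course's own code: the names `ALPHABET_SIZE` and `build_last_occurrence` are hypothetical, and the sketch assumes the string contains only ASCII characters (values 0 to 127).

```python
ALPHABET_SIZE = 128  # size of the ASCII character set


def build_last_occurrence(s):
    """Return a 128-entry table where table[c] is the last index of chr(c) in s.

    Characters that never appear in s keep the value -1.
    Assumes s is pure ASCII; a non-ASCII character would raise IndexError.
    """
    last = [-1] * ALPHABET_SIZE      # -1 marks "does not occur in s"
    for index, ch in enumerate(s):
        last[ord(ch)] = index        # later occurrences overwrite earlier ones
    return last


last = build_last_occurrence("abacab")
print(last[ord("a")], last[ord("b")], last[ord("c")], last[ord("z")])  # 4 5 3 -1
```

Because the table is indexed directly by character code, looking up the last occurrence during the search is a single array access.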
What version of BM will we look at in this course?
The Boyer-Moore-Horspool algorithm, which is a simplified version of BM.
How big is the ASCII character set?
128 chars
What happens upon a mismatch in BM?
We slide s along so that its last occurrence of the mismatched text character lines up with that character. If this would move s in the wrong direction (to the left), we instead move s one position to the right. If the character does not occur in s at all, we slide s completely past the mismatched character.
How many mismatch cases are there in BM?
3
What is case 1 in BM?
There is a mismatch and the last occurrence lies to the left of j (in the part of the string not yet checked). In this case:
- slide the string so that the last occurrence lines up with the current i (the new value of i is i + (m-1) - last occurrence index)
- j becomes m-1 (the last character of the string, as we essentially restart the search from right to left)
- sp becomes sp + j - last occurrence index (the amount the pattern has been shifted, using the value of j from before it was reset)
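As a small worked example of the case 1 updates, the sketch below applies the three formulas to one mismatch. The helper name `case1_update` and its parameter names are hypothetical; it assumes the last occurrence index is at least 0 and smaller than j.

```python
def case1_update(i, j, sp, m, last_index):
    """Apply the case 1 updates (hypothetical helper, assumes 0 <= last_index < j).

    i: text index of the mismatch, j: string index of the mismatch,
    sp: current start position of the string in the text, m: string length.
    """
    new_sp = sp + j - last_index        # shift amount is j - last_index (old j)
    new_i = i + (m - 1) - last_index    # text index under the string's last char
    new_j = m - 1                       # restart from the right end of the string
    return new_i, new_j, new_sp


# String "adcb" (m = 4) placed at sp = 0 against a text whose character at
# index 3 is 'd'. The first comparison (j = 3, i = 3) mismatches, and 'd'
# last occurs in the string at index 1, which is to the left of j.
print(case1_update(i=3, j=3, sp=0, m=4, last_index=1))  # (5, 3, 2)
```

After the shift the string starts at position 2, so its 'd' at index 1 sits under text index 3, and the next comparison is at text index 5.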
What is case 2 in BM?
There is a mismatch and the last occurrence lies to the right of j (in the part already checked). In this case:
“we move string along by one place to the right and restart from the end”
- the new value of i is i + (m-1) - (j-1)
- the new value of j is m-1
- sp becomes sp+1 (as shifted by 1)
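The case 2 updates can be checked on a concrete mismatch in the same way. This is only an illustrative sketch; `case2_update` is a hypothetical helper name.

```python
def case2_update(i, j, sp, m):
    """Apply the case 2 updates: shift the string one place to the right.

    Hypothetical helper; used when the last occurrence of the mismatched
    text character lies to the right of j.
    """
    new_i = i + (m - 1) - (j - 1)   # equivalent to i + m - j
    new_j = m - 1                   # restart from the right end of the string
    new_sp = sp + 1                 # the string moved by exactly one position
    return new_i, new_j, new_sp


# String "dcba" (m = 4) at sp = 0 against a text starting "xxaa...".
# Index 3 matches ('a' == 'a'); index 2 mismatches ('b' vs 'a'), so i = j = 2.
# 'a' last occurs in the string at index 3, to the right of j.
print(case2_update(i=2, j=2, sp=0, m=4))  # (4, 3, 1)
```

The new i is again the text index under the last character of the string, which is sp + m - 1 for the new sp.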
What is case 3 in BM?
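When there is a mismatch and the mismatched text character doesn’t appear anywhere in the string.
- i = i + m (i advances by the length of the whole string)
- j = m-1 (end of the string, an absolute restart)
- sp = sp + j + 1 (using the value of j from before it was reset, so the string starts just past the mismatched character)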
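Putting the three cases together gives a complete search loop. The following is a minimal sketch under stated assumptions rather than the course's reference implementation: the function name `bm_search` is hypothetical, both inputs are assumed to be ASCII, and the function returns the start of the first match or -1 if there is none.

```python
def bm_search(text, s):
    """Return the index of the first occurrence of s in text, or -1.

    Sketch of the simplified BM search using only the last-occurrence table.
    Assumes text and s contain only ASCII characters.
    """
    n, m = len(text), len(s)
    if m == 0:
        return 0                       # convention: empty string matches at 0

    last = [-1] * 128                  # last occurrence of each ASCII character
    for index, ch in enumerate(s):
        last[ord(ch)] = index

    sp = 0                             # start position of s in the text
    i = j = m - 1                      # compare from the right end of s
    while i < n:
        if text[i] == s[j]:
            if j == 0:
                return sp              # every character of s matched
            i -= 1                     # keep scanning right to left
            j -= 1
        else:
            lo = last[ord(text[i])]
            if lo == -1:               # case 3: character not in s
                i, sp = i + m, sp + j + 1
            elif lo < j:               # case 1: last occurrence left of j
                i, sp = i + (m - 1) - lo, sp + j - lo
            else:                      # case 2: last occurrence right of j
                i, sp = i + (m - 1) - (j - 1), sp + 1
            j = m - 1                  # restart from the right end of s
    return -1


print(bm_search("here is a simple example", "example"))  # 17
print(bm_search("abcabd", "abd"))                        # 3
print(bm_search("abcabd", "xyz"))                        # -1
```

Note that sp is updated with the old value of j in every case, which is why j is reset only after i and sp have been recomputed. Case 3 is written out separately to mirror the cards, although it gives the same result as case 1 with a last occurrence index of -1.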