Algorithms n stuff Flashcards
How does the Brute Foce pattern matching work?
It works by systematically checking every possible alignment of the pattern against the text.
1. Align the pattern with the beginning of the text.
2. Compare each character of the pattern with the corresponding character in the text from left to right
3. Repeat this process until:
* A match is found, or
* All possible alignments have been checked (i.e., until the pattern has been aligned with the last possible position in the text).
Time Complexity: O(m⋅n), where:
m is the length of the pattern.
n is the length of the text.
How does the Boyer-More pattern matching work?
T = “abcaccabc”
P = “abb”
The Boyer-Moore algorithm, comparisons are done right-to-left within the pattern P
T = a(0) b(1) c(2) a(3) c(4) c(5) a(6) b(7) c(8)
P = a(0) b(1) b(2)
first comparison:
P[2] = ‘b’
T[2] = ‘c’
Mismatch, and we don’t have any c’s in our pattern, move P past c(2)
Second comparsion:
P[2] = ‘b’
T[5] = ‘c’
Mismatch, and we don’t have any c’s in our pattern, move P past c(5)
Third comparison:
P[2] = ‘b’
T[8] = ‘c’
Mismatch, and we don’t have any c’s in our pattern and we’re at the end of T.
How does the Knuth-Morris-Pratt algorithm work?
The Knuth-Morris-Pratt (KMP) algorithm is a string-matching algorithm that avoids redundant comparisons by preprocessing the pattern to identify recurring sub-patterns. This allows the algorithm to shift the pattern efficiently without re-checking characters that have already matched.
Step 1: Preprocessing the Pattern
Construct a partial match table (or Failure function) for the pattern.
The table indicates the length of the longest prefix of the pattern that is also a suffix for each position in the pattern.
Step 2: Searching the Text
Compare the pattern with the text from left to right.
If there is a mismatch:
Use the prefix table to determine the next position in the pattern to continue the comparison, rather than starting from the beginning of the pattern.
If a match is found, continue searching for the next occurrence.
runs in 𝑂(𝑛)
How does the Suffix trie work?
› The strings in the collection S are all suffixes of a string X.
› Can simplify representation such that the label of a node is a pair (i, j) indicating X[i..j]
› Add a special character $ that is not in the original alphabet to satisfy that a suffix is not a prefix of another suffix
Searching for a pattern is O(m)
How does the Huffman algorithm work?
› Construct a binary tree
› Each node except the root represent a bit in a code word
› Left child represent 0
› Right child represent 1
› Each external node v is associated with a specific character
› Let each of the d characters be a binary tree
› While there are more than one tree
› Merge the two trees with smallest frequencies
› Decoding
› Process the string from left to right. Start in the root, go left if there is a 0 and right if there is a 1
› When you end up in a leaf (external node), write down the corresponding character.
› Start again in the root
Example:
1010001000 - hei
How does the Travelling Salesperson problem (TSP)
work?
The Traveling Salesperson Problem (TSP) is a classic optimization problem in computer science and mathematics. It involves finding the shortest possible route that visits each city exactly once and returns to the starting city.
Given:
A set of n cities.
A distance (or cost) matrix that gives the distance between every pair of cities.
Objective:
Find the shortest route (or path) that visits each city exactly once and returns to the starting city.
How does the Double Tree algorithm work?
The Double Tree Algorithm is a simple approximation algorithm for solving the Traveling Salesperson Problem (TSP). It guarantees a solution with a cost at most 2 times the optimal solution. This is achieved by constructing a tour based on a Minimum Spanning Tree (MST) of the graph.
* Assume that the triangle inequality holds
* Find MST
* Double all the edges in the tree (then all nodes will have even degree)
* Find Eulerian tour (exist when all nodes have even degree. Easy to
find)
* Consider tour. Skip nodes that are already visited. Because of triangle
inequality cost can not increase
1,6,2,3,6,5,4,6,1
1,6,2,3,5,4,1
How does the Christofides’s algorithm work?
- Find a MST
- Find a minimum perfect matching among the nodes with odd degree
in MST. Add these edges to the MST (red lines in pic) - Find a Eulerian tour in the resulting graph
- Consider the tour. Skip already visited vertices
Christofides’ algorithm for the metric TSP is a 3/2-
approximation algorithm
We start at 1.
1,3,4,6,5,4,2,1
Final: 1, 3,4,6,5,2,1
What is the difference between the Double Tree and Christofides algorithm?
The key difference lies in how the graph is made Eulerian:
Double Tree doubles all edges
Christofides connects odd-degree nodes efficiently.
This small change results in a significant improvement in the approximation ratio!
Approximation Quality:
Double Tree: ≤2× optimal.
Christofides: ≤1.5× optimal (better due to the minimum matching step).
Knapsack
- Set of items 𝐼 = 1,2, … , 𝑛
- The knapsack has a capacity 𝐵 > 0
- Each item 𝑖 has a value 𝑣𝑖 ≥ 0 and a size 0 ≤ 𝑠𝑖 ≤ 𝐵
- Goal
Find a subset of the items 𝑆 ⊆ 𝐼 that maximizes σ𝑖∈𝑆 𝑣𝑖 where σ𝑖∈𝑆
𝑠𝑖 ≤ B
How do Approximation algorithms work?
- Relax the requirement of finding an optimal solution
- Try to find a solution that closely approximates the optimal solution in
term of its value - Objective function maps each possible solution to some nonnegative
value - Optimal solution is one that either minimizes or maximizes the value
of the objective function
Performance guarantee
* Our book
* 𝛼 > 1 for minimization problems
* 𝛼 < 1 for maximization problems
* Example
* An 1
2
-approximation algorithm for a maximization problems is a polynomialtime algorithm that always returns a value whose value is at least half the
optimal value
Given the instance (we have just shown just the beginning)
I = {0.68, 0.67, 0.66, 0.65, 0.63, 0.63, 0.60, 0.60, 0.59, 0.58, 0.57, 0.55, 0.48. 0.47, 0.45,…}. Show the start of the corresponding rounded instance I’ obtained by linear grouping. Use k = 3 (this is very small for practical problems since we have 𝑘 = ⌊𝜀𝑆𝐼𝑍𝐸(𝐼)⌋).
𝐼′ = {0.65, 0.65, 0.65, 0,60, 0.60, 0.60, 0.58, 0.58, 0.58,0.48, 0.48, 0.48, … }
Reason:
* Group the pieces of the given input I as follows: The first group
consists of the 𝑘 largest pieces, the next group consists of the next
the 𝑘 largest pieces and so on. The last group consists of ℎ pieces,
where ℎ ≤ 𝑘.
* The rounded instance I’ is constructed by discarding the first group
and for each other group rounding its size of the pieces up to the size
of the largest piece.
Sceduling problem * What are the steps in Minimizing the Sum of Completion Times on a single machine
- n jobs, each with a release date 𝑟𝑗 ≥ 0 and a processing time 𝑝𝑗 > 0 (both
integers) - Process the jobs on one machine
- No job is processed before its release date
- At most one job is processed at a point of time
- The jobs are processed nonpreemptly
- Let 𝐶𝑗 denote the finishing time for job j.
Sceduling problem * What are the steps in Minimizing the Sum of Completion Times on a single machine Preemtive Version
- Do not need to complete a job before we start another job
- Can find an optimal solution to this problem via the shortest
remaining processing time (SRPT) rule - while there are jobs left do
- If there are no released jobs, move time until next job is released
- Among released jobs that are not done, choose the one with
shortest remaining processing time - Schedule it until it is completed or a new job is released
How can an optimal nonpreemptive schedule be derived from an optimal preemptive schedule?
An optimal nonpreemptive schedule
is feasible for the preemptive scheduling problem.
- Find an optimal preemptive schedule using SRPT
- Schedule the jobs nonpreemptly in the same order that they
complete in the preemptive schedule