Algorithm concepts v2 Flashcards
Recurrence
T(N) = aT(N/b) + D(N) + C(N) , N ≥ c
a is the number of subproblems, each of size N/b; T(N/b) is the time to solve one subproblem (so aT(N/b) is the total across subproblems); D(N) is the divide cost and C(N) the combine cost
Lomuto partitioning
pivot is the last element in the array; every other element is compared to it in a single left-to-right scan
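A minimal sketch of Lomuto partitioning in Python (the function name and sample array are illustrative):

```python
def lomuto_partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi], the last element.

    Elements <= pivot end up on its left, larger ones on its right.
    Returns the pivot's final index.
    """
    pivot = a[hi]
    i = lo - 1                           # right edge of the <= pivot region
    for j in range(lo, hi):              # compare every element to the pivot
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[hi] = a[hi], a[i + 1]    # drop the pivot into its final place
    return i + 1

xs = [5, 2, 9, 3, 7, 4]
p = lomuto_partition(xs, 0, len(xs) - 1)   # xs becomes [2, 3, 4, 5, 7, 9], p == 2
```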
Hoare partitioning
two indices scan inward from both ends of the array, swapping out-of-place pairs
handles duplicates better
Three-way partitioning
includes a region for values equal to pivot
handles duplicates better
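Three-way (Dutch national flag) partitioning can be sketched as follows; the function name and sample data are illustrative:

```python
def three_way_partition(a, pivot):
    """Rearrange a into regions < pivot, == pivot, > pivot.

    Returns (lt, gt) such that a[lt:gt] holds the values equal to the pivot.
    """
    lt, i, gt = 0, 0, len(a)
    while i < gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]    # grow the < pivot region
            lt += 1
            i += 1
        elif a[i] > pivot:
            gt -= 1
            a[i], a[gt] = a[gt], a[i]    # grow the > pivot region
        else:
            i += 1                       # equal to pivot: leave in the middle
    return lt, gt

xs = [4, 1, 4, 2, 5, 4, 3]
lt, gt = three_way_partition(xs, 4)      # xs becomes [1, 2, 3, 4, 4, 4, 5]
```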
Median of 3 partition
choose pivot as median of three random elements
better balance between subproblems
Amortisation
amortized analysis considers a sequence of operations
cost of expensive operations is ‘amortized’ across the sequence…
can never “be in debt”
pick a value to ‘pay’ per operation - show that this value covers all costs (never in debt)
Amortized time
average time taken per operation to do many operations
bounding every operation by its individual worst case, rather than over the whole run of the algorithm, can be too pessimistic
Amortized analysis
considers both the costly and less costly operations together over the whole series of operations of the algorithm
Amortized analysis: aggregate analysis
determine an upper bound T(n) on the total cost of a sequence of n operations, then take the amortized cost per operation to be T(n) / n
Amortized analysis: accounting method
assigns to each operation an amortized cost, which may differ from its actual cost
early operations have an amortized cost higher than their actual cost, which accumulates a saved “credit” that pays for later operations having an amortized cost lower than their actual cost
the credit begins at 0 and must never go negative, so the actual cost of a sequence equals the total amortized cost minus the remaining credit… the total amortized cost is therefore an upper bound on the total actual cost
usually many short-running operations accumulate such a credit in small increments, while rare long-running operations decrease it drastically
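A sketch of the accounting method on a doubling dynamic array (the class and the charge of 3 per append are illustrative, not from the cards): each append pays 1 credit for its own write and banks 2, which later pay for copying elements during a resize, so the bank is never in debt.

```python
class DynamicArray:
    """Doubling array used to illustrate the accounting method."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None]
        self.copies = 0                  # actual element copies done by resizes

    def append(self, x):
        if self.size == self.capacity:   # full: double and copy everything over
            self.capacity *= 2
            new = [None] * self.capacity
            for i in range(self.size):
                new[i] = self.data[i]
                self.copies += 1
            self.data = new
        self.data[self.size] = x
        self.size += 1

arr = DynamicArray()
n = 1000
for k in range(n):
    arr.append(k)

# actual work = n writes + all copies; the amortized budget of 3 per append covers it
actual = n + arr.copies
assert actual <= 3 * n                   # never in debt
```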
Arithmetic series
1 + 2 + … + (N - 1) + N
create a second series that is a reverse of the first and add together term by term
  1 + 2 + ... + (N - 1) + N
+ N + (N - 1) + ... + 2 + 1
----------------------------------------
= (N + 1) + (N + 1) + ... + (N + 1) + (N + 1)
= N(N + 1) = 2s
2s = N(N + 1)
s = N(N + 1)/2
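A quick numerical check of the pairing argument (the choice N = 100 is arbitrary):

```python
N = 100
s = sum(range(1, N + 1))       # 1 + 2 + ... + N
assert s == N * (N + 1) // 2   # the pairing formula gives 5050 for N = 100
```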
a + (a + d) + (a + 2d) + … + (a + (k - 1)d)
Sk = a1 + ... + ak
Sk = k(a1 + ak)/2
Geometric series
Nc + (N/2)c + (N/4)c + ... + 2c + c
create second series by multiplying by r (common ratio)
r = 1/2
Nc + (N/2)c + … + 2c + c
(N/2)c + … + 2c + c + c/2
all but two of the terms are identical
take the difference
  Nc + (N/2)c + ... + 2c + c
- ((N/2)c + ... + 2c + c + c/2)
-------------------------------
= Nc - c/2
Nc - c/2 = s - s/2
Nc - c/2 = s/2
2Nc - c = s
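Checking s = 2Nc − c numerically (N a power of two and the value of c are arbitrary choices):

```python
N, c = 64, 3
# Nc + (N/2)c + (N/4)c + ... + 2c + c
s = sum((N >> i) * c for i in range(N.bit_length()))
assert s == 2 * N * c - c
```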
Sk = a + ar + ar^2 + ... + ar^(k-1)
Sk - rSk = a - ar^k
Sk = a(1 - r^k) / (1 - r)
final form
Sk = a(r^k - 1) / (r - 1)
Sk = (r·ak - a1) / (r - 1)
Arithmetic Series general form
Sk = a + (a + d) + (a + 2d) + … + (a + (k - 1)d)
so
Sk = a1 + … + ak
Sk = k(a1 + ak)/2
Geometric series general form
Sk = a + ar + ar^2 + ... + ar^(k-1)
so
Sk = a(r^k - 1) / (r - 1)
Sk = (r·ak - a1) / (r - 1)
RBT 5 properties
- all nodes (including nil) are either red or black
- the root node is black
- every leaf (all nil) is black
- both children of a red node are black
- all paths from a node to a descendant leaf contain the same number of black nodes
Simple Uniform Hashing Assumption
with reference to chaining
Given a hash table T with m buckets, using hash function h, the simple uniform hashing assumption (SUHA) states that each distinct key k is equally likely to hash into any of the buckets. So the probability that h(k) = i is 1/m, for all 1 ≤ i ≤ m
most probable / expected outcome: same number in each bucket
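A small simulation of that expected outcome (bucket count, key count, and seed are arbitrary): with uniformly hashed keys, every bucket's chain ends up close to the expected load n/m.

```python
import random

m, n = 16, 160_000
random.seed(1)                     # deterministic run
buckets = [0] * m
for _ in range(n):
    k = random.getrandbits(32)     # stand-in for a uniformly hashed key
    buckets[k % m] += 1

# expected load per bucket is n/m = 10_000; observed loads cluster around it
assert all(abs(b - n // m) < 500 for b in buckets)
```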
Memoisation
making a note for later
for dynamic programming
Dynamic programming
makes a space-time tradeoff
conditions to apply it:
- optimal substructure
- overlapping subproblems
optimal substructure:
problem can be decomposed into subproblems
an optimal solution uses optimal solutions to the subproblems
overlapping subproblems
same problems are generated over and over
subproblems must still be independent
set of all subproblems is the subproblem space
subproblem space determines time complexity
Top Down
set out to solve the biggest problem
Bottom Up
complete the table in order
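The two orders side by side on Fibonacci (a standard illustration, not taken from the cards):

```python
from functools import lru_cache

# top down: set out to solve fib(n); memoisation notes each subproblem result
@lru_cache(maxsize=None)
def fib_top_down(n):
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)

# bottom up: complete the table in order, smallest subproblems first
def fib_bottom_up(n):
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]

assert fib_top_down(30) == fib_bottom_up(30) == 832040
```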
Optimal substructure
problem can be decomposed into subproblems
an optimal solution uses optimal solutions to the subproblems
Overlapping subproblems
same problems are generated over and over
subproblems must still be independent
set of all subproblems is the subproblem space
subproblem space determines time complexity
Graph
A graph G is a pair (V, E) where V is a finite set (of objects) and E is a binary relation on V. Elements of V are called vertices and elements of E are called edges
Breadth-first Search
search proceeds gradually down multiple paths at once
all vertices at depth d visited before any at depth d + 1
BFS maintains a set of vertices being searched
search from given s
use of FIFO queue characteristic of BFS
best-first search would use priority queue
BFS shortest paths
shortest path to all vertices
record extra info during search when a vertex is visited
unreachable vertices have dist[v] = -1
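A sketch of BFS shortest paths with the characteristic FIFO queue (the graph and names are illustrative); dist[v] stays -1 for unreachable vertices:

```python
from collections import deque

def bfs_distances(adj, s):
    """BFS from s over an adjacency-list graph; returns edge-count distances."""
    dist = {v: -1 for v in adj}
    dist[s] = 0
    q = deque([s])                 # FIFO queue is characteristic of BFS
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] == -1:      # first discovery is via a shortest path
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

adj = {'s': ['a', 'b'], 'a': ['c'], 'b': ['c'], 'c': [], 'z': []}
d = bfs_distances(adj, 's')        # 'z' is unreachable, so d['z'] == -1
```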
Depth-first search
when a vertex v is visited, immediately search from it for more vertices; others discovered alongside v are only visited once that search is over
can use the call stack (recursion) instead of an explicit stack
restart until whole graph searched
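A recursive DFS sketch using the call stack, restarting until the whole graph is searched (the graph is illustrative):

```python
def dfs(adj):
    """Depth-first search over an adjacency-list graph; returns visit order."""
    order, seen = [], set()

    def visit(u):
        order.append(u)
        seen.add(u)
        for v in adj[u]:
            if v not in seen:      # v is only visited when u's search reaches it
                visit(v)           # recursion: the call stack is the DFS stack

    for u in adj:                  # restart until the whole graph is searched
        if u not in seen:
            visit(u)
    return order

adj = {'a': ['b', 'c'], 'b': ['d'], 'c': [], 'd': [], 'e': ['a']}
order = dfs(adj)                   # ['a', 'b', 'd', 'c', 'e']
```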
Kruskal’s algorithm
set of edges is iterated over in weight order
if the next edge connects two distinct components, it is added
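A sketch of Kruskal's algorithm with a simple union-find to test whether the next edge connects two distinct components (the edge list and weights are illustrative):

```python
def kruskal(n, edges):
    """Kruskal's MST on vertices 0..n-1; edges are (weight, u, v) tuples."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):           # iterate edges in weight order
        ru, rv = find(u), find(v)
        if ru != rv:                        # distinct components: add the edge
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3)]
mst = kruskal(4, edges)                     # total weight 6, three edges
```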