Divide and Conquer Flashcards
Q: If a recursive algorithm divides its input size by 2 at each step, follows one path, and does constant work, what’s the runtime?
A: O(log n) - Example: Binary Search
Master Theorem:
T(n) = aT(n/b) + O(n^d)
T(n) = 1T(n/2) + O(n^0) # Each level is O(1)
b^d (1) = a (1), therefore O(n^d log n) using MT
Simplifies to O(log n)
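A minimal Python sketch of this recurrence in action (standard iterative binary search; names are my own):
def binary_search(A, x):
    """Return an index of x in sorted list A, or -1 if absent: O(log n)."""
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # constant work per step: O(n^0)
        if A[mid] == x:
            return mid
        elif A[mid] < x:
            lo = mid + 1              # follow ONE half: T(n) = T(n/2) + O(1)
        else:
            hi = mid - 1
    return -1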
Q: If a recursive algorithm looks at BOTH halves of the input and does constant work per level, what’s the runtime?
A: O(n) - Example: Binary Tree Traversal
Master Theorem:
T(n) = aT(n/b) + O(n^d)
T(n) = 2T(n/2) + O(n^0)
- subproblems: 2
- scaling: 2
- work at each level: O(1)
b^d (1) < a (2), therefore O(n^{log_b a}) using MT
Plug in: O(n^{log_2 2})
Simplifies to O(n^1) = O(n)
Q: If a recursive algorithm divides input by 4, looks at all quarters, and does O(n^2) work per level, what’s the runtime?
A: O(n^2)
- T(n) = 4T(n/4) + O(n^2)
- a=4, b=4, d=2; since a < b^d = 16, we get O(n^d) = O(n^2)
Q: If a recursive algorithm divides input by 2, processes both halves, and does O(n*log(n)) work per level, what’s the runtime?
A: O(n*log^2(n))
- T(n) = 2T(n/2) + O(n log n): there are log n levels, each doing at most O(n log n) work, so the total is O(n log^2 n) (the a = b^d case picks up one extra log factor)
Q: If a recursive algorithm divides input by 3, processes all thirds, and does O(sqrt(n)) work per level, what’s the runtime?
A: O(n^{log_3(3)}) = O(n)
- T(n) = 3T(n/3) + O(sqrt(n)); a=3 > b^d = 3^{1/2}, so O(n^{log_b a}) = O(n^{log_3 3}) = O(n)
Q: If a recursive algorithm divides input by 2, processes both halves, and does O(n^2) work per level, what’s the runtime?
A: O(n^2)
- T(n) = 2T(n/2) + O(n^2)
- a=2, b=2, d=2, since a<b^d, we get O(n^d)
Q: If a recursive algorithm divides input into 3 equal parts, processes all parts, and does O(1) work per level, what’s the runtime?
A: O(n^{log_3(3)}) = O(n)
Q: If a recursive algorithm divides input by 2, looks at BOTH halves, and does O(n) work per level, what’s the runtime?
A: O(n log n) - Example: Mergesort
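A minimal Python sketch matching T(n) = 2T(n/2) + O(n) (textbook MergeSort; names are my own):
def merge_sort(A):
    """Return a sorted copy of A in O(n log n)."""
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    left = merge_sort(A[:mid])        # T(n/2)
    right = merge_sort(A[mid:])       # T(n/2)
    out, i, j = [], 0, 0              # merge step: O(n)
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]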
Q: When you see T(n) = 1T(n/2) + O(n^0), what runtime should you expect?
A: O(log n)
Why? Dividing by 2 each time with constant work per level
Master theorem:
b^d = 2^0 = 1 = a
if b^d = a, T(n) = O(n^d log n)
(simplifies to O(log n) since d is zero)
Q: When you see T(n) = 2T(n/2) + O(n), what runtime should you expect?
A: O(n log n)
Why? Linear work at each level, with log n levels
Master theorem:
b^d = 2 = a, so
T(n) = O(n log n)
Q: When you see T(n) = 2T(n/2) + O(1), what runtime should you expect?
A: O(n)
Why? Even though dividing by 2, visiting both halves means visiting all n elements once
Master’s Theorem:
b^d = 2^0 = 1
2 (a) > 1 (b^d)
if a > b^d: O(n^{log_b a})
T(n) = O(n^{log_2 2})
= O(n^1)
= O(n)
Q: If a recursive algorithm divides input into 3 parts and does constant work, what’s the runtime?
A: O(n)
T(n) = 3T(n/3) + O(1)
a > b^d
O(n^(log_b a)) = O(n^(log_3 3)) = O(n)
Given O(3^(log_2 n)), rewrite it as a polynomial in n
(Hint: just need to swap 2 things)
O(n^(log_2 3))
Idea: switch the 3 and the n, using the identity a^{log_b n} = n^{log_b a}
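A one-line check of the swap, writing 3 as 2^{log_2 3}:
$$
3^{\log_2 n} = \left(2^{\log_2 3}\right)^{\log_2 n} = \left(2^{\log_2 n}\right)^{\log_2 3} = n^{\log_2 3}
$$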
[Recurrence relation]
For substitution-based recurrences of the form T(n) = aT(n-b) + c, what do they typically result in?
O(a^{n/b}) (so O(a^n) when b = 1)
What is runtime of:
T(n) = 2T(n-1) + c
O(2ⁿ)
What is runtime of:
T(n) = 3T(n-1) + c
O(3^n)
What is runtime of:
T(n) = 2T(n-2) + c
O(2^{n/2})
We subtract 2 each time (not 1)
We need to reach the base case T(0) or T(1)
So if we subtract 2k, we need: n-2k = 0
Solving for k: k = n/2
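Unrolling the recurrence shows where the n/2 exponent comes from:
$$
T(n) = 2T(n-2) + c = 4T(n-4) + 3c = \dots = 2^k T(n-2k) + (2^k - 1)c
$$
Setting n - 2k = 0 gives k = n/2, so T(n) = 2^{n/2} T(0) + (2^{n/2} - 1)c = O(2^{n/2}).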
[Recurrence Relation]
What is runtime of below with no merge work:
T(n) = T(n/2)
O(1)
T(n) = T(n/2)
= O(n^{log_2 1}) (one subproblem per level, so the tree has a single leaf)
= O(n^0)
= O(1), since no work is done outside the recursive call
[FastSelect] Explain why select v in list A between 25th and 75th percentile will ensure sublists A_L and A_R to have size at worst 3/4 of A
- A number at the 75th percentile has 75% of the elements less than or equal to it and 25% greater, so A_L has at most 75% of the elements.
- A number at the 25th percentile has 25% of the elements less than or equal to it and 75% greater, so A_R has at most 75% of the elements.
- Therefore, no matter where v falls in this range, neither A_L nor A_R can be larger than 75% of the original array.
[FastSelect]
Proof of good pivot
pivot p is good if?
|A < p| <= (3/4)n
|A > p| <= (3/4)n
[FastSelect]
Runtime analysis
Breaking A into n/5 groups: ?
Sorting each 5-element group: ?
Call FastSelect(S, n/10) to get pivot: ?
Partitioning A and recursing on one side (k <= |A<p|: recurse on A<p; k > |A<p| + |A=p|: recurse on A>p; otherwise return p): ?
Breaking A into n/5 groups: O(n)
Sorting each 5-element group: O(n) (O(1) per group * n/5 groups)
Call FastSelect(S, n/10) to get pivot: T(n/5)
Partitioning A and recursing on one side: O(n) to partition, plus T(3n/4) for the recursive call if the pivot is good
T(n) = T(3n/4) + T(n/5) + O(n)
= O(n), since 3/4 + 1/5 = 19/20 < 1: the work per level shrinks geometrically, so the O(n) at the root dominates
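A minimal Python sketch of the steps above (median-of-medians; the base-case cutoff of 25 and the list-comprehension partition are my own choices):
def fast_select(A, k):
    """Return the k-th smallest element of A (1-indexed) in O(n)."""
    if len(A) <= 25:                                     # small base case: sort directly
        return sorted(A)[k - 1]
    groups = [A[i:i + 5] for i in range(0, len(A), 5)]   # break into n/5 groups: O(n)
    medians = [sorted(g)[len(g) // 2] for g in groups]   # sort 5-element groups: O(n)
    p = fast_select(medians, len(medians) // 2 + 1)      # pivot via recursion: T(n/5)
    less = [x for x in A if x < p]                       # partition: O(n)
    equal = [x for x in A if x == p]
    greater = [x for x in A if x > p]
    if k <= len(less):
        return fast_select(less, k)                      # T(3n/4) if good pivot
    elif k <= len(less) + len(equal):
        return p
    else:
        return fast_select(greater, k - len(less) - len(equal))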
[Master's Theorem]
What does a^{log_b n} (or n^{log_b a}) represent in a D&C problem of size n?
The width of the last level of the recursion tree, i.e. the number of size-1 base cases: a^{log_b n}
Master’s Theorem formula:
if a > b^d, what is T(n)?
O(n^{log_b a})
Master’s Theorem formula:
if a = b^d, what is T(n)?
O(n^d log(n))
Master’s Theorem formula:
if a < b^d, what is T(n)?
O(n^d)
[Master's Theorem]
What does log_b n represent in a D&C problem of size n?
The depth of the recursion tree, i.e. the number of levels of subproblems
If the depth is log(n), what question is it answering?
“How many times do I need to divide the problem size by 2 to get to the base case of 1?”
Observe the substitution method when solving runtime of Fast Integer Mult without Master Theorem:
$$
T(n) \le cn\left(1 + \tfrac{3}{2} + \left(\tfrac{3}{2}\right)^2 + \dots + \left(\tfrac{3}{2}\right)^{i-1}\right) + 3^i\,T\!\left(n/2^i\right)
$$
When solving for base case, we let i = log_2(n).
Q: Why do we use base of 2 for this problem?
The base comes from the scaling factor of the recurrence (n/2): after i levels the subproblem size is n/2^i, and the base case n/2^i = 1 gives i = log_2(n)
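Plugging i = log_2(n) in and summing the geometric series (the last term dominates since 3/2 > 1):
$$
cn \sum_{j=0}^{i-1} \left(\tfrac{3}{2}\right)^j = O\!\left(n \left(\tfrac{3}{2}\right)^{\log_2 n}\right) = O\!\left(n \cdot \tfrac{3^{\log_2 n}}{n}\right) = O\!\left(3^{\log_2 n}\right) = O\!\left(n^{\log_2 3}\right)
$$
The leftover term 3^i T(n/2^i) = 3^{log_2 n} T(1) contributes the same O(n^{log_2 3}).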
Rewrite this as a polynomial in n:
4^{log_2 n}
n^2
Rewrite this as a polynomial in n:
3^{log_2 n}
n^{log_2 3}
Geometric Series:
Given a geo-series:
\sum_{j=0}^k a^j = 1 + a + a^2 + … + a^k
If a>1, what is Big-O?
O(a^k)
(last term dominates)
Geometric Series:
Given a geo-series:
\sum_{j=0}^k a^j = 1 + a + a^2 + … + a^k
If a=1, what is Big-O?
O(k)
(each term is exactly 1, and there are k+1 terms, which is O(k))
Geometric Series:
Given a geo-series:
\sum_{j=0}^k a^j = 1 + a + a^2 + … + a^k
If a<1, what is Big-O?
O(1)
(each term is decreasing, so dominant term is O(1))
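All three cases follow from the closed form (valid for a ≠ 1):
$$
\sum_{j=0}^{k} a^j = \frac{a^{k+1} - 1}{a - 1}
$$
For a > 1 this is Θ(a^k); for a < 1 it is at most 1/(1-a) = O(1); for a = 1 the formula does not apply, but the sum is just k+1 = O(k).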
[DPV 2.16 - Infinite Array]
Algorithm
1) Start with index = 1 and double the index on each iteration, continuing until we reach A[index] >= x. This gives an upper bound on the search range.
2) Perform a black-box binary search for x on A[1 … index]. Return the result of the search (sketch below).
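A minimal Python sketch of both phases, modeling the infinite array as a function A(i) that returns ∞ past the real data (1-based, per the problem; names are my own):
def find_in_infinite_array(A, x):
    """Return an index i with A(i) == x, or None if x is absent."""
    # Phase 1: exponential search for an upper bound, O(log n)
    index = 1
    while A(index) < x:        # A(i) = inf past the real data, so this stops
        index *= 2
    # Phase 2: binary search on positions 1..index, O(log n)
    lo, hi = 1, index
    while lo <= hi:
        mid = (lo + hi) // 2
        if A(mid) == x:
            return mid
        elif A(mid) < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
In practice A can wrap a finite list: A = lambda i: data[i - 1] if i <= len(data) else float('inf').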
[DPV 2.16 - Infinite Array]
Justification of Correctness
1) We can use exponential search and binary search because the array is sorted.
2) Doubling the index at each step finds an upper bound efficiently and is guaranteed to terminate: we either reach A[index] >= x or hit an ∞ cell past the real data.
3) Once the search range is bounded, we can binary search within it.
[DPV 2.16 - Infinite Array]
Runtime analysis
1) Exponential search is O(log n). Doubling the index each iteration, we probe only A[2^i] for 0 <= i <= log(2n); the index never exceeds 2n, because even in the worst case (x > A[n]) we overshoot n by at most a factor of 2.
2) Binary search is O(log n)
O(log n) + O(log n) = O(log n)
[DPV 2.23b - Majority Element]
Algorithm
1) Pair up elements of A arbitrarily to get n/2 pairs.
2) If the elements of a pair are NOT equal, discard the pair. If they are equal, keep just one of them. Repeat until 1 or 2 elements are left.
3) If a level has an odd number of elements, take the leftover element and check whether it is the majority by comparing it against all other elements at the current level.
a) If it is, verify it against the original A: return it if it is the global majority, else report no majority
b) If not, discard it
4) Check the remaining element(s) against all elements in the original A to see if either is the majority. Return this element if a majority exists. (A sketch follows below.)
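A minimal Python sketch of the pairing scheme (iterative form; the is_majority helper is my own):
def majority(A):
    """Return the majority element of A if one exists, else None."""
    def is_majority(x, arr):
        return arr.count(x) > len(arr) // 2      # O(len(arr)) scan

    current = list(A)
    while len(current) > 2:
        if len(current) % 2 == 1:                # odd leftover element
            odd = current[-1]
            if is_majority(odd, current):        # majority at this level:
                return odd if is_majority(odd, A) else None   # verify vs. original A
            current = current[:-1]               # safe to discard otherwise
        nxt = []
        for i in range(0, len(current), 2):      # pair up arbitrarily
            if current[i] == current[i + 1]:
                nxt.append(current[i])           # keep one of an equal pair
        current = nxt                            # unequal pairs are discarded
    for cand in current:                         # 1 or 2 candidates remain
        if is_majority(cand, A):
            return cand
    return None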
[DPV 2.23b - Majority Element]
Justification of Correctness
1) If there is a majority x, then at least n/2 + 1 copies of x exist.
2) After every round of elimination, x remains the majority, because x-x pairs survive while mixed pairs are discarded entirely.
3) Every x not paired with another x takes a non-x down with it in a discarded mixed pair, so among the survivors the x's (one per x-x pair) still outnumber all other elements combined.
4) If a level has an odd number of elements, we can safely eliminate the leftover after checking whether it is a majority at that level.
5) We check the remaining candidate(s) against the original array, since a majority at some level need not be the global majority.
[DPV 2.23b - Majority Element]
Runtime analysis
Recurrence relation is T(n) = T(n/2) + O(n)
1) T(n/2) since pairing at least halves the number of elements each round. O(n) since we do the pairwise comparisons at each level of recursion.
2) a < b^d, so we get O(n) using Master Theorem.
3) Majority check is O(2n) at most
4) Results in O(3n) which is O(n)
[Binary Search Modified]
“Design an O(log(n)) algorithm to find the smallest missing natural number in a given sorted array of length n. The given array only has distinct natural numbers. For example, the smallest missing natural number from A = [3, 4, 5] is 1 and from A = [1, 3, 4, 6] is 2.”
Algorithm
Use modified binary search.
1) Select the midpoint index of A (i.e. mid) and check whether A[mid] equals mid.
a) If it does, recurse to the right
b) If not, recurse to the left
2) Base case: when 1 element is left at index i, check whether A[i] equals i.
a) If it does, return i+1
b) If not, the missing value is i, so return it.
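A minimal iterative Python sketch (positions are 1-based per the problem; the list itself is 0-indexed):
def smallest_missing(A):
    """Smallest missing natural number in a sorted list of distinct naturals."""
    lo, hi = 1, len(A)                 # window of candidate positions
    while lo < hi:
        mid = (lo + hi) // 2
        if A[mid - 1] == mid:          # prefix 1..mid is intact
            lo = mid + 1               # recurse right
        else:
            hi = mid                   # gap is at or before position mid
    if lo <= len(A) and A[lo - 1] == lo:
        return lo + 1                  # base case: position matches, gap is next
    return lo                          # base case: position lo itself is the gap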
[Binary Search Modified]
“Design an O(log(n)) algorithm to find the smallest missing natural number in a given sorted array of length n. The given array only has distinct natural numbers. For example, the smallest missing natural number from A = [3, 4, 5] is 1 and from A = [1, 3, 4, 6] is 2.”
Justification of Correctness
1) Input is sorted so we can use binary search.
2) The array holds distinct sorted natural numbers, so the smallest missing number is the smallest i such that A[i] != i (or n+1 if no such i exists).
3) When we check A[mid] = mid: if they match, A[1…mid] must be exactly 1…mid with no gaps, so we search right. If they don't match, then A[mid] > mid, some value in 1…mid is missing, and we search the left half A[1…mid-1].
[Binary Search Modified]
“Design an O(log(n)) algorithm to find the smallest missing natural number in a given sorted array of length n. The given array only has distinct natural numbers. For example, the smallest missing natural number from A = [3, 4, 5] is 1 and from A = [1, 3, 4, 6] is 2.”
Runtime Analysis
The only modifications are the comparison used to pick a side and the final return logic. Both are O(1), so the algorithm remains O(log n).
[HW: Local Maxima]
Algorithm
1) Select middle column index (j) and scan this column to find row index with maximum value (i).
2) At this position, check its left and right neighbors to determine whether it is a local maximum.
a) If left is greater, recurse to the left with columns 1…j-1
b) If right is greater, recurse to the right with columns j+1 … n
3) When the recursion narrows to a single column, find the row containing that column's max and return that position (sketch below).
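A minimal Python sketch of the column-halving search (0-indexed; assumes a non-empty matrix of distinct values; names are my own):
def local_max(M):
    """Return (row, col) of a local maximum of matrix M."""
    n, m = len(M), len(M[0])
    lo, hi = 0, m - 1                                  # window of candidate columns
    while True:
        j = (lo + hi) // 2                             # middle column of the window
        i = max(range(n), key=lambda r: M[r][j])       # row of column j's max: O(n)
        left = M[i][j - 1] if j > 0 else float('-inf')
        right = M[i][j + 1] if j < m - 1 else float('-inf')
        if left > M[i][j]:
            hi = j - 1                                 # recurse into left columns
        elif right > M[i][j]:
            lo = j + 1                                 # recurse into right columns
        else:
            return (i, j)                              # beats all four neighbors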
[HW: Local Maxima]
Justification of Correctness
1) Each column scan yields a single max position, since all values in the matrix are distinct
2) By always recursing toward a strictly greater neighbor, the column maxima we visit strictly increase; because the values are distinct and finite, we must eventually land on a peak or reach the edge of the matrix
[HW: Local Maxima]
Runtime analysis
Recurrence relation is T(n, m) = T(n, m/2) + O(n), where n is the number of rows and m the number of columns. We get m/2 since we halve the number of candidate columns at each recursive step while the number of rows stays the same; finding the max in a column costs O(n) per step. With log(m) halvings of the column window, the total is O(n log m), i.e. O(n log n) for a square matrix.
[DPV 2.17 Fixed Point]
“Given a sorted array of distinct integers A[1, …, n], you want to find out whether there is an index i for which A[i] = i. Give a divide-and-conquer algorithm that runs in time O(log n)”
Algorithm
1) Find middle index of array (mid)
2) Modification to BS comparison: Check if array value at mid is equal to the middle index, ie. A[mid] = mid.
a) If True, return True
b) If False, recurse left A[1…mid]
3) Base case: Single value left, if value equal to index return True, else False.
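A minimal Python sketch of the two-sided search (1-based indices per the problem; the Python list itself is 0-indexed):
def has_fixed_point(A):
    """True iff some 1-based index i has A[i] = i (A: sorted distinct ints)."""
    lo, hi = 1, len(A)
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid - 1] == mid:
            return True
        elif A[mid - 1] > mid:
            hi = mid - 1               # A[i] - i is non-decreasing: go left
        else:
            lo = mid + 1               # go right
    return False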
[DPV 2.17 Fixed Point]
“Given a sorted array of distinct integers A[1, …, n], you want to find out whether there is an index i for which A[i] = i. Give a divide-and-conquer algorithm that runs in time O(log n)”
Correctness
1) The array is sorted and its integers are distinct, so A[i] - i is non-decreasing: A[i+1] >= A[i] + 1 implies A[i+1] - (i+1) >= A[i] - i.
2) If A[mid] > mid, then A[j] > j for every j > mid, so any fixed point must lie to the left; symmetrically, if A[mid] < mid, any fixed point must lie to the right.
3) By checking the midpoint, we halve the search space at each recursion level.
[DPV 2.17 Fixed Point]
“Given a sorted array of distinct integers A[1, …, n], you want to find out whether there is an index i for which A[i] = i. Give a divide-and-conquer algorithm that runs in time O(log n)”
Runtime
O(log n)
- Only modification to binary search is comparison and final return logic. Both are constant time.
[DPV 2.18 - Proving Binary Search]
Q: Why is Ω(log n) the lower bound for comparison-based searching in a sorted array?
1) In a comparison-based search on n elements, each comparison can at best eliminate half the remaining possibilities.
2) Starting with n possibilities and needing to reach 1 unique answer requires at least log₂(n) comparisons, forming a decision tree with height at least log₂(n).
3) No comparison-based algorithm can do better, making Ω(log n) the theoretical lower bound.
Is this modified MergeSort recurrence faster than the original?
T(n) = T(3n/4) + T(n/4) + O(n)
No: the recurrence T(n) = T(3n/4) + T(n/4) + O(n) is still O(n log n), just with worse constants
- The splits add up to n (3n/4 + n/4 = n), so every level of the recursion tree does O(n) work
- But the uneven split makes the tree lopsided: the longest path follows the 3n/4 branch, giving log_{4/3}(n) levels
- Total: O(n log_{4/3} n) = O(n log n), since log_{4/3}(n) is a constant multiple of log_2(n)
- So it matches standard MergeSort asymptotically, with a worse constant factor
[Formatting]
Valid modification to Binary Search (3)
- Finding the leftmost matching value.
- Finding the rightmost matching value.
- Finding the insertion point for a missing matching value.
[Formatting]
Valid modification to MergeSort (1)
- Returning the list in descending order.
[Formatting]
Valid modification to FastSelect (1)
- Return the kth largest element.
[MergeSort]
Why does MergeSort eliminate redundant work? (2)
- Never re-comparing elements within already sorted subarrays
- Taking advantage of the sorted property during merging to avoid unnecessary comparisons