7: Analysis Flashcards
Analysing program complexity
Define asymptotic complexity
The degree to which an algorithm (or its subprocesses) becomes more complex, or the number of required operations grows, as the length of the input increases.
Give the notations for best, worst, and average complexities
Best case = Ω(n), lower bound
Worst case = O(n), upper bound
Average case = Θ(n), tight (exact) bound
When is an algorithm in the exponential class of time complexity?
When the exponent in its running-time expression is a variable (e.g. the length of the input array) rather than a constant.
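For instance, brute-force subset-sum enumeration is exponential because the exponent is the input length n. A minimal Java sketch (the method name is illustrative):

```java
// Counts subsets of an n-element array whose elements sum to a target.
// There are 2^n subsets, so the outer loop runs 2^n times: the exponent
// is the (variable) input length n, putting this in the exponential class.
static int countSubsetsWithSum(int[] items, int target) {
    int n = items.length;
    int count = 0;
    for (long mask = 0; mask < (1L << n); mask++) { // 2^n bitmasks
        int sum = 0;
        for (int i = 0; i < n; i++) {
            if ((mask & (1L << i)) != 0) sum += items[i];
        }
        if (sum == target) count++;
    }
    return count;
}
```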
What is loop hoisting?
Refactoring program logic to move loop-invariant computations out of loops (or to remove loops entirely) to optimise complexity.
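A minimal Java sketch, with illustrative method and parameter names:

```java
// Before hoisting: the loop-invariant product scale * factor is
// recomputed on every iteration.
static int[] scaleAllNaive(int[] values, int scale, int factor) {
    int[] result = new int[values.length];
    for (int i = 0; i < values.length; i++) {
        result[i] = values[i] * (scale * factor);
    }
    return result;
}

// After hoisting: the invariant computation is moved out of the loop,
// so it runs once rather than n times.
static int[] scaleAllHoisted(int[] values, int scale, int factor) {
    int scaled = scale * factor;
    int[] result = new int[values.length];
    for (int i = 0; i < values.length; i++) {
        result[i] = values[i] * scaled;
    }
    return result;
}
```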
What is the lower bound on sorting comparables? Assume the sort involves comparisons of the whole comparable objects being sorted, not of other properties or fields such as sort keys.
Minimum time complexity: Ω(n log n)
What are sort keys?
Fields of the items in a list that dictate the list's order. In a database they are fields and a type of key (e.g. a primary key).
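In Java, a sort key can be expressed as a Comparator over a field; a small sketch (the Person record is illustrative):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortKeyDemo {
    record Person(String name, int age) {}

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>(List.of(
                new Person("Alan", 41), new Person("Ada", 36)));
        // The sort key here is the age field: it alone dictates the order.
        people.sort(Comparator.comparingInt(Person::age));
        System.out.println(people); // Ada (36) now comes before Alan (41)
    }
}
```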
What’s the proof of the lower bound for sorting?
Proof of the minimum time complexity Ω(n log n) for comparison-based sorting:
For a list of n elements, there are n! different possible orderings of the elements.
A comparison has 2 possible results: which of the 2 elements comes first. An algorithm that makes k comparisons in total can therefore distinguish at most 2^k different outcomes.
2^k must be >= n!, since the algorithm has to distinguish between all n! possible orderings.
This holds when k >= log2(n!).
Via Stirling's approximation, log2(n!) is Θ(n log n), so any comparison sort requires Ω(n log n) comparisons in the worst case.
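A sketch of that final step, lower-bounding log2(n!) directly (the same bound Stirling's approximation gives):

```latex
% The top n/2 factors of n! are each at least n/2, so:
k \;\ge\; \log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i
  \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2} \;=\; \Omega(n \log n)
```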
What would P != NP actually mean?
Some NP problems (in particular, the NP-complete ones) would have a superpolynomial lower bound on their complexity. Not every NP problem would, since P ⊆ NP.
What are P problems? What does P stand for?
P problems have proofs showing they can be solved in polynomial time by a deterministic TM. P stands for Polynomial: with no N for nondeterministic, P problems are deterministically (and therefore also nondeterministically) solvable within polynomial (P) time.
What are NP problems? What does NP stand for?
NP problems have a proof showing they can be verified in polynomial time by a deterministic TM; equivalently, they can be solved in polynomial time by a nondeterministic TM. This does not rule out fast deterministic solutions: P ⊆ NP.
Note verification means checking that a proposed solution (a certificate) is correct; see the verification card below.
NP stands for Nondeterministic Polynomial, since NP problems are known to be solvable in polynomial time nondeterministically; whether they are all also solvable in polynomial time deterministically is exactly the P vs NP question.
What are NP-hard problems?
NP-hard problems are all the problems that every NP problem can be reduced to in polynomial time by a deterministic TM.
What are NP-complete problems?
NP-complete problems are problems in both NP and NP-hard: they have a proof they can be verified in polynomial time by a deterministic TM, every NP problem can be reduced to them in polynomial time by a deterministic TM, and no proof exists that they can be solved in polynomial time by a deterministic TM.
What is verification?
Verification is checking that a proposed solution (a certificate) is correct, which can be much faster than finding a solution from scratch. For example, checking that proposed factors multiply back to the original number is quick even when finding the factors is slow. This is distinct from memoisation, i.e. caching previously found results (e.g. factorials of BigIntegers in a HashMap in Java) so you don't need to recalculate each time.
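For instance, a proposed subset-sum solution can be verified in linear time given the certificate (the chosen indices), even though finding one is believed to be hard. A minimal Java sketch (names are illustrative):

```java
// Verifies a proposed subset-sum solution in O(n) time: given the
// certificate (the chosen indices), checking is fast even though
// finding such a subset may take superpolynomial time.
static boolean verifySubsetSum(int[] items, int[] chosenIndices, int target) {
    int sum = 0;
    for (int i : chosenIndices) sum += items[i];
    return sum == target;
}
```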
What do divide and conquer algorithms do?
Recursively divide problems into smaller subproblems, before solving (“conquering”) the original larger problem by combining the results of solving the subproblems.
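Merge sort is the canonical example; a minimal Java sketch:

```java
// Merge sort: divide the array in half, recursively sort each half,
// then conquer by merging the two sorted halves.
static int[] mergeSort(int[] a) {
    if (a.length <= 1) return a;                       // base case
    int mid = a.length / 2;
    int[] left = mergeSort(java.util.Arrays.copyOfRange(a, 0, mid));
    int[] right = mergeSort(java.util.Arrays.copyOfRange(a, mid, a.length));
    int[] merged = new int[a.length];                  // combine step
    int i = 0, j = 0, k = 0;
    while (i < left.length && j < right.length)
        merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    while (i < left.length) merged[k++] = left[i++];
    while (j < right.length) merged[k++] = right[j++];
    return merged;
}
```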
What is a recurrence relation?
Recurrence relations are equations that define each element of a sequence as a function of the preceding ones. They are used to define recursive functions.
What is the mathematical definition of a recurrence relation?
T(x) = time taken for an input of size x.
T(n) is defined in terms of T(m) for one or more m < n, plus a base case (usually T(1)): the time taken for each problem size depends on the time taken for smaller sizes.
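For example, merge sort (halve the input, then merge in linear time) satisfies, for some constant c:

```latex
% Merge sort: two subproblems of half the size, plus a linear merge.
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + cn, \qquad T(1) = c
```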
How can you find the complexity of an algorithm from its recurrence relation?
Expand it and apply known solutions, or repeatedly guess a complexity and then verify it, usually by induction: trial and error.
How does the trial and error method of finding the complexity of an algorithm from a recurrence relation work?
It involves repeatedly guessing and then verifying possible complexities; verification usually via mathematical induction.
T(n) = T(m) + z, where z is another term that depends on the nature of the algorithm and m < n (usually m = n - 1). When guessing that T(n) is of a certain order of complexity, set z to that order divided by n; for example, when trying O(n^2), set z = O(n).
The guess is verified by mathematical induction: assume the bound holds for all m < n, substitute it into the recurrence, and check that the bound then also holds for n. The trial works when the substitution closes and the number and cost of the recursive substeps match the nature of the algorithm.
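A sketch of one trial for the merge sort recurrence above: guess T(n) = O(n log n), i.e. T(n) <= d * n * log2(n) for some constant d, and substitute:

```latex
% Inductive step: assume T(m) \le d\,m\log_2 m for all m < n.
T(n) \le 2\,d\,\frac{n}{2}\log_2\frac{n}{2} + cn
     = d\,n\log_2 n - d\,n + cn
     \le d\,n\log_2 n \quad \text{whenever } d \ge c.
```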
How do you solve the complexity of a problem from its recurrence relation by hand?
Try different n values in T(n) to work out a generalisation (closed form) of T(n) in terms of only the base case (usually T(1)) and n, not in terms of T(m) where m < n as usual.
You can then derive the complexity from this closed form. You might need to use understanding of the problem, e.g. in sorting, the number of comparisons k and the number of resulting decisions 2^k >= n!, substituting in k somewhere to be able to reduce a term to its complexity order only.
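A sketch for the simple recurrence T(n) = T(n-1) + c: expand until the base case appears, then generalise.

```latex
% Expand, spot the pattern, and generalise in terms of T(1) and n only.
T(n) = T(n-1) + c = T(n-2) + 2c = \dots = T(1) + (n-1)\,c = \Theta(n)
```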
What is the Master Theorem for divide-and-conquer recurrences?
It gives a general form for divide-and-conquer recurrences and 3 cases; whichever case holds determines the complexity of T(n).
What is the Master Theorem’s general form?
T(n) = a * T(n / b) + f(n)
a and b are constants such that a >= 1 and b > 1, and f(n) is an asymptotically positive function.
What is the first Master Theorem case?
If f(n) = O(n^(logb(a) - e)) for some constant e > 0, then T(n) = Θ(n^(logb(a))).
What is the second Master Theorem case?
If f(n) = Θ(n^(logb(a)) * log^k(n)) for some constant k >= 0, then T(n) = Θ(n^(logb(a)) * log^(k+1)(n)).
What is the third Master Theorem case?
If f(n) = Ω(n^(logb(a) + e)) for some constant e > 0, and f(n) satisfies the regularity condition, then T(n) = Θ(f(n)).
Note the regularity condition holds when a * f(n / b) <= c * f(n) for some constant c < 1 and all sufficiently large n.
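For example, merge sort's recurrence T(n) = 2T(n/2) + Θ(n) has a = 2, b = 2, so n^(logb(a)) = n; f(n) = Θ(n) matches case 2 with k = 0:

```latex
% Master Theorem, case 2 (k = 0): f(n) = \Theta(n^{\log_2 2} \cdot \log^0 n).
T(n) = \Theta\!\left(n^{\log_2 2} \cdot \log^{1} n\right) = \Theta(n \log n)
```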