Algorithms and Data Structures Flashcards
Name three desirable properties of a good algorithm.
Algorithm should be:
- Correct
- Efficient
- Easy to implement
These goals might not be and often are not simultanously achievable.
How is mergesort algorithm implemented?
- Based on divide-and-conquer approach.
- Breaks array into two halves and sorts both halves
- Once subproblem is 2 element array it starts merging back arrays so that they are sorted during merge. After bubbling up the recursion array is sorted.
- Needs an additional temporary array for merging.
How is quicksort algorithm implemented?
- Divide and conquer approach through partitioning.
- Taking an element and partitioning the array to “lower than” and “greater than”.
- Recursively repeating partitioning on subarrays
- When subarrays have 1 element, sorting is done.
On which algorithm is shell sort based?
- It is based on insertion sort algorithm.
How is shell sort algorithm implemented?
- It uses so cold h-sort approach to introduce partial order in the array.
- Value h is first calculated “while (h < N/3) h = 3*h + 1”
- In every loop value h = h/3 - at the end we have h = 1 and that is regular insertion on partially sorted array.
- For wrongly set h can be slow
Running time properties of mergesort algorithm?
- Mergesort is optimal compare-based sorting algorithm.
- Average, worst and best case run in O(NlogN) complexity
- Needs auxilliary array. Additional space O(N)
- For already sorted is O(N) if there is a check for if array is already sorted.
Running time properties of quicksort algorithm?
- Average O(NlogN)
- Worst O(N2) when array is already sorted. Randomization before sorting is done to prevent.
- Can be improved by cutoff to insertion sort for smaller arrays
- For arrays with lot of duplicates can be significantly improved by partitioning to “smaller than”, “equal”, “larger than”.
Ways to improve default quicksort approach?
- Cutoff to insertion sort for 15 element subarrays
- Picking the partition element. Median of three.
- Partitioning to “lower than”, “equal”, “greather than” for arrays with repeateable keys.
What are optimizations for the mergesort algorithm?
- Cutoff to insertion sort
- Check for already sorted arrays.
What are run-time properties of selection sort algorithm?
- Average, best, worst time: O(N2)
- Approx N2/2 compares and N swaps
- Doesn’t need additional memory
- Running time is not sensitive to input
What are running time properties of insertion sort algorithm?
- Average case is O(N2) with N2 swaps for unsorted array
- Worst case is O(N2) with N2 swaps for random array
- Best case is O(N) with 1 swap for already ordered array.
- It doesn’t need additional memory
- Dependent on properties of input
What are running time properties of shell sort algorithm?
- It is in practice O(N4/3), O(N5/4) … but no proof is given
- In practice used due to its simplicity and acceptable speed for moderately large arrays.
Define:
heuristic
Heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for finding an approximate solution when classic methods fail to find any exact solution. In a way, it can be considered a shortcut.
In general, every type of problem will have its own set of heuritics.
Give one example of heuristics?
An example of heuristics is using Manhattan distance heuristic function in the problem of finding the shortest path in order to decide on the next step among the list of options.
What is the difference between the algorithmic problem and an instance of an algorithmic problem.
Algorithmic problem is specified by describing the complete set of instances it must work for and produce valid solution to all of them.
Sorting is an algorithmic problem. Sorting of an array of integers is an instance of sorting problem.
This is important since recognizing the general problem in an instance is the first step to solving it.
Fundamental difference between algorithm and heuristic?
There is a fundamental difference between algorithms, which always produce correct result, and heuristics, which may usually do a good job but without providing any guarantee.
Give one example where
heuristic yields very unoptimal solution.
If points we are trying connect are lying on a flat line and we start in the middle then it will, instead of goinglinearly, it will jump towards the outside which is very unoptima.
In general, if distances between nearest neighbours become bigger and bigger towards the end of execution, it is most probably unoptimal
Which are three forms of algorithmic notation?
Three forms of algorithmic notation (going from informal to formal):
- English
- Pseudo code
- Implementation in a particular programming language.
Common mistake is to use more formal structure to express not so formal level of communication. E.g. english explanation presented through pseudo code structure.
Which are two parts of algorithmic problem specification?
Two parts of algorithmic problem specification are:
- Set of allowed input instances.
- Required properties of the algorithm output.
What can be done with set of allowable instances in the specification of an algorithmic problem in order to make it easier to find a correct solution?
Narrowing down problems to a more simple allowable instances is a standard way to simplify reasoning about algorithms.
Example would be reasoning about a tree structure instead of full graph or single dimensional problem instead of 2-dimensional one.
What is the best way to prove one algorithm incorrectness?
The best way to prove incorrectness of an algorithm is by finding a counter-example for which algorithm is not correct.
Counter example is the case of input sequence for which correct solution exists but algorithm doesn’t find it.
Which are two important properties of counter-examples?
Two properties of a good counter-example are:
- Verifiability
- Simplicity
Which are good approaches in finding counter-examples?
- Think small.
- Think exhaustively
- Hunt for the weakness
- Go for a tie - when algorithm has to decide on two options.
- Seek extremes
Inductive proof steps are?
- Prove the idea for the basic case n = 1, n = 2
- Assume the idea is valid until the “n - 1” case
- Prove that it is still valid for the “n” case.
Inductive reasoning is important in algorithmic theory because …
Inductive reasoning comes natural with the recursive nature of lot of algorithmic problems.
Two major classes of summations are?
Two major classes of summation formulas are:
- Arithmetic progression
- Geometric progression
Explain:
Arithmetic progression
Arithmetic progression of summation is the case where difference between consecutive terms is constant.
In general for thr p-th degree summation it yields (p+1)-th degree sum.
S(n,p) = SUMi -> n(ip) = Theta(np+1)
Concrete
SUMi -> n(i) = n(n+1)/2 ~ Theta(n2)
Explain:
Geometric progression
Geometric sequence of numbers if such a sequence where every next term is found by multiplying the previous one by a fixed non-zero multiplier called common ratio.
An example:
G(n, a) = SUMi = 0 -> n(ai) = a(an+1 - 1) / (a - 1)
In general
G(n, a) ~ Theta(an+1), a > 1
Explain:
Modeling of an (algorithmic) solution.
Modeling is the process of formulating the solution of a problem in terms of precisely described well-understood and already resolved problems and solutions.
Proper modeling can hugely reduce the need for finding complex solutions by reusing existing solutions.
It is important to understand that modeling often has to happen on the abstract level of discussion about general structures and solutions which are not bound to any domain.
Modeling with permutations.
Permutations - arrangements or ordering of items.
For example: {1,2,3,4} and {2,3,1,4} are two permutations of the same set.
Usually, problem modeled by permutations is seeking for ordering, sequence, tour, arrangement
Modeling with subsets.
Subset - represent selection from a set of items.
For example: {1,2,4} and {3} are two distinc subsets of {1,2,3,4}
Usually, subsets are modeling the problem which seeks for cluster, collection, group, packaging, committee
Modeling with trees.
Trees represent hierarchical relations between items.
For example tree would be used to model family tree.
Problems with solutions modeled as trees usually are seeking for hierarchy, dominance relationship, taxonomy, ancestor/descendant relationship
Modeling with graphs.
Graphs are representing arbitrary relationship between pairs of items.
For example: road network in a city is modeled by graphs.
Usually, graphs are model of a solution when we seek for network, circuit, web, relationship
Modeling with points.
Points are representing location in some geometric space.
For example: list of monuments you want to visit in a city.
Usually they appear in the solution whenever we seek for instance sites, locations, positions
Modeling with polygons.
Polygons are representing some physical areas.
For example: finding neighbouring city borders.
They appear usually when we seek for shapes, regions, configurations, boundaries.
Modeling with strings.
Strings represent sequence of characters or patterns.
For example: Finding all names that start with “Iv”
Usually they appear when we are expected to find text, characters, patters, labels.
Explain recursive nature of standard model objects.
In general, removing an element from the set of elements of the model will produce a smaller object which will represent a smaller problem than the original full one.
Explain:
Knapsack problem
Imagine a tief with a knapsack breaks into a store. How would tief pick the items from the store to fill the knapsack so that it is full. Every item has size and the knapsack has a total capacity.
Given the set of numbers, find the subset that adds up to a certain target.
For example: items are of sizes S= {1, 2, 5, 9, 10} and the knapsack is of target capacity T=22.
Two most important tools for analysis of algorithm efficiency without actually implementing them are?
- RAM model of computation.
- Asymptotic analysis of worst-time complexity.
It is important to understand that target of these models is to analyze algorithm in a machine independent way. They are both approximations of the real machine but excellent for understanding overall properties of algorithms.
Which are properties of RAM (Random Access Machine) model of computation?
- Every simple operation (+, -, *, =, /, if, call) takes one step.
- Loops and subroutines are not simple operations.
- Reading and writing of memory takes one step.
Define:
Worst-case complexity.
The worst-case complexity of an algorithm is a function defined by the maximum number of steps algorithm takes for any input instance of size n.
For example: Sorting of an input array may be very different for different input instance.
Why is worst-case complexity useful metric in algorithm analysis?
- useful as pessimistic view
- easy to obtain in analysis and math of it is usually straightforward.
Define:
Best-case complexity
Best-case complexity of an algorithm is a function defined by the minimum number of steps algorithm takes for input instances of any size n.