AM exam Flashcards
What is an algorithm
It is a well-defined procedure that, given an input, produces an output
Knuth's rules for an algorithm
Finiteness: It needs to terminate after a finite number of steps
Definiteness: Each step needs to be clearly defined
Input: It has 0 or more inputs
Output: It has one or more outputs
Effectiveness: It should be composed of basic operations that achieve the desired result (a person should be able to reproduce it)
Computational Problem
It is the mathematical relation between a set of possible instances and a set of possible solutions, such that for each instance i there is a possible solution s
Algorithm analysis
Study of the performance of an algorithm: given two algorithms that do the same thing, which one is faster? Which one takes the fewest operations to get to the same output? We care about efficiency, which is typically expressed in relation to the input size
Computational Problem - Sorting
The sorting computational problem is defined as: given an unsorted list of values (our instance), output the list in order. There are multiple algorithms to do so:
- Insertion sort
- Merge sort
- Selection sort
Insertion sort
Insertion sort is one of the simplest algorithms for sorting an unordered list; it runs with complexity O(n^2). The basic idea:
Start at index 2 (or 1 in Python) and check whether the element is smaller than the previous one; if it is, swap them and continue moving backward, checking the element against each earlier one.
Stop when the element is no longer smaller
example: 1,4,3,2 → 1,4,3,2 → 1,3,4,2 → 1,2,3,4
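The procedure above can be written as a minimal Python sketch (function name is my own):

```python
def insertion_sort(a):
    """Sort list a in place by inserting each element into the sorted prefix."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```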
Best case scenario:
The input is already ordered, so it only needs to check that the elements are in order
T(n) = bn + c (b and c are constants depending on the operations that need to be taken)
Worst case scenario:
The input is in reverse order
T(n) = an^2 + bn + c → O(n^2)
Loop invariant
A loop invariant is a condition the algorithm needs to maintain; it is used to check that the algorithm is correct:
- It is true prior to the first iteration
- It remains true at each iteration of the loop (for instance, for insertion sort the sublist on the left needs to stay ordered)
- It is true at termination
Selection Sort
This is also a sorting algorithm, but it takes a slightly different path.
It starts at index 0 or 1 (depending on the programming language) and searches the list for the lowest value, moving it to the beginning. So at every step it checks whether the current element is lower than the second, lower than the third, and so on, keeping track of the lowest element seen. As soon as you have found the smallest, you exchange the two values
Example
1,4,3,2
1,2,3,4 Done
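The same idea as a minimal Python sketch (function name is my own):

```python
def selection_sort(a):
    """Repeatedly select the minimum of the unsorted suffix and swap it forward."""
    for i in range(len(a)):
        min_idx = i
        for j in range(i + 1, len(a)):
            if a[j] < a[min_idx]:
                min_idx = j
        # Exchange only once the smallest element has been found
        a[i], a[min_idx] = a[min_idx], a[i]
    return a
```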
Worst case
a*n^2 + bn + c (quadratic)
Best case
Since we always have to loop over the entire remaining list to find the minimum, for every element we have to scan the rest of the list.
a*n^2 + bn + c (quadratic) O(n^2)
Complexity measures
Complexity is usually measured as asymptotic complexity, meaning we care about the algorithm's behaviour as the input size n grows
Big Oh = It defines an upper bound for the algorithm: beyond a certain input size, the running time cannot grow faster than this
f(x) = O(n)
It is used for worst-case analysis
Big Theta = It defines both an upper and a lower bound after a specific input size (tight bound). You can find two constants such that the function, multiplied by each of them, sandwiches f
Big Omega = It defines the lower bound after a specific input size (best-case analysis). On its own it is rarely useful
Divide and conquer paradigm
Divide the big task into smaller tasks that are easier to solve. Many algorithms follow this paradigm:
- Merge sort
- Matrix multiplication
- Many graph algorithms
Three steps:
- Divide
- Conquer
- Combine (most difficult part)
Recursive algorithm
A recursive algorithm is an algorithm that calls itself; it is the way we implement divide-and-conquer algorithms
Merge Sort algorithm
As always, it tries to solve the sorting problem:
It starts by dividing the list in two; if a list has more than 1 element, keep dividing until each list has a single value.
At that point we have one value per list (which is trivially ordered).
At the combine step we need to merge the two sublists into one ordered list: at every position you add the lower of the two front elements of the sublists. You can exploit the fact that the two lists are already ordered, so you know that the leftmost element of each is the smallest of that list
example
list 1 = 1,4
list 2 = 3, 4
step 1 = compares 1 and 3 and select 1
step 2 = compares 4 and 3 select 3
step 3 = compares 4 and 4 select the first 4
1,3,4,4
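The merge described above, plus the recursive splitting, as a minimal Python sketch (function names are my own):

```python
def merge(left, right):
    """Combine two already-sorted lists; the front of each list is its minimum."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])   # one of the two lists still has leftovers
    out.extend(right[j:])
    return out

def merge_sort(a):
    if len(a) <= 1:        # base case: a single value is already sorted
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))
```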
The combine step has O(n) complexity, i.e. O(r - p + 1): it scales linearly with the size of the subarray, where p and r are the indices delimiting the subarray
This means that if you want to measure the running time you need to solve
T(n) =
- O(1) if n = 1
- 2*T(n/2) + BigTheta(n) otherwise
Meaning that at every step you divide the problem into two subproblems of half the size, and combining them takes BigTheta(n).
Worst case: BigTheta(n*lg(n))
How to solve recurrences
We have:
Recurrence trees = You draw the tree of how the problem is split; from the number of leaves and levels you derive the algorithm's running time
Substitution = You guess the form of the solution and use induction to prove it (the hardest)
Master theorem = Apply ready-made rules, in the cases where they are applicable, to solve the recurrence easily
Matrix multiplication
Given two matrices, multiply them. It is solved using the divide-and-conquer paradigm, so we apply recursion.
Divide step = Let's consider square matrices: we can divide each into 4 submatrices, and so on, until we end up with single-value matrices
Combine step = We have a new matrix C with the shape of all the submatrices combined, and we fill it with a combination of the submatrices
Execution time
T(n) =
- BigTheta(1) in the case of one element
- 8 * T(n/2) + BigTheta(n^2) otherwise
The combining step is quite expensive, but it is not the part that dominates the running time (it could even be done in BigTheta(1))
Complexity = BigTheta(n^3)
Strassen algo
A smart implementation of matrix multiplication that needs only 7 recursive products instead of 8, giving this recursive formula
T(n) =
- BigTheta(1) in the case of one element
- 7*T(n/2) + BigTheta(n^2) otherwise
This yields a complexity of
BigTheta(n^(lg(7))) ≈ BigTheta(n^2.81), quite a bit better than before
What is a list
It is a sequential collection of values. It has an index. It can be heterogeneous (values of different types)
What is a stack
A stack is a collection of objects that follows the LIFO principle (Last In, First Out)
What is a Queue
A queue is similar to the stack, but it follows the FIFO principle (First In, First Out)
What is a Deque
The deque is a combination of the queue and the stack: you can access both the first element and the last one
What is a linked list
A linked list is a list where items are arranged in linear order. It is different from an array (or list), where the order is determined by the indices; here the order is maintained by pointers.
You have different types:
Singly linked list = Every node has a pointer to the next node. The head is the first element and the tail the last one. You say you are traversing the list when you move from the head to the tail. Important: you need to know where the head is, because nothing points to it
Circular linked list = A linked list where the tail has a pointer to the head
Doubly linked list = Similar to the normal linked list, but every node has a pointer both to the next node and to the previous one. It brings more flexibility
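A minimal Python sketch of a doubly linked node and a head-to-tail traversal (class and function names are my own):

```python
class Node:
    """A node of a doubly linked list: a value plus next/prev pointers."""
    def __init__(self, value):
        self.value = value
        self.next = None   # pointer to the following node (None at the tail)
        self.prev = None   # pointer to the previous node (None at the head)

def traverse(head):
    """Walk from head to tail following next pointers, collecting the values."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next
    return values
```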
What is the difference between arrays and list
Apart from the fact that arrays cannot be of mixed types, the major differences are:
Arrays are faster at accessing an element by index, O(1), while linked lists are O(n)
Arrays have resize issues
Memory consumption:
Arrays take up to 2n space in memory due to dynamic resizing (you can add and remove); also, the memory allocation is contiguous (making deleting elements hard)
Singly linked lists store n values and n references, while doubly linked lists store 2n references; you do not need to store the values contiguously
Positional list
It is a list where you can refer to elements using a non-numerical index as well. So it is a list that tries to overcome the O(n) time to insert an element
What is a tree
A tree is a simple graphical representation of the connection between vertices through edges.
What is a free tree
A free tree is a graph that is connected, acyclic and undirected. If the graph is disconnected, it is a forest.
You can represent a tree as a linked list (double) where you point to the parent and to the child
Different type of free trees
Rooted trees:
A tree that has an initial node (the root). This node is the ancestor of all the other nodes. You can measure the height of the tree (how many levels it has), and the final nodes are called leaves.
You define the number of children of a node as its degree
Ordered tree: A rooted tree where the children of each node are ordered
Algorithm to calculate the Depth of the tree
Given a certain node, calculate its depth: you move backward searching for the root, and while you do so you add one every time you move upward
Worst case:
You have a chain tree (no branches), so it takes n steps to reach the root, therefore O(n)
Height of the tree
To calculate the height of the tree you need to compute the depth of every leaf and then select the highest
In order to do this we iterate over all the nodes and check whether each is a leaf; if it is, we calculate its depth
Worst case: You have a chained tree that ends with a leaf
Since we need to check n nodes and for each one possibly calculate the depth, the complexity is O(n * n) = O(n^2)
How many tree traversal methods there are
Tree traversal defines how you visit the tree nodes:
- Pre-order = You visit the parent and then the trees rooted in the children. You do not explore all the nodes at one level; instead you reach the bottom fast and then backtrack. Complexity = O(n)
- Post-order = You visit the children before the parent: you recurse into all the children, and only once there are no children left do you start visiting. Complexity = O(n)
- Breadth-first traversal = Start from the root, then visit all the nodes on the second level (depth); once they are finished, move to the 3rd level, and so on
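Pre-order and post-order can be sketched recursively in Python; here trees are nested dicts with a "value" and a "children" list, a representation I chose for the example:

```python
def preorder(node, visit):
    """Visit the parent first, then recurse into each child's subtree."""
    if node is None:
        return
    visit(node["value"])
    for child in node.get("children", []):
        preorder(child, visit)

def postorder(node, visit):
    """Visit all the children first, then the parent."""
    if node is None:
        return
    for child in node.get("children", []):
        postorder(child, visit)
    visit(node["value"])
```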
Define the Master theorem
The master theorem helps you solve recurrences written in the form
T(n) = a * T(n/b) + f(n), where a >= 1, b > 1 and f(n) is asymptotically positive
There are three cases
- If f(n) = O(n^(log_b(a) - e)) for a constant e > 0, then T(n) = BigTheta(n^(log_b(a))). Here we need to check that f(n) is polynomially smaller than n^(log_b(a)), by a factor of n^e
- If f(n) = BigTheta(n^(log_b(a))), then T(n) = BigTheta(n^(log_b(a)) * lg(n))
- If f(n) = BigOmega(n^(log_b(a) + e)) for some constant e > 0, and if a*f(n/b) <= c*f(n) for some c < 1, then T(n) = BigTheta(f(n)). Here we need to check that f(n) is polynomially larger than n^(log_b(a)), by a factor of n^e
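As a worked example (my own, not from the notes), merge sort's recurrence falls into case 2:

```latex
T(n) = 2\,T(n/2) + \Theta(n), \qquad a = 2,\ b = 2,\ n^{\log_b a} = n^{\log_2 2} = n.
```

Since f(n) = Θ(n) = Θ(n^(log_b a)), case 2 gives T(n) = Θ(n lg n), matching the merge sort result above.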
What is a Binary Tree
The binary tree is a rooted tree where every node has at most two children (a left and a right node). The left child comes before the right one
The binary tree is full if every node has either 2 children or none
What is a Binary search tree
It is a binary tree where each element is associated with a key; the key of the left child is lower than or equal to the key of the parent, and the key of the right child is higher than or equal to the parent's key.
The in-order traversal of the binary search tree can be seen as a scan from left to right, where we start from the leftmost node and move right.
The main operation is search, but you can also do other operations such as minimum, maximum, successor (the next key in sorted order) and predecessor.
The maximum and the minimum are really easy to find: they are the rightmost and the leftmost element respectively.
The successor of a node (with a right subtree) is the minimum of its right subtree, while the predecessor is the maximum of its left subtree
Searching algorithm in a Binary search tree
The search algorithm of the binary search tree exploits the property of the tree: left elements are lower or equal and right elements are higher or equal than their parent. So we move through the tree and check whether the value we are searching for is higher or lower than the node we are in.
The worst-case complexity is O(h), where h is the height. This happens if the element cannot be found or if it is on the last level.
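A minimal Python sketch of the search, with nodes as dicts ("key", "left", "right" — my own representation):

```python
def bst_search(node, key):
    """Walk down the tree: go left if key is smaller, right if larger. O(h)."""
    while node is not None:
        if key == node["key"]:
            return node
        # The BST property tells us which subtree can contain the key
        node = node["left"] if key < node["key"] else node["right"]
    return None  # key not in the tree
```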
Heap type
It is an array that can be viewed as a nearly complete binary tree
It has two attributes:
- Length of the array
- Heap size, the number of elements currently in the heap
Properties:
- The root of the heap is stored at index 1 of the array
- Given index i, the parent is stored at floor(i/2)
- The left child is at 2*i
- The right child is at 2*i + 1
- The minimum number of elements in a heap of height h is 2^h
- The maximum number of elements is 2^(h+1) - 1
- A heap storing n entries has height Theta(lg(n))
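The index arithmetic above (1-based, as in the notes) is just a few integer operations:

```python
# 1-based index arithmetic for a heap stored in an array
def parent(i):
    return i // 2      # floor(i/2)

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1
```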
You can have two type of heap:
- Max heap = The parent is always higher than (or equal to) its children
- Min heap = The parent is always smaller than (or equal to) its children
Heaps are also used to create priority queues, used to maintain a set of elements where each element is associated with a priority key, so you can extract the element with maximum priority.
You can do:
- Insert = O(lg(n)) operation
- Max = O(1)
- Extract max = return and remove the max element (you need to restore the max heap), O(lg(n))
Max Heapify
Max heapify is an algorithm that takes an array whose left and right subtrees are already max heaps and restores the max-heap property at the selected element: it checks whether that element respects the property of the max heap, and if it does not, it moves it down the tree.
This is done only for one element and not for all of them
The running complexity is O(lg n)
Building a max heap
This algorithm uses max heapify as a building block.
Steps:
- Start from the parent of the last element (we know that the parent of node i is at floor(i/2))
- Use max heapify on it
- Move to the previous node and redo the process, up to the root
This gives us the complexity of the algorithm:
We need to call max heapify at most n times, therefore it is O(n*lg(n))
But it can be proved that build-max-heap actually runs in O(n), because most nodes sit near the bottom of the heap where heapify is cheap
Heapsort
Heapsort is a sorting algorithm that works on heaps; the advantage is that it runs in O(n * lg(n)) and it sorts in place, like insertion and selection sort
Steps:
You take an array, you call build max heap and you get a max heap
You exchange the root with the last element of the heap and you decrease the heap size
You call max heapify again, so you end up with another max heap with one element fewer (the previous root)
You keep iterating
Complexity =
- Building the first max heap is an O(n) operation
- After every exchange you need to call max heapify again on the new heap, O(lg(n))
Therefore we have O(n) + O((n-1) * lg(n)) = O(n*lg(n))
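The whole heapsort pipeline as a self-contained Python sketch (0-based indices here, unlike the 1-based notation in the notes; function names are my own):

```python
def max_heapify(a, i, heap_size):
    """Sink a[i] down until the max-heap property holds (0-based indices)."""
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build max heap: O(n)
        max_heapify(a, i, n)
    for end in range(n - 1, 0, -1):       # n-1 extractions, each O(lg n)
        a[0], a[end] = a[end], a[0]       # move the current max to its final slot
        max_heapify(a, 0, end)            # restore the heap on the shrunk prefix
    return a
```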
Direct-address table
It works well if the number of possible keys is limited. You have a universe of keys, and each key is associated with a particular value.
When you initialize a direct-address table, you initialize an array of N elements (with Nil) and you associate values to some of them. So the table has the same size as the universe.
Not so space efficient, because you need a unique slot for each possible key and you are also reserving space for null values.
But it is really time efficient: given a key you can retrieve the element in O(1).
Hash table
A hash table is a smart data type that serves a similar purpose to the direct-address table. However, you no longer index by the universe of keys directly: you have a function (the hash function) that maps elements from the universe U to the hash table. In this case the hash table does not have the same size as the universe, because the function lets you move from the universe to the table (typically the table size is a prime number).
To show the difference from an array:
Given a key k, we access the array as A[k]; for the hash table we access it as A[h(k)]
Of course, having a hash table smaller than the universe brings some problems.
Possible problems:
- Collisions: you want to save an element in a slot where another element already exists
Hash function
Hash function maps elements from the universe of keys to the hash table.
Properties:
- It should be a simple function (fast to compute)
- It should avoid collisions
- The keys should be distributed evenly among the cells
Good hash functions
- If the keys are real numbers uniformly distributed in [0, 1), taking floor(m*k) (with m slots) is good because it distributes the keys uniformly across the slots
- Division method = Choosing a table size m that is a prime number not too close to a power of 2 is good; you hash based on the remainder of the division, h(k) = k mod m
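The division method is a one-liner; 13 here is just an illustrative prime, not a recommendation from the notes:

```python
def hash_division(key, table_size=13):
    """Division method: hash on the remainder; 13 is a prime not near a power of 2."""
    return key % table_size
```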
Handle collision
Separate chaining:
You use a linked list to store keys that have collided, so that if there is a collision you know you just need to search the linked list of the slot where the collision happened
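A minimal chained table in Python; I use Python lists as the chains for brevity, where the notes would use linked lists (class and method names are my own):

```python
class ChainedHashTable:
    """Each slot holds a chain of (key, value) pairs that hashed there."""
    def __init__(self, size=13):
        self.slots = [[] for _ in range(size)]

    def insert(self, key, value):
        chain = self.slots[hash(key) % len(self.slots)]
        for i, (k, _) in enumerate(chain):
            if k == key:                  # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))        # collision: just extend the chain

    def search(self, key):
        chain = self.slots[hash(key) % len(self.slots)]
        for k, v in chain:                # scan only this slot's chain
            if k == key:
                return v
        return None
```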
Loading factor
The load factor is a measure of how full the hash table is: if you have a high load factor, it means your hash table is too small. It is defined as α = n/m, the number of stored elements over the number of slots; with open addressing it stays between 0 and 1, while with separate chaining it can exceed 1.
Example: you have 3 slots and 1 element stored, so your load factor is 1/3.
Indeed, say that all n elements end up chained in one slot: then the worst case is BigTheta(n) to search a key.
The average search time depends on how well the hash function distributes the elements
Uniform hashing
We assume that every element is equally likely to hash into any slot of the table
Therefore the expected number of entries per slot is n/m, where n is the number of keys and m the number of slots
Therefore the average running time for searching in a hash table that uses separate chaining and uniform hashing is O(1 + n/m)
This brings us to the conclusion that if n is proportional to m, then we get O(1) search complexity
Open addressing
Using separate chaining means using linked lists (an extra data structure).
With open addressing you keep using only the hash table itself. This implies that you need a bigger table; you should keep the load factor lower than 0.5.
How does it work:
If there is a collision, you try other cells to see if they are empty. The way you try other cells is based on a probing function; for instance, you can try the adjacent slots.
Common collision strategy
- Linear probing
- Quadratic probing
- Double hashing
Linear Probing
A strategy to address collisions.
The probing function is (h(k) + c) mod table_size, where c is a value that goes from 0 upward: if the slot is occupied, c is increased by 1
With α the load factor:
The average number of cells examined when inserting is
(1 + 1/(1-α)^2)/2
The average number of cells examined when searching (if successful)
(1 + 1/(1-α))/2
The average number of cells examined when searching (if unsuccessful)
(1 + 1/(1-α)^2)/2
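The probing loop itself is tiny; here h(k) = k mod m (division method), a choice of mine for the sketch:

```python
def probe_insert(table, key):
    """Linear probing: try slot (h(k) + c) mod m for c = 0, 1, 2, ... until empty."""
    m = len(table)
    for c in range(m):
        idx = (key + c) % m        # h(k) = k mod m, then shift by the probe count
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("table is full")
```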
Primary clustering
It is a phenomenon that affects open addressing: when the load factor is high, new elements tend to land in the same region of the table, runs of occupied slots grow, and elements keep clashing
Quadratic probing
Similar to linear probing, but the function applied to the constant c is c^2:
(h(k) + c^2) mod table_size
Properties
- Less likely to produce primary clustering
- It is good to keep the load factor around 0.5
- If the load factor is lower than 0.5 you will definitely find an empty slot (with a prime table size)
- It suffers from secondary clustering, meaning that elements that hash to the same slot follow the same probe sequence and still collide
Double hashing
You have two hash functions; the probing function is (h(k) + c*g(k)) mod table_size
Rehashing
If your load factor gets too high, then you should make the hash table larger; to make it larger you of course have to rehash all the elements, so it is an O(n) operation
Dynamic programming
It is a fancy name for a simple idea (that is quite complicated in practice).
If you have a problem that is hard to solve, you solve little pieces of it and you save the results, in such a way that when you need to solve the bigger problem you do not need to compute the smaller pieces again.
It is a way to speed up inefficient recursive algorithms, which otherwise are called many times to do the same thing.
You can use hash tables (or arrays) to store intermediate results
Characteristics
- There are easy subproblems that can be solved
- You can reach the optimal solution by combining optimal solutions of the subproblems
- Overlapping subproblems = the same subproblems show up repeatedly
There is a tradeoff between time and memory: you store the solution instead of recalculating it
Rod cutting problem
You have a rod of a certain length and you need to decide how to cut it in order to get the highest value out of it
The running time of the plain recursive rod-cutting algorithm is exponential (intractable), BigTheta(2^n)
Memoization
Memoization is the way dynamic programming stores all the subsolutions. Two ways:
Top-down memoization:
- Solve the problem as always, and store the subsolutions as soon as we encounter them
Bottom-up memoization:
- We list all the subproblems, solve them from the smallest up, and then tackle the full problem
Both approaches have the same running time, which for rod cutting is O(n^2): the time to solve all the subproblems, each requiring a scan over the previous entries
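Top-down memoization for rod cutting as a Python sketch (function name and price convention are my own):

```python
def rod_cut(prices, n, memo=None):
    """Top-down memoized rod cutting: prices[i] is the price of a rod of length i+1."""
    if memo is None:
        memo = {}
    if n == 0:
        return 0
    if n in memo:
        return memo[n]          # subproblem already solved: reuse it
    best = 0
    for i in range(1, n + 1):   # first piece of length i, then solve the rest
        best = max(best, prices[i - 1] + rod_cut(prices, n - i, memo))
    memo[n] = best
    return best
```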
Algorithms that can be solved using Dynamic programming
LCS = Longest common subsequence
Single-source shortest path = Calculates the shortest weighted path from a source node to every other node of a graph
Longest common subsequence problem
Given 2 sequences x and y, find the longest subsequence that appears in both x and y. The elements of the subsequence do not need to be contiguous
Brute force:
You can tackle the problem by brute force, checking all possible combinations; this leads to a complexity of O(n*2^m), where m is the length of x and n is the length of y. The 2^m comes from the fact that x has 2^m subsequences; the factor n is because for every subsequence of x you need to test whether it appears in y (recursive algorithm)
More sophisticated approach:
In order to achieve the result we simplify the problem a bit
- We find the length of the LCS string
- We return the actual subsequence
First problem:
Find the length of the LCS
Strategy: we run the LCS on prefixes of x and y, starting from the first value. In this way we build the optimal solution by solving small problems.
This is possible thanks to a particular property.
Defining c(i,j) as the length of the LCS of the first i elements of x and the first j elements of y:
- 0 if i or j is 0
- c(i-1,j-1) + 1 if x[i] = y[j]
- max(c(i,j-1), c(i-1,j)) if x[i] ≠ y[j]
It is clear now that we build the global solution from previous ones. We also notice that a lot of computation is repeated, looking at the 3rd case.
Example:
In the worst-case analysis where there is no match, we start from LCS(6,6), then move to LCS(5,6); from there we need to evaluate both LCS(4,6) and LCS(5,5). However, if we go to LCS(6,5) instead, we also have to evaluate LCS(5,5) and LCS(6,4). As you can see, LCS(5,5) is evaluated twice.
The cost of this recomputation is O(2^(m+n)), because the recursion tree is binary with depth m+n
Using dynamic programming allows us to get to a solution in O(n*m) worst case
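The bottom-up table for the recurrence above, as a Python sketch (function name is my own; the table uses 0-based string indexing, so c[i][j] covers the first i and j characters):

```python
def lcs_length(x, y):
    """Bottom-up DP: c[i][j] is the LCS length of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row/column 0 stay 0 (empty prefix)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1   # characters match: extend the LCS
            else:
                c[i][j] = max(c[i][j - 1], c[i - 1][j])
    return c[m][n]
```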
Definition of Graphs
It is a group of vertices connected through edges
You can have
Undirected graphs = The edges have no direction
Directed graphs = The edges have a direction (from B I can go to A but not vice versa)
Weighted graph = Every edge has an associated weight
We say that a graph is dense when the number of edges is close to the number of vertices squared. We say it is sparse if the number of edges is much lower than the number of vertices squared
Adjacency relation = The relation between vertices: which one is connected with which one; if they are not connected they are not adjacent. In undirected graphs adjacency is symmetric: if U and V are connected, each is adjacent to the other
Some terminology
Path = The sequence of nodes you need to pass through in order to get from U to V; its length is the number of edges
Reachable = There is a path between the two nodes
Simple = All the vertices in the path are distinct (no passing through the same node twice)
Connected = There is a path between every pair of nodes
Strongly connected = In a directed graph, every vertex can be reached from every other vertex
How do you represent graphs
In order to represent graphs you have 2 ways
- Adjacency list = A linked list for each vertex saying which vertices are connected to it
- Adjacency matrix = A matrix that indicates with 1 or 0 whether two vertices are connected
Pros and cons
Adj list:
- It is really space efficient. The storage requirement is O(V+E) for directed graphs and O(V + 2E) -> O(V+E) for undirected ones
- It is very flexible: it can represent different types of graphs
- Checking whether two vertices are connected is not efficient: the worst case is O(V), when the vertex is connected to all the others except the one we are looking for
Adj Matrix
- Listing all vertices is O(V) (similar to the adjacency list)
- Determining whether 2 vertices are connected is O(1) (use the indices)
- You can store weights instead of 1s and 0s
- It is not space efficient: storing it is always O(V^2) because you are also storing the 0s
Searching Algorithm in Graphs
Problem definition = Explore the structure of the graph so as to reach all the vertices.
You have two searching algorithm for graphs
Breadth first search:
Input: a random node
Output: BFT and index of discovery
Steps:
- Start from a given node (at random)
- Then you look at the nodes adjacent to the one of interest
- You add these nodes to the queue
- You move to the first element of the queue
- You check all its adjacent elements and add them to the queue
- You move to the next element of the queue and keep doing the same
The idea is that you first explore all the nodes adjacent to the initial node; as soon as all of them are explored, you move to the second level and so on
Color coding = White (undiscovered), Grey (discovered), Black (finished)
As output you have the BFT, the breadth-first tree: it summarizes the whole predecessor structure of the graph (relative to the chosen start node). You only include each node's predecessor, not all the connections
Running time =
- You need to initialize the graph (setting every distance to +infinity, etc.), O(V)
- You need to explore all the nodes, O(V)
- And for each node you have to check all its connections, O(E) in total
This brings it to O(2V + E) -> O(V+E)
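The queue-driven steps above can be sketched in Python with a FIFO deque (adjacency lists as a dict, my own representation):

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first search on an adjacency-list dict; returns discovery order."""
    discovered = {start}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()            # FIFO: finish a level before the next one
        order.append(u)
        for v in adj[u]:
            if v not in discovered:    # white -> grey in the colour coding
                discovered.add(v)
                queue.append(v)
    return order
```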
Depth First Search
Output:
- Index to identify the discovery time, and one to identify the finishing time
- Predecessor list
Steps:
- Start from the first node
- Check all the adjacent nodes and put them into a stack
- Move to the top element of the stack, add its adjacent nodes to the stack, and so on
The indices are useful to understand the steps you take in order to get to a particular node
Complexity = O(E+V)
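The stack-based steps can be sketched iteratively in Python (adjacency lists as a dict, my own representation; this tracks visit order only, not discovery/finish times):

```python
def dfs(adj, start):
    """Iterative depth-first search using an explicit stack; returns visit order."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        u = stack.pop()                # LIFO: go deep before going wide
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        for v in reversed(adj[u]):     # reversed so the first neighbour is popped first
            if v not in visited:
                stack.append(v)
    return order
```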
There is also a definition for the connection
Tree edge = An edge you follow to discover a new node
Back edge = An edge that goes to an ancestor of the node
Forward edge = An edge that goes to a descendant of the node
Cross edge = Everything else
Shortest Path problem
Problem = Calculate the shortest weighted path between two nodes; useful, for example, to measure degrees of separation
The weight of a path is the sum of the weights associated with the edges of the path. So you want to minimize the sum of weights
Before solving this, it is useful to note that subpaths of a shortest weighted path are themselves shortest paths. It also might be that a shortest path does not exist (if we allow negative-weight cycles)
Dijkstra Shortest path solution
It is a greedy approach to the problem (greedy means that it makes the locally optimal choice at each step).
Idea = always move to the unvisited node with the lowest cumulative weight from the start, keeping track of that weight as a cumulative sum along the path. It always chooses the lowest cumulative weighted path, so every node you settle is settled via its shortest path. It requires non-negative edge weights, and proving that it works is not trivial.
Steps:
- You start from a node
- You update the weights of the adjacent nodes as the sum of the edge weights needed to reach them
- You then move to the node with the shortest cumulative weight and remove the previous node from the queue
- You then update the predecessors
- You repeat all the previous steps until the queue is empty (meaning that you have visited all the nodes)
Algorithm complexity:
It depends on the data type that you use
- O(V^2) if you are using a simple array with a linear scan for the minimum
- O((V+E)*lg(V)) if you are using a binary min-heap; it reduces to O(E*lg(V)) if all vertices are reachable from S
If the number of edges is lower than V^2/lg(V), then use the binary min-heap
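The binary min-heap variant can be sketched with Python's heapq (adjacency lists map each node to (neighbour, weight) pairs, my own representation; stale heap entries are skipped instead of decreased):

```python
import heapq

def dijkstra(adj, source):
    """adj maps each node to a list of (neighbour, weight); returns shortest distances."""
    dist = {node: float("inf") for node in adj}
    dist[source] = 0
    pq = [(0, source)]                  # min-heap keyed by cumulative weight
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:                 # stale entry: a shorter path was already found
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:         # relax the edge (u, v)
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```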
Minimum spanning tree
Problem: Find a subset of the edges that connects all nodes with the minimum total weight
Prim's algorithm: at each step you add the lightest edge that connects the tree built so far to a new vertex, without considering previously accumulated weights. Negative weights are allowed
Complexty=
O(Vlog(V) + Elog(V)) = O(E*log(V))
Steps
- Start from a node
- Check the adjacency list of the node and update the weights of the adjacent nodes
- Update the predecessor list
- Find the minimum weight among all the candidate nodes
- Move to that node and start all over again
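The steps above can be sketched with a heap of candidate edges (adjacency lists map each node to (neighbour, weight) pairs, my own representation):

```python
import heapq

def prim(adj, start):
    """adj maps each node to a list of (neighbour, weight); returns the MST edges."""
    in_tree = {start}
    mst = []
    pq = [(w, start, v) for v, w in adj[start]]   # edges leaving the tree
    heapq.heapify(pq)
    while pq and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(pq)     # lightest edge leaving the tree
        if v in in_tree:
            continue                    # both endpoints already in the tree
        in_tree.add(v)
        mst.append((u, v, w))
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(pq, (wx, v, x))
    return mst
```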
Shortest path between all pairs
Problem = Find the shortest path between all pairs of nodes
You can have negative weights (unlike the Dijkstra problem) but not negative cycles
The algorithm breaks the problem into subproblems
The complexity is O(V^3)
Steps:
Look at the video
Recursive tree important things to remember
Levels = height + 1
Height = log_b(n), where b is the factor by which the subproblem size shrinks
Number of leaves = a^(height) = n^(log_b(a)), where a is the number of subproblems per call
How to solve it = Find the general formula for the cost per level, sum it over the number of levels, and add a BigTheta of the number of leaves
Series:
sum over k >= 0 of x^k = 1/(1-x), for |x| < 1