AM exam Flashcards

1
Q

What is an algorithm

A

It is a well-defined procedure that, given an input, produces an output.

2
Q

Knuth's rules for an algorithm

A

Finiteness: it needs to terminate after a finite number of steps
Definiteness: each step needs to be clearly defined
Input: it has 0 or more inputs
Output: it has one or more outputs
Effectiveness: it should be composed of basic operations that achieve the intended result (a person should be able to reproduce it)

3
Q

Computational Problem

A

It is the mathematical relation between a set of possible instances and a set of possible solutions, such that for each instance i there is a possible solution s.

4
Q

Algorithm analysis

A

The study of the performance of algorithms: given two algorithms that do the same operation, which one is faster? Which one takes the least number of operations to get to the same output? We care about efficiency, which is typically expressed in relation to the input size.

5
Q

Computational Problem - Sorting

A

The sorting computational problem is defined as: given an unsorted list of values (our instance), output the list in order. There are multiple algorithms to do so:

  • Insertion sort
  • Merge sort
  • Selection sort
6
Q

Insertion sort

A

Insertion sort is one of the simplest algorithms to sort an unordered list; it runs with complexity O(n^2). The basic concept is:
Start at index 2 (or 1 in Python) and check whether the element is lower than the previous one; if it is, swap them and keep moving backward, checking the selected element against each earlier one.
Stop as soon as the element is not lower.

example
1,4,3,2
1,4,3,2
1,3,4,2
1,2,3,4

Best case scenario:
The input is already ordered, so it only needs to check that the elements are in order
T(n) = bn + c (b and c are constants depending on the operations that need to be performed)

Worst case scenario:
The input is in reverse order
T(n) = an^2 + bn + c, i.e. O(n^2)
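
A minimal Python sketch of the procedure described above (the function name is illustrative):

```python
def insertion_sort(a):
    """Sort the list a in place: O(n^2) worst case, O(n) if already sorted."""
    for i in range(1, len(a)):
        key = a[i]                    # element to insert into the sorted prefix a[:i]
        j = i - 1
        while j >= 0 and a[j] > key:  # move backward while the previous element is bigger
            a[j + 1] = a[j]           # shift the bigger element one slot to the right
            j -= 1
        a[j + 1] = key
    return a

print(insertion_sort([1, 4, 3, 2]))  # [1, 2, 3, 4]
```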

7
Q

Loop invariant

A

A loop invariant is a condition that the algorithm needs to maintain; it is used to prove that the algorithm is correct:

  • Initialization: it is true prior to the first iteration
  • Maintenance: it stays true at each iteration (for instance, in insertion sort the sublist to the left of the current element is always sorted)
  • Termination: it is true at termination, which helps show the result is correct
8
Q

Selection Sort

A

This is also a sorting algorithm, but it takes a slightly different path.
It starts at index 0 or 1 (depending on the programming language) and searches the list for the lowest value, moving it to the beginning. So at every step it scans the remaining unsorted part for the smallest element, and as soon as the smallest is found it exchanges the two values.

Example
1,4,3,2
1,2,3,4 Done

Worst case

a*n^2 + bn + c (quadratic)

Best case
Since we always have to scan the entire remaining list to find the minimum, for every position we loop over the rest of the list.

a*n^2 + bn + c (quadratic), i.e. O(n^2)
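
A minimal Python sketch (illustrative names):

```python
def selection_sort(a):
    """Sort the list a in place: Theta(n^2) in both best and worst case."""
    for i in range(len(a) - 1):
        min_idx = i
        for j in range(i + 1, len(a)):       # scan the unsorted suffix for the minimum
            if a[j] < a[min_idx]:
                min_idx = j
        a[i], a[min_idx] = a[min_idx], a[i]  # exchange it with the current position
    return a

print(selection_sort([1, 4, 3, 2]))  # [1, 2, 3, 4]
```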

9
Q

Complexity measures

A

Complexity is usually measured as asymptotic complexity, meaning that we care about how the algorithm's cost grows with the input size n.

Big Oh = It defines an upper bound on the algorithm: beyond a certain input size, the running time cannot grow faster than this.

f(n) = O(n)

It is used for worst-case analysis.

Big Theta = It defines both an upper and a lower bound beyond a specific input size (a tight bound): you can find two constants that, multiplied by the same function, sandwich the running time.

Big Omega = It defines a lower bound beyond a specific input size (best-case analysis). It is not really useful on its own.
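
For reference, the standard formal definitions (not spelled out on the card) are:

```latex
O(g(n))      = \{ f(n) : \exists\, c, n_0 > 0 \text{ s.t. } 0 \le f(n) \le c\,g(n) \;\forall n \ge n_0 \}
\Omega(g(n)) = \{ f(n) : \exists\, c, n_0 > 0 \text{ s.t. } 0 \le c\,g(n) \le f(n) \;\forall n \ge n_0 \}
\Theta(g(n)) = \{ f(n) : \exists\, c_1, c_2, n_0 > 0 \text{ s.t. } 0 \le c_1 g(n) \le f(n) \le c_2 g(n) \;\forall n \ge n_0 \}
```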

10
Q

Divide and conquer paradigm

A

Divide the big task into smaller tasks that are easier to solve. Many algorithms follow this paradigm:

  • Merge sort
  • Matrix multiplication
  • Many graph algorithms

Three steps:

  • Divide
  • Conquer
  • Combine (most difficult part)
11
Q

Recursive algorithm

A

A recursive algorithm is an algorithm that calls itself; it is the standard way to implement divide-and-conquer algorithms.

12
Q

Merge Sort algorithm

A

As always, it solves the sorting problem:
It starts by dividing the list in two; if a sublist has more than 1 element, it keeps dividing until each sublist holds a single value.

When it reaches that point we have one value per list (which is trivially ordered).

In the combine step we merge the two sublists into one ordered list: at every index we add the lower of the two front elements of the sublists. We can exploit the fact that the two lists are already ordered, so we know that the leftmost element of each is its smallest remaining one.

example

list 1 = 1,4
list 2 = 3,4

step 1 = compare 1 and 3, select 1
step 2 = compare 4 and 3, select 3
step 3 = compare 4 and 4, select the first 4

1,3,4,4

The combine step has O(n) complexity because it is O(r - p + 1): it scales linearly with the size of the subarray, where p and r are the indices that delimit the subarray.

This means that to measure the running time you need to solve

T(n) =

  • O(1) in case n = 1
  • 2*T(n/2) + BigTheta(n) otherwise

meaning that at every step you divide the problem into two subproblems of half the size and then combine them, which takes the BigTheta(n) complexity.

Worst case: BigTheta(n*lg(n))
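
A minimal Python sketch of the whole procedure (illustrative names):

```python
def merge_sort(a):
    """Return a sorted copy of a: Theta(n lg n) in all cases."""
    if len(a) <= 1:                  # base case: one element is already sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])       # divide + conquer
    right = merge_sort(a[mid:])
    merged, i, j = [], 0, 0          # combine: take the smaller front element each time
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])          # one of the two lists may have leftovers
    merged.extend(right[j:])
    return merged

print(merge_sort([2, 1, 4, 3, 4]))  # [1, 2, 3, 4, 4]
```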

13
Q

How to solve recurrences

A

We have:
Recursion trees = You draw the tree of how the problem is split, and from the number of leaves and levels you derive the algorithm's running time
Substitution = You guess the form of the solution and use induction to prove it (the hardest)
Master theorem = Apply ready-made rules, when the recurrence fits their form, to solve the problem easily

14
Q

Matrix multiplication

A

Given two matrices, multiply them. It can be solved using the divide and conquer paradigm, so we apply recursion.

Divide step = Let's consider square matrices: we can divide each one into 4 submatrices, and so on, until we end up with matrices of one value.

Combine step = We have a new matrix C with the shape of all the submatrices combined, and we fill it with combinations of the subproducts.

Execution time

T(n) =

  • BigTheta(1) in case of one element
  • 8 * T(n/2) + BigTheta(n^2) otherwise

The combine step is quite expensive, but it is not the part that rules the whole running time (it could even be done in BigTheta(1) with index arithmetic).

Complexity = BigTheta(n^3)
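
A sketch of the plain divide-and-conquer version, assuming square matrices whose size is a power of 2 (helper names are illustrative):

```python
def mat_mult(A, B):
    """8 recursive subproducts + Theta(n^2) combine => Theta(n^3)."""
    n = len(A)
    if n == 1:                                    # base case: 1x1 matrices
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M, r, c):                            # extract an h x h quadrant
        return [row[c:c + h] for row in M[r:r + h]]
    def add(X, Y):                                # elementwise matrix sum
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, h), quad(A, h, 0), quad(A, h, h)
    B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, h), quad(B, h, 0), quad(B, h, h)
    C11 = add(mat_mult(A11, B11), mat_mult(A12, B21))
    C12 = add(mat_mult(A11, B12), mat_mult(A12, B22))
    C21 = add(mat_mult(A21, B11), mat_mult(A22, B21))
    C22 = add(mat_mult(A21, B12), mat_mult(A22, B22))
    # combine: stitch the four quadrants back into one matrix
    return [r1 + r2 for r1, r2 in zip(C11, C12)] + \
           [r1 + r2 for r1, r2 in zip(C21, C22)]

print(mat_mult([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```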

15
Q

Strassen algo

A

A smart implementation of matrix multiplication that uses only 7 recursive subproducts instead of 8, which gives this recursive formula

T(n) =

  • BigTheta(1) for one element
  • 7*T(n/2) + BigTheta(n^2) otherwise

This allows a complexity of

BigTheta(n^(lg(7))) ≈ BigTheta(n^2.81), quite a bit better than before
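
A quick check of that bound with the master theorem (case 1, since n^2 is polynomially smaller than n^(lg 7)):

```latex
T(n) = 7\,T(n/2) + \Theta(n^2), \qquad n^{\log_b a} = n^{\log_2 7} \approx n^{2.81}
f(n) = n^2 = O\!\left(n^{\log_2 7 - \epsilon}\right) \;\Rightarrow\; T(n) = \Theta\!\left(n^{\log_2 7}\right)
```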

16
Q

What is a list

A

It is a sequential collection of values. It has an index. It can be heterogeneous (values of different types).

17
Q

What is a stack

A

A stack is a collection of objects that follows the LIFO principle (last in, first out).

18
Q

What is a Queue

A

A queue is similar to the stack, but it follows the FIFO principle (first in, first out).

19
Q

What is a Deque

A

The deque is a combination of the queue and the stack: you can insert and remove at both the front and the back.
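
In Python all three behaviours can be sketched with collections.deque:

```python
from collections import deque

d = deque()
d.append(1); d.append(2)   # push on the right (stack push / back of queue)
print(d.pop())             # 2 -> LIFO: stack behaviour
d.append(3)
print(d.popleft())         # 1 -> FIFO: queue behaviour
d.appendleft(0)            # deque only: insert at the front as well
print(d)                   # deque([0, 3])
```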

20
Q

What is a linked list

A

A linked list is a list where items are arranged in linear order. It differs from an array (or list), where the order is determined by the indices: here the order is maintained by pointers.

You have different types:

Singly linked list = Every node has a pointer to the next node. The head is the first element and the tail the last one. You are said to be traversing the list when you move from the head to the tail. Important: you need to keep track of where the head is, because no node points to it.

Circular linked list = A linked list where the tail has a pointer back to the head.

Doubly linked list = Similar to the singly linked list, but every node has a pointer both to the next node and to the previous one. It brings more flexibility.
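
A minimal sketch of a singly linked list in Python (class and method names are illustrative):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None              # pointer to the next node (None for the tail)

class SinglyLinkedList:
    def __init__(self):
        self.head = None              # must be tracked: no node points to the head

    def push_front(self, value):
        node = Node(value)            # the new node becomes the head
        node.next = self.head
        self.head = node

    def traverse(self):
        """Walk from head to tail, yielding each value."""
        cur = self.head
        while cur is not None:
            yield cur.value
            cur = cur.next

lst = SinglyLinkedList()
for v in (3, 2, 1):
    lst.push_front(v)
print(list(lst.traverse()))  # [1, 2, 3]
```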

21
Q

What is the difference between arrays and lists

A

Apart from the fact that arrays cannot hold mixed types, the major differences are:

Arrays are faster at accessing an element by index, O(1), while linked lists are O(n).

Arrays have resize issues.

Memory consumption:

Arrays can take 2n space in memory due to dynamic resizing (so you can add and remove), and the memory allocation is contiguous (making it hard to delete elements).
Singly linked lists store n values and n references, while doubly linked ones store 2n references; you do not need to store the values contiguously.

22
Q

Positional list

A

It is a list where you can refer to elements using non-numerical positions as well. It is a list that tries to overcome the O(n) time needed to insert an element at a given position.

23
Q

What is a tree

A

A tree is a simple graphical representation of the connections between vertices through edges.

24
Q

What is a free tree

A

A free tree is a graph that is connected, acyclic and undirected. If the graph is disconnected, it is a forest.

You can represent a tree as a (doubly) linked structure where each node points to its parent and to its children.

25
Q

Different type of free trees

A

Rooted tree:
A tree that has an initial node (the root). This node is the ancestor of all the other nodes. You can measure the height of the tree (how many levels it has), and the final nodes are called leaves.

The number of children of a node is called its degree.

Ordered tree: a rooted tree where the children of each node are ordered.

26
Q

Algorithm to calculate the Depth of the tree

A

Given a certain node, calculate its depth: you move backward searching for the root, and you keep adding one each time you move upward.

Worst case:
You have a chain tree (no branches), so it takes n steps to reach the root, therefore O(n)
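
A sketch in Python, assuming each node stores a pointer to its parent (the root's parent is None):

```python
class Node:
    def __init__(self, parent=None):
        self.parent = parent

def depth(node):
    """Walk the parent pointers up to the root, counting steps: O(n) worst case."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d

root = Node()
child = Node(parent=root)
grandchild = Node(parent=child)
print(depth(grandchild))  # 2
```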

27
Q

Height of the tree

A

To calculate the height of the tree you need to check the depth of every leaf and then select the highest.

In order to do this we iterate over all the nodes and check whether each one is a leaf; if it is, we calculate its depth.

Worst case: you have a chained tree that ends with a leaf.

Since we need to check n nodes and for each leaf calculate the depth, the complexity is O(n * n) = O(n^2)
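
A sketch following the method on the card (assuming nodes store both parent and children pointers):

```python
class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def depth(node):                 # O(n) walk up to the root, as on the previous card
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d

def height(nodes):
    """O(n^2): check all n nodes, computing an O(n) depth for each leaf."""
    return max(depth(n) for n in nodes if not n.children)

root = Node(); a = Node(root); b = Node(a)
print(height([root, a, b]))  # 2
```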

28
Q

How many tree traversal methods are there

A

Tree traversal defines the order in which you visit the tree's nodes (see the sketch below):

  • Pre-order = You visit the parent and then the subtrees rooted at the children. You do not explore all the nodes at one level: you reach the bottom fast and then move backward. Complexity = O(n)
  • Post-order = You visit the children before the parent: you recurse into all the children of a node first and visit the node itself only once they are all finished. Complexity O(n)
  • Breadth-first traversal = Start from the root, then visit all the nodes on the second level (depth 1); as soon as they are finished, move to the 3rd level, etc.
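
A sketch of the three traversals on a simple children-list tree (illustrative structure):

```python
from collections import deque

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def preorder(node):              # parent first, then each child's subtree
    yield node.label
    for c in node.children:
        yield from preorder(c)

def postorder(node):             # children first, parent last
    for c in node.children:
        yield from postorder(c)
    yield node.label

def breadth_first(node):         # level by level, using a queue
    q = deque([node])
    while q:
        n = q.popleft()
        yield n.label
        q.extend(n.children)

t = Node('a', [Node('b', [Node('d')]), Node('c')])
print(list(preorder(t)))        # ['a', 'b', 'd', 'c']
print(list(postorder(t)))       # ['d', 'b', 'c', 'a']
print(list(breadth_first(t)))   # ['a', 'b', 'c', 'd']
```
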
29
Q

Define the Master theorem

A

The master theorem helps you solve recurrences that are written in the form

T(n) = a * T(n/b) + f(n), where a >= 1, b > 1 and f(n) is asymptotically positive

There are three cases

  • If f(n) = O(n^(log_b(a) - e)) for a constant e > 0, then T(n) = BigTheta(n^(log_b(a))). In other words, f(n) needs to be polynomially smaller than n^(log_b(a)), by a factor of n^e
  • If f(n) = BigTheta(n^(log_b(a))), then T(n) = BigTheta(n^(log_b(a)) * lg(n))
  • If f(n) = BigOmega(n^(log_b(a) + e)) for some constant e > 0, and if a*f(n/b) <= c*f(n) for some c < 1 (the regularity condition), then T(n) = BigTheta(f(n)). In other words, f(n) needs to be polynomially larger than n^(log_b(a)), by a factor of n^e
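
For example, applying case 2 to the merge sort recurrence:

```latex
T(n) = 2\,T(n/2) + \Theta(n): \quad a = 2,\; b = 2,\; n^{\log_b a} = n^{\log_2 2} = n
f(n) = \Theta(n) = \Theta\!\left(n^{\log_b a}\right) \;\Rightarrow\; T(n) = \Theta(n \lg n)
```
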
30
Q

What is a Binary Tree

A

A binary tree is a rooted tree where every node has at most two children (a right and a left node). The left child comes before the right one.

The binary tree is full if every node has either 2 children or none.

31
Q

What is a Binary search tree

A

It is a binary tree where each element is associated with a key; the key of the left child node is lower than or equal to the key of the parent, and the key of the right child node is higher than or equal to the parent key.

The in-order traversal of the binary search tree visits the keys in sorted order, starting from the leftmost node and moving right.

The main operation is search, but you can also do other operations such as minimum, maximum, successor (the next key in order) and predecessor.

The maximum and the minimum are really easy to find: they are the rightmost and the leftmost element respectively.

The successor of a node (when it has a right subtree) is the minimum of its right subtree, while the predecessor is the maximum of its left subtree.

32
Q

Search algorithm in a binary search tree

A

The search algorithm of the binary search tree exploits the property of the tree: left elements are lower or equal and right elements are higher or equal than their parent. So we move down the tree and check whether the value we are searching for is higher or lower than the key of the node we are in.

The worst-case complexity is O(h), where h is the height. This happens if the element cannot be found or if it is on the last level.
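
A minimal sketch of the search, assuming nodes with key/left/right attributes:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_search(node, key):
    """Descend the tree comparing keys: O(h), h being the height."""
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node                      # the node holding the key, or None if absent

#      4
#     / \
#    2   6
root = Node(4, Node(2), Node(6))
print(bst_search(root, 6).key)   # 6
print(bst_search(root, 5))       # None
```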

33
Q

Heap type

A

It is an array that can be viewed as a nearly complete binary tree.

It has two attributes:

  • Length of the array
  • Heap size: the number of heap elements stored in the array

Properties (with 1-based indexing):

  • The root of the heap is stored at element one of the array
  • Given i, the parent is stored at floor(i/2)
  • The left child is at 2*i
  • The right child is at 2*i + 1
  • The minimum number of elements in a heap of height h is 2^h
  • The maximum number of elements is 2^(h+1) - 1
  • A heap storing n entries has height Theta(lg(n))

You can have two types of heap:

  • Max heap = the parent is always higher than (or equal to) its children
  • Min heap = the parent is always smaller than (or equal to) its children

Heaps are also used to create priority queues, used to maintain a set of elements where each element has an associated priority key, so that you can extract the element with max priority.

You can do:

  • Insert = O(lg(n)) operation
  • Max = O(1)
  • Extract max = return and remove the max element (you need to restore the max heap), O(lg(n))
34
Q

Max Heapify

A

Max heapify is an algorithm that takes an array and an index and restores the max-heap property at that position: it checks whether the selected element respects the property of the max heap, and if it does not, it moves it down the tree (assuming the subtrees below it are already max heaps).

This is done only for one element and not for all of them.

The running complexity is O(lg n)
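
A sketch in Python with 0-based indexing (so the children of i sit at 2i+1 and 2i+2 instead of the 1-based 2i and 2i+1):

```python
def max_heapify(a, i, heap_size):
    """Float a[i] down until the subtree rooted at i is a max heap: O(lg n).
    Assumes the two subtrees below i already satisfy the max-heap property."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]   # swap with the bigger child
        max_heapify(a, largest, heap_size)    # keep floating down

a = [1, 16, 10, 14, 7, 9]
max_heapify(a, 0, len(a))
print(a)  # [16, 14, 10, 1, 7, 9]
```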

35
Q

Building a max heap

A

This algorithm uses max heapify as its base (see the sketch below).

Steps:
Start from the last element
Select the parent of the last element
Use max heapify on it
Move to the previous parent node (we know that the parent of i is at floor(i/2)) and redo the process

This brings us to the complexity of the algorithm:

We need to call max heapify at most n times, therefore it is O(n*lg(n)).

But it can be proved that build max heap actually runs in O(n), because most nodes sit near the bottom of the tree, where the heapify cost is negligible.
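
A sketch (0-based indexing: the last parent sits at n//2 - 1; max_heapify is repeated from the previous card so the snippet is self-contained):

```python
def max_heapify(a, i, heap_size):             # as on the previous card
    left, right, largest = 2 * i + 1, 2 * i + 2, i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

def build_max_heap(a):
    """Heapify every parent node, from the last parent up to the root: O(n)."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(a)
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```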

36
Q

Heapsort

A

Heapsort is a sorting algorithm that works on heaps; the advantage is that it runs in O(n * lg(n)) and it sorts in place, like insertion and selection sort.

Steps:
You take the array, call build max heap, and you get a max heap
You exchange the root with the last element of the heap and decrease the heap size
You call max heapify again, so you end up with another max heap with one element fewer (the previous root is now in its final position)
You keep iterating

Complexity =

  • Building the first max heap is an O(n) operation
  • After each exchange you need to call max heapify again on the new heap, O(lg(n))

Therefore we have O(n) + O((n-1) * lg(n)) = O(n*lg(n))
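
A complete sketch (0-based indexing, repeating the helper from the previous cards so it runs on its own):

```python
def max_heapify(a, i, heap_size):
    left, right, largest = 2 * i + 1, 2 * i + 2, i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

def heapsort(a):
    """In-place sort: O(n) build + (n-1) max heapify calls of O(lg n) each."""
    for i in range(len(a) // 2 - 1, -1, -1):   # build max heap
        max_heapify(a, i, len(a))
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]            # move the current max to its final slot
        max_heapify(a, 0, end)                 # restore the heap on the shrunk prefix
    return a

print(heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
# [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```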

37
Q

Direct-address table

A

It works well if the number of possible keys is limited. You have a universe of keys, and each key is associated with a particular value.

Say you initialize a direct-address table: you initialize an array of N elements (set to Nil) and associate values to some of them. So the table has the same size as the universe.

Not so space efficient, because each element needs a distinct key and you are reserving slots for null values as well.

But it is really time efficient: given a key you can easily retrieve the element, an O(1) operation.

38
Q

Hash table

A

The hash table is a smart data type that serves a similar purpose to the direct-address table. However, you no longer have a table as big as the universe of keys: you have a function (the hash function) that maps elements from the universe U into the hash table. The hash table does not have the same size as the universe, because the function lets you move from the universe to the table (typically the table size is a prime number).

To show the difference from an array:

Given a key k, we access an array as A[k]; a hash table we access as A[h(k)].

Of course, having a hash table with a dimension lower than the dimension of the universe brings some problems.

Possible problems:
- Collisions: two keys map to the same slot, so you want to save an element where one already exists

39
Q

Hash function

A

The hash function maps elements from the universe of keys to the hash table.

Properties:

  • It should be a simple function (fast to compute)
  • It should avoid collisions
  • The keys should be distributed evenly among the cells

Good hash functions

  • If the keys are real numbers uniformly distributed in [0, 1), taking floor(m*k) as the function is good because it gives a uniform distribution across the m slots
  • Division method = we choose a table size m that is a prime number not too close to a power of 2; you hash based on the remainder of the division, h(k) = k mod m
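
A sketch of the division method (the table size 701 is just an illustrative prime):

```python
M = 701                    # table size: a prime not too close to a power of 2

def h(k):
    """Division-method hash: the slot is the remainder of k divided by M."""
    return k % M

print(h(1025), h(1726))    # 324 324 -> these two keys collide
```
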
40
Q

Handle collision

A

Separate chaining:

You use a linked list per slot to store the keys that have collided there. So if there is a collision you know you just need to search the linked list of the slot where the collision happened.

41
Q

Load factor

A

The load factor is a measure of how busy the hash table is: a = n/m, where n is the number of stored elements and m the number of slots. If you have a high load factor, your hash table is too small. With open addressing it stays between 0 and 1; with separate chaining it can even exceed 1.

Example: you have 3 slots and 1 stored element, so your load factor is 1/3.

Indeed, say all n elements end up chained in one slot: then the worst case to search a key is BigTheta(n).

The average search time depends on how well the hash function distributes the elements.

42
Q

Uniform hashing

A

We assume that every element is equally likely to hash into any hash table slot.

Therefore the expected number of entries per slot is n/m, where n is the number of keys and m the number of slots.
Therefore the average running time for searching in a hash table that uses separate chaining and uniform hashing is O(1 + n/m).

This brings us to the conclusion that if n is proportional to m, then we can get O(1) average complexity in searching.

43
Q

Open addressing

A

Using separate chaining means using linked lists (an extra data structure).

With open addressing you keep everything inside the hash table itself. This implies that you need a bigger table: you should keep the load factor lower than 0.5.

How does it work:

If there is a collision you probe other cells to see if they are empty. The way you try other cells is based on a probing function; for instance, you can try the adjacent slots.

Common collision strategies

  • Linear probing
  • Quadratic probing
  • Double hashing
44
Q

Linear Probing

A

It is a way to address collisions.
The mapping function is (h(k) + c) mod TableSize, where c is a value that starts at 0; if the slot is occupied, c is increased by 1.

With load factor a:

The average number of cells examined when inserting is
(1 + 1/(1-a)^2)/2

The average number of cells examined when searching (if successful) is
(1 + 1/(1-a))/2

The average number of cells examined when searching (if unsuccessful) is
(1 + 1/(1-a)^2)/2
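
A minimal open-addressing sketch with linear probing (illustrative, no deletion or resizing):

```python
M = 7                                  # small table just for illustration
table = [None] * M

def insert(key):
    for c in range(M):                 # probe h(k), h(k)+1, ... (mod M)
        slot = (key + c) % M           # h(k) = key % M, offset by c
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table full")

def search(key):
    for c in range(M):
        slot = (key + c) % M
        if table[slot] is None:        # hit an empty slot: the key is absent
            return None
        if table[slot] == key:
            return slot
    return None

insert(10); insert(17)                 # both hash to slot 3; 17 gets pushed to 4
print(search(17))                      # 4
```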

45
Q

Primary clustering

A

It is a phenomenon that affects open addressing as a way of handling collisions: when the load factor is high, elements tend to land in the same region of the hash table and keep clashing, forming long runs of occupied slots.

46
Q

Quadratic probing

A

Similar to linear probing, but the offset is c^2 instead of c:

(h(k) + c^2) mod TableSize

Properties

Less likely to produce primary clustering.
It is good to keep the load factor at 0.5 or below.

If the load factor is lower than 0.5 (and the table size is prime) you will definitely find an empty slot.

It suffers from secondary clustering, meaning that keys that hash to the same initial slot follow the same probe sequence, so they still collide with each other.

47
Q

Double hashing

A

You use two hash functions: the probe sequence is (h(k) + c*g(k)) mod TableSize, so keys that collide on h(k) still probe with different step sizes.

48
Q

Rehashing

A

If your load factor gets too high, you should make the hash table larger; to make it larger you of course have to rehash all the elements, therefore it is an O(n) operation.

49
Q

Dynamic programming

A

It is a fancy name for a simple idea (that is quite complicated in practice):
if you have a problem that is hard to solve, you solve little pieces of it and save the results, so that when you need to solve the bigger problem you do not need to compute the smaller pieces again.

It is a way to speed up inefficient recursive algorithms, which are otherwise called many times to do the same thing.

You can use hash tables (or arrays) to store intermediate results.

Characteristics

  • There are easy subproblems that can be solved
  • You can reach the optimal solution by optimizing over the subproblems (optimal substructure)
  • Overlapping subproblems = the same subproblems show up several times, so their solutions can be reused

There is a trade-off between time and memory: you store the solution instead of recalculating it.

50
Q

Rod cutting problem

A

You have a rod of a certain length and you need to decide how to cut it in order to get the highest total value out of it (each piece length has a price).

The plain recursive running time of the rod cutting problem is exponential (intractable): 2^n
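
A sketch of the plain recursion, assuming a price list p where p[i] is the price of a piece of length i:

```python
def cut_rod(p, n):
    """Best revenue for a rod of length n: exponential, O(2^n) calls."""
    if n == 0:
        return 0
    # try every length i for the first piece, recurse on the rest
    return max(p[i] + cut_rod(p, n - i) for i in range(1, n + 1))

p = [0, 1, 5, 8, 9]      # p[i] = price of a piece of length i
print(cut_rod(p, 4))     # 10 (two pieces of length 2)
```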

51
Q

Memoization

A

Memoization is the way dynamic programming stores all the sub-solutions. Two ways:

Top-down memoization:
- Solve the problem with the usual recursion and store each sub-solution as soon as we encounter it

Bottom-up method:
- We list all the subproblems, solve them from the smallest up, and then tackle the full problem

The two approaches have the same running time, which for rod cutting is O(n^2): the time to solve all the subproblems, each of which requires scanning the previously computed entries.
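
Both flavours, sketched on the rod cutting problem from the previous card:

```python
def cut_rod_top_down(p, n, memo=None):
    """Top-down: the usual recursion, but each subresult is stored when found."""
    if memo is None:
        memo = {}
    if n == 0:
        return 0
    if n not in memo:
        memo[n] = max(p[i] + cut_rod_top_down(p, n - i, memo)
                      for i in range(1, n + 1))
    return memo[n]

def cut_rod_bottom_up(p, n):
    """Bottom-up: solve every subproblem from the smallest up, O(n^2)."""
    r = [0] * (n + 1)
    for length in range(1, n + 1):
        r[length] = max(p[i] + r[length - i] for i in range(1, length + 1))
    return r[n]

p = [0, 1, 5, 8, 9]
print(cut_rod_top_down(p, 4), cut_rod_bottom_up(p, 4))  # 10 10
```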

52
Q

Algorithms that can be solved using Dynamic programming

A

LCS = Longest common subsequence

Single-source shortest path = calculates the shortest weighted paths from one node of a graph to the others

53
Q

Longest common subsequence problem

A

Given 2 sequences x and y, find the longest subsequence that appears in both x and y. The subsequence does not need to be contiguous.

Brute force:

You can tackle the problem with a brute-force algorithm: you check all the possible combinations, which leads to a complexity of O(n*2^m), where m is the length of x and n is the length of y. The 2^m comes from the fact that x has 2^m subsequences; the n because for every subsequence of x you need to test whether it also appears in y.

More sophisticated approach:

In order to achieve the result we split the problem a bit:

  • We find the length of the LCS
  • We then recover the actual subsequence

First problem:
Find the length of the LCS

Strategy: we run the LCS on prefixes of x and y, starting from the first value. In this way we build the optimal solution by solving small problems first.

This is possible thanks to a particular property.

Defining c(i,j) as the length of the LCS of the first i characters of x and the first j characters of y:

  • 0 if i or j is 0
  • c(i-1,j-1) + 1 if x[i] = y[j]
  • max(c(i,j-1), c(i-1,j)) otherwise

It is clear now that we build the global solution from previous ones. We also notice that a lot of computation is repeated, looking at the 3rd case.

Example:
In the worst case, where there is no match at all, we start from LCS(6,6) and move to LCS(5,6); from there we need to evaluate both LCS(4,6) and LCS(5,5). However, if we instead go to LCS(6,5), we then have to evaluate LCS(5,5) and LCS(6,4). As you can see, LCS(5,5) is evaluated twice.

The cost of this re-evaluation is O(2^(m+n)), because the recursion tree is binary and its depth can reach m+n.

Using dynamic programming allows us to get to a solution in O(n*m) worst case.
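
A bottom-up sketch of the length computation:

```python
def lcs_length(x, y):
    """Fill the c(i,j) table bottom-up: O(n*m) time and space."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:            # match: extend the diagonal
                c[i][j] = c[i - 1][j - 1] + 1
            else:                               # no match: drop one character
                c[i][j] = max(c[i][j - 1], c[i - 1][j])
    return c[m][n]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4 (e.g. "BCBA")
```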

54
Q

Definition of Graphs

A

It is a set of vertices connected through edges.

You can have
Undirected graphs = the edges have no direction
Directed graphs = the edges have a direction (from B I can go to A but not vice versa)
Weighted graphs = every edge has an associated weight

We say that a graph is dense when the number of edges is close to the number of vertices squared. We say it is sparse if the number of edges is much lower than the number of vertices squared.

Adjacency relation = the relation between vertices: which one is connected with which one; if they are not connected they are not adjacent. In undirected graphs adjacency is symmetric: if U and V are connected, each is adjacent to the other.

Some terminology

Path = the sequence of nodes you need to pass through in order to get from U to V (its length is the number of edges)
Reachable = there is a path between the two nodes
Simple = all the vertices in the path are distinct (no passing through the same node twice)
Connected = there is a path between every pair of nodes
Strongly connected = in a directed graph, every vertex can be reached from every other vertex

55
Q

How do you represent graphs

A

In order to represent graphs you have 2 ways (see the sketch below):

  • Adjacency list = for each vertex, a linked list of the vertices connected to it
  • Adjacency matrix = a matrix that stores 1 or 0 depending on whether two vertices are connected

Pros and cons

Adj list:

  • It is really space efficient. The storage requirement is O(V+E) for directed graphs and O(V + 2E) -> O(V+E) for undirected ones
  • It is very flexible: it can represent different types of graphs
  • Searching whether two vertices are connected is not efficient: the worst case is O(V), when a vertex is connected to all the others except the one you are looking for

Adj matrix:

  • Listing all the neighbours of a vertex is O(V) (similar to the adj list)
  • Determining whether 2 vertices are connected is O(1) (use the indices)
  • You can store weights instead of 1 and 0
  • It is not space efficient: storing it is always O(V^2) because you are also storing the 0s
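
A small sketch of both representations for the same undirected graph:

```python
# Undirected graph with vertices 0..3 and edges (0,1), (0,2), (1,2), (2,3)
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
V = 4

# Adjacency list: O(V + E) space (each undirected edge is stored twice)
adj_list = {v: [] for v in range(V)}
for u, w in edges:
    adj_list[u].append(w)
    adj_list[w].append(u)
print(adj_list)          # {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

# Adjacency matrix: O(V^2) space, but O(1) edge lookup
adj_matrix = [[0] * V for _ in range(V)]
for u, w in edges:
    adj_matrix[u][w] = adj_matrix[w][u] = 1
print(adj_matrix[1][2])  # 1 -> vertices 1 and 2 are connected
```
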
56
Q

Searching Algorithm in Graphs

A

Problem definition = visit the structure of the graph so as to reach all the vertices.

You have two searching algorithms for graphs.

Breadth-first search:
Input: a source node
Output: BFT and discovery distances

Steps:

  • Start from a given node (the source)
  • Look for the nodes adjacent to the one of interest
  • Add these nodes to a queue
  • Move to the first element of the queue
  • Check all its adjacent elements and add the undiscovered ones to the queue
  • Move to the next element of the queue and keep doing the same

The idea is that you first explore all the nodes that are adjacent to the initial node; as soon as all of them are explored you move to the second level, and so on.

Color coding = white (undiscovered), grey (discovered), black (finished)

As output you have the BFT, the breadth-first tree: it summarizes the predecessor structure of the graph (relative to the chosen start node). You only include the predecessor edges, not all the connections.

Running time =

  • You need to initialize the graph (setting every distance to +infinity, etc.), O(V)
  • You need to explore all the nodes, O(V)
  • And across the nodes you check every connection, O(E)

This brings it to O(2V + E) -> O(V+E)

Depth-first search

Output:

  • A timestamp to identify the discovery time and one to identify the finishing time of each node
  • Predecessor list

Steps:

  • Start from the first node
  • Check all the adjacent nodes and put them on a stack
  • Move to the top element of the stack, add its adjacent nodes to the stack, and so on

The timestamps are useful to understand the steps taken to actually get to a particular node.

Complexity = O(E+V)

There is also a classification of the edges:
Tree edge = an edge used to discover a new node
Back edge = an edge that goes to an ancestor of the node
Forward edge = an edge that goes to a descendant of the node
Cross edge = everything else
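
A sketch of BFS returning the distances and the predecessors (the BFT):

```python
from collections import deque

def bfs(adj, source):
    """Level-by-level exploration with a queue: O(V + E)."""
    dist = {source: 0}
    pred = {source: None}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:        # v is still "white" (undiscovered)
                dist[v] = dist[u] + 1
                pred[v] = u
                q.append(v)
    return dist, pred

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(bfs(adj, 0))  # ({0: 0, 1: 1, 2: 1, 3: 2}, {0: None, 1: 0, 2: 0, 3: 2})
```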

57
Q

Shortest Path problem

A

Problem = calculate the shortest weighted path between two nodes; useful, e.g., to measure the degree of separation.

The weight of a path is the sum of the weights associated with the edges of the path. So you want to minimize the sum of weights.

Before solving this it is useful to note that any subpath of a shortest weighted path is itself a shortest path, and that a shortest path might not exist (if we allow negative-weight cycles).

Dijkstra shortest path solution

It is a greedy approach to the problem (greedy means it always makes the locally optimal choice).

Idea = move to the node with the lowest cumulative weight from the start node, keeping track of that weight (a cumulative sum of the edge weights from the start node). It always chooses the lowest cumulative weighted path, so nodes are settled in order of their shortest-path weight. It is not trivial to prove that it works.

Steps:

  • You start from a node
  • You update the weight of the adjacent nodes as the cumulative sum of the edge weights needed to reach them
  • You then move to the node with the smallest tentative weight and remove it from the queue
  • You then update the predecessor
  • You repeat all the previous steps until the queue is empty (meaning that you have visited all the nodes)

Algorithm complexity:
It depends on the data structure that you use
- O(V^2) if you implement the min-priority queue as a simple array
- O((V+E)lg(V)) if you use a min binary heap, which reduces to O(E*lg(V)) if all vertices are reachable from the source s

If the number of edges is lower than V^2/lg(V), then use the binary min heap.
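
A sketch with a binary min heap via heapq (using lazy deletion instead of a decrease-key operation):

```python
import heapq

def dijkstra(adj, source):
    """adj[u] = list of (v, weight) pairs, non-negative weights: O(E lg V)."""
    dist = {source: 0}
    pq = [(0, source)]                  # (cumulative weight from source, node)
    done = set()
    while pq:
        d, u = heapq.heappop(pq)        # settle the closest unsettled node
        if u in done:
            continue                    # stale queue entry, skip it
        done.add(u)
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w         # found a shorter cumulative path to v
                heapq.heappush(pq, (dist[v], v))
    return dist

adj = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2)], 'b': []}
print(dijkstra(adj, 's'))  # {'s': 0, 'a': 1, 'b': 3}
```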

58
Q

Minimum spanning tree

A

Problem: find the cheapest set of edges that connects all the nodes (a spanning tree of minimum total weight).

Prim's algorithm: at each step you simply add the lightest edge that connects the tree built so far to a new node, without considering previously accumulated weights. Negative weights are allowed.

Complexity =
O(V*log(V) + E*log(V)) = O(E*log(V))

Steps

  • Start from a node
  • Check the adjacency list of the node and update the key (lightest connecting edge) of each adjacent node
  • Update the predecessor list
  • Find the minimum key among the remaining nodes
  • Move to that node and start all over again
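
A sketch with heapq, growing the tree from an arbitrary start node:

```python
import heapq

def prim(adj, start):
    """adj[u] = list of (v, weight) pairs. Returns the MST edges: O(E lg V)."""
    in_tree = {start}
    mst = []
    pq = [(w, start, v) for v, w in adj[start]]   # candidate crossing edges
    heapq.heapify(pq)
    while pq:
        w, u, v = heapq.heappop(pq)               # lightest edge leaving the tree
        if v in in_tree:
            continue
        in_tree.add(v)
        mst.append((u, v, w))                     # u is the predecessor of v
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(pq, (wx, v, x))
    return mst

adj = {'a': [('b', 1), ('c', 3)],
       'b': [('a', 1), ('c', 1)],
       'c': [('a', 3), ('b', 1)]}
print(prim(adj, 'a'))  # [('a', 'b', 1), ('b', 'c', 1)]
```
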
59
Q

Shortest path between all pairs

A

Problem = find the shortest path between all pairs of nodes.

You can have negative weights but no negative cycles (unlike Dijkstra, which requires non-negative weights).

The algorithm breaks the problem into subproblems (dynamic programming).

The complexity is O(V^3).

Steps:

Look at the video
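
The description (all pairs, negative weights but no negative cycles, subproblems, O(V^3)) matches the Floyd-Warshall algorithm; a minimal sketch under that assumption:

```python
INF = float("inf")

def floyd_warshall(w):
    """w is the V x V matrix of edge weights (INF where there is no edge): O(V^3).
    Subproblem: shortest i -> j path using only intermediate nodes 0..k."""
    V = len(w)
    d = [row[:] for row in w]                     # start from the direct edges
    for k in range(V):
        for i in range(V):
            for j in range(V):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]   # going through k is shorter
    return d

w = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
print(floyd_warshall(w))  # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```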

60
Q

Recursion tree: important things to remember

A

Levels = height + 1
Height = log_b(n), where b is the factor by which the problem size shrinks at each split
Number of leaves = n^(log_b(a)), where a is the number of subproblems per split

How to solve it = find the general formula for the cost of a level, sum it over the number of levels, and add a BigTheta of the number of leaves

Useful series (for |x| < 1):
sum over k >= 0 of x^k = 1/(1-x)
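
For example, for the merge sort recurrence the tree has lg(n) + 1 levels, each costing c*n:

```latex
T(n) = 2\,T(n/2) + cn: \quad \text{level } i \text{ holds } 2^i \text{ subproblems of size } n/2^i, \text{ so it costs } 2^i \cdot c\,\tfrac{n}{2^i} = cn
T(n) = cn\,(\lg n + 1) + \Theta\!\big(n^{\log_2 2}\big) = \Theta(n \lg n)
```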