CHAPTER 3: Algorithms Flashcards
Weights
consider pairs (bold(G),w) where bold(G) = (V,E) is a connected graph and w: E → N_0.
For each edge e∈E, the quantity w(e) is called the weight of e
Given a set S of edges
The weight of S:
w(S) = sum of e∈S of [w(e)]
TREE
connected graph without cycles
ISOMORPHISM CLASSES OF TREES ON n-VERTICES
counting isomers ie these aren’t isomorphisms to each other (they are classes of different types)
n = 1 • n = 2 •-• n = 3 \_\_ | | •-•-• •-• • (1,2,1) ~(2,1,1)
(does this count as 1-YES)
n = 4 •-•-•-• • >•-• • (1,2,2,1) (1,1,3,1)
n = 5 • •-•-•-•-• • | >•-•-• •-•-• • | •
(1,2,2,2,1) (1,1,3,2,1) (1,1,1,1,4)
8??
LABELLED TREES: KILLING ISOMORPHISMS
n=1
•1
n=1
•1-•2
n=3 •1-•2-•3 •2-•1-•3 •1-•3-•2 3 labelled trees :
n=4
•-•-•-•
has 4!/2=12 different labelled trees
(4! ways to label but some same graph due to symmetries)
• | • / \ • •
only choosing the middle number and permute the other 3
4!/3!=4 labelled trees
16 labelled trees
IN GENERAL LABELLED TREES WITH N VERTICES
(n!)/ |Automorphism group of T|
IN GENERAL LABELLED TREES WITH n VERTICES
(n!)/(#symmetris)
n=5
•-•-•-•-• 120/2=60
• | • | • / \ • • 20/2=60
• | •-•-• | • 5!/4!=3 --- 125 labelled trees
unlabelled vs labelled trees
n: unlabelled:labelled
1: 1:1
2: 1:1
3: 1:3
4: 2:16
5: 3:125
6: 6:1296
7: 11:16807
*vertices are no longer interchangeable once labelled this implies that are n! ways to label an unlabelled tree minus the symmetries
CAYLEYS FORMULA
n^(n-2) labelled trees on n vertices
Pr¨ufer code
a unique sequence associated with a labelled tree
Definition3.4.4
Prüfer code If record the edges of a tree
T as in the Pruning Algorithm, the first n−2 number appear in the top row is the Prüfer code of T
- A labelled trees Pr¨ufer code is obtained: by removing the leaf with the smallest label and set the ith element of the prufer sequence to be the label of this lease neighbour. for a labelled tree this is unique and has length n-2
- the proof of code is in bijection and we can construct the tree from the code
- prufer codes can be used to prove Curley’s formula
- some form will be on the exam
PR¨UFER CODE:
algorithm
1) find the lowest leaf l of T
2) record Edge e connecting it to the rest of the tree
3) delete l & e to get a simpler tree T’
4) repeat process with T’
* no number will appear twice in the bottom row
Output: A 2×n−1 table with entries in {1,…,n} that records the edges of T in a specified order.
Find the leaf v with the lowest label; it will have one edge e, connecting it to some vertex (its “parent”) w
Form a new tree T′ by deleting v and e, and record e in the output table, putting the deleted vertex
v in the bottom row and its parent w above it in the top row.
PR¨UFER CODE:
algorithm EXAMPLE
given 9 Vertex labelled tree, 8 edges
[7]\ [8]---[9] \ | [4]---[5] [6] / | / | [1 ]/ [2]/ [3]
Parent: 5|6|5|2|5|5|8|9|
Leaf: 1 |3|4|6|2|7|5|8|
1*removing Leaf i.e. Edge 1-5, leaves us with lowest leaves 3479- sometimes new leaves appear eg 2 only after 3 and 6(new leaf) removed
PR¨UFER CODE: A BIJECTION
we can construct the tree from the code:
given any sequence of n−2 numbers between 1 and n, we can construct a tree have that sequence as its Prüfer code.
Parent: 5|6|5|2|5|5|8|9|
Leaf: 1 |3|4|6|2|7|5|8|
- any # appearing as a parent isn’t a leaf this implies one is the lowest # which doesn’t appear so it’s Leaf. The first we deleted. Than the next lowest number that isn’t a leaf. But we go deleting so next is 6.
- finally numbers we have not used for edge col with 9 and 8 added
Given pru¨fer code example:
Parent: 4|2|2|8|8|3|7|9|
Leaf: | | | | | | | | |
*by Finding lowest # not in remaining code delete the first column and iterate , last 2 spaces left are last Edge
Parent: 4|2|2|8|8|3|7|9|
Leaf: 1|4|5|2|6|8|3|7|
drawing
tree with 9 vertices, leaves 1,5,6,9
**lowest number not in first row, ignore the previous top and bottom row numbers and write the lowest
CAYLEYS ENRICHMENT
Degree of vertex i=
#times i appears in table
= # times i appears in PRUFER CODE +1
COROLLARY labelled trees with degrees
the # of labelled trees on n vertices where for each i, Vertex i has degree d_i is
(n-2)!/ [pi (d_i-1)!]
*consistent with hand shaking
PRUFER CODE EXAMPLE:
Draw the tree whose Pr¨ufer code is (1, 1, 1, 1, 6, 5).
*The given Pr¨ufer code has six entries, therefore the corresponding
tree will have 6 + 2 = 8 entries.
The first number in the Pr¨ufer code is 1 and the lowest number not included in the Pr¨ufer code is 2, so we connect 1 to 2.
*We then drop the leading 1 in the code and put 2 at the back of the code:
(1, 1, 1, 6, 5, 2).
The first number in the code is still 1 and the lowest number not included is now
3, so we connect 1 to 3.
etc
*last numbers missing are 5 and 8 so we connect them
Parent: 1| 1| 1| 1|6|5|5|
Leaf: 2|3|4|7| 1|6|8|
- 1 connected to 2,7,3,4,6
then 6 connected to 5 which is connected to 8
ie leaves are 2,7,3,4,8
8 vertices
example degrees and trees
*trees on 6 vertices where vertices 1 and 3 have degree 3 everything else is a leaf
- prufer code where
1 appears 3-1 = 2
3 appears 2-_ times
*1331,11331,3311,
4!/(2!2!) ways to arrange 4 #s
n-2 #s : ways to arrange two of each of two different #s
ie permute x2
prufer code and degree notes
- for the table every column is an edge
- so looking at the table (prev example) the degree of 5 is 5, as 5 appears 5 times
- from the code 1 and 5 didn’t appear previously
- two appears twice and has degree 3
SPANNING TREES
*minimal connected graphs and spanning trees are minimal subgraphs containing all vertices that are still connected
Let G be a connected graph. A spanning tree T of G is a subgraph T⊆ G st T is a tree and contains every Vertex of G.
*found by deleting edges and preserving connectedness
example 3 vertices triangle
spanning trees are ABC
st different edges but still connected
3 spanning trees: A-B-C C-A-B and A-C-B
Spanning trees of K_n
Spanning trees of K_n are the same as label trees on n vertices, i.e. any tree on the n vertices
K_5 is pentagon with star and a spanning tree is C_5 for example, others include any other edges st vertices are connected
weighted graph encoded:
[A] [4][B] [5][7][C] [9][8][8][D] [5][5][5][8][E]
[A] [4][B] [5][7][C] [9][8][8][D] [5][5][5][8][E]
drawing: edge from A-B weight 4, edge from A to C weight 5,…
pentagon with star in the middle ABCDE - ie complete
outside weights 4,7,8,8,5
Weighted graphs
cost associated to edge a weighted graph is a graph together with a weight function we assume the weight is bigger than or equal to 0 for all edges
for example we look for the cheapest way to connect all cities that is we look for its minimal spanning tree
checking every spanning tree is too slow
for example K_n has n^{n-2} spanning trees
KRUSKALS ALGORITHM
- greedy algorithm- as we don’t plan ahead
- avoids Cycles
- looks at all edges at the start
- T may be disconnected at intermediate steps
we start with t having no edges:
•look at cheapest remaining Edge e
•if adding e to T creates a loop discard e
•otherwise add e to T
KRUSKALS ALGORITHM: EXAMPLE
[A] [4][B] [5][7][C] [9][8][8][D] [5][5][5][8][E]
•cheapest edges: AB costs 4 CE costs 5 BE costs 5 {WE costs 5 but loops, AC costs 5 but loops, BC costs 7 but loops} DE costs 8
TOTAL COST= 4+5+5+8=22
Meanwhile drawing MST:
edges included and ensure its connected
THUS 22 is the total cost of the cheapest spanning tree
PRIMS ALGORTHIM
*start at one Vertex and explore
- Start T=v, a single vertex, iteratively:
- find the cheapest edge e=v-w from v ∈T to w∉T
- add e & w to T
produces a spanning tree but is this minimal?
PRIMS ALGORTHIM: EXAMPLE
[A] [4][B] [5][7][C] [9][8][8][D] [5][5][5][8][E]
STARTING AT D
start at D: •cheapest Edge out costs 8 (pick any one of the cheapest) pick DB at random •cheapest from D or B to New vertex: BA costing 4 • cheapest from (A,B,D) to new: (AC, AE or BE) pick AC costing 5 • cheapest from (A,B,C,D) to E: (AE, BE, CE) all cost 5
so 8+5+5+4=22 same minimal value but DIFFERENT MST than kruskals
Kruskal’s notes:
*Kruskal’s algorithm always finds a cheapest spanning tree:
Suppose we have a tree T from kruskal’s. let T_m be a minimum spanning tree. Need to show w(T)=w(T_m) over edges of w(e_i) ≤w(e_i)≤…≤…
(i.e. transform T to T_m edge by Edge, by exchange principal ,and show each step is a minimal spanning tree)
Let T be a spanning tree of G and e=xy an edge not in T. Then: unique path P from x to y using only edges of T. if f is any edge in P then T’=T\f∪e is a spanning tree
ie can exchange edges in P for e.
*always add cheapest edge in T but not in T_m
in T∪T_m will be loops: BDE
…..
*always add cheapest edge in T but not in T_m
in T∪T_m will be loops: BDE
continued
notes on proof:
??
drawn diagrams deleting edges, adding to get loops either adding edges or making loops
Exchange principle:
Let T be a spanning tree of G, and e = xy an edge not in T.
Then:
Unique path P from x to y using only edges of T I If f any edge in P, then T0 = T \f ∪e a spanning tree i.e., can exchange edges in P for e.
Basic idea of proofs:
- Let T be spanning tree produced by algorithm I Let Tm be a minimal spanning tree
- Transform Tm to T edge by edge using exchange principle
- Show each step is a minimal spanning tree
Key: always add cheapest edge in T but not Tm.
cheapest spanning tree
2 greedy algorithms:
Kruskal: add cheapest edge that doesn’t make a loop
Prims: start at v, add cheapest edge to the new vertex
*not immediately clear they find MST but proof wont be on the exam
paths
- TSP- visiting every vertex and returning
- shortest paths
- longest paths
Directed graph
A directed graph is one where each edge has a chosen starting and ending point, usually indicated with arrows.
*walks on directed graphs can only follow that direction
Finding all minimal spanning trees
All edges have distinct weights:
All edges have the same weight:
A few edges have repeated weights:
All edges have distinct weights:
- Never have to make an arbitrary choice
- Unique minimal weight spanning tree
All edges have the same weight:
- Any spanning tree is minimal
- *Probably too many to write down
A few edges have repeated weights:
- Only a few places we have to make a choice
- Can find them with case by case analysis breaking ties
Dijkstra’s algorithm
Dijkstra’s algorithm for shortest path
cant have negative edge weights
Input: A weighted (possibly directed) graph G and starting vertex v ∈ G
output:
For every vertex w ∈ G a list of all shortest paths from v to w
Initialize:
From starting vertex v list every edge out of v as a potential shortest path to corresponding vertex w
Iterate:
- *Choose w with cheapest potential shortest path and make these paths permanent
- Update list of potential paths by adding edges out of w to the shortest paths to w and checking if they’re cheaper than known paths
Example graph from the 2008 Exam
Example graph from the 2008 Exam
Directed graph edges SB 7 SA 11 BA 3 AD 6 BD 10 DC 4 AC 10 CF 5 DF 9 CE 10 FE 3 ET 7 FT 11
marking route on graph S to T
writing:
possible paths
setting cheapest possible path as permanent
Step 1: possible paths from S SB 7**** chosen SA 11 Step 2: possible out of B,chosen SA 11 but SBA 10**** SBD 17
Step 3:possible out of A
SBAD 16**
SBAC 20 SBADC 20
SBADF 25
Step 4: out of C
SBADF *
SBACF *
SBADCF all cost 25***
S..CE= 20+10 = 30
Step 5: New record for S-E
S_{3 ways} FE 28**
S_{3 ways} FT 36
Step 6:
S..ET cost 28+7=35 less than 36
cheapest path: SB 7 SBA 10 SBAD 16 SBAC or SBADC 20
SF = 25 in 3 ways S_(3 ways) FE 28 S_ .. ET 35 …. SBACFET SBADFET SBADCFET
Add on:
Which edges if made a little longer would make the distance from S to T longer?
in example
General answer: Edges In all shortest paths:
these are:
always
SB, BA,FE,ET
sometimes: AC AD CE CF DF
Which edges if made a little shorter would make the distance from S to T shorter?
Any edge in any shortest path
eg both SB BA AC AD CE DF CF FE ET
NOTES ON DIJKSTRAS ALGORITHM:
Suppose we’re at the step were Dijkstra’s decides the cheapest path to w.
Then:
A cheaper path to w would have to go through a vertex u we haven’t found the cheapest path to yet.
But! Even getting to u costs more than our cheapest path to w.
An observation: Dijkstra’s algorithm depends on the edge weights being non-negative!
- optimal- if all we know is a weighted graph cant improve col
- not good In practical as every edge considered
- driving distances
- A* algorithm avoids searching London: Supposes we have an easy to calculate “heuristic distance” h(v,w), that is a lower bound for the actual distance d(v,w).
Finding the longest path
Scheduling a large problem with many parts For example, building a house.
- Some can be done at same time: (finishing interior rooms)
- Some need to be done in order: (foundation before walls)
- How early could whole project be finished?
Solution: longest path
* Encode tasks as edges in directed weighted graph
- Edge e follows edge f if task f requires task e
- Length of longest path is the shortest time to complete project
Building the directed graph from a list of tasks and dependencies can require a few tricks and won’t be tested
Longest path might not exist
A necessary assumption:
If the graph has a directed cycle, we could get an infinitely long path by repeating graph over and over again.
Graphs in scheduling applications are acyclic:
Otherwise we’d have a cycle of tasks that all depend on each other and we could never start the project!
Definition acyclic
A directed graph G is acyclic if it has no directed cycles.
The longest path algorithm
Initializing:
- Topologically sort the vertices of G
- List every edge of starting vertex as potential longest path
Iterate:
* Make the potential longest path to the first vertex w on the list permanent
- Update the list of potential longest paths adding edges out of w to longest paths to w and seeing if they create new longest paths
Ordering vertices / “topological sort”
If G is acyclic it’s easy to order the vertices of G so that if there’s an edge from v to w, then w comes after v
Note: the topological sort doesn’t have to be unique
eg UVWX UWVX directed edges UV UW VX WX
*if a graph has a directed cycle we cant order it
every tree has a topological ordering, as it has no cycles ie start from top and leaves at end
HOW TO:
We choose an unvisited vertex. Explore out until we cant or have already written down in ordering . Then we add that last vertex to the end of our ordering. Go back to prev vertex and explore out if made a choice, adding. Then we can select another unvisited node. Topological orderings aren’t unique.
Example graph from the 2008 Exam
Find all longest paths from S to T
Directed graph edges SB 7 SA 11 BA 3 AD 6 BD 10 DC 4 AC 10 CF 5 DF 9 CE 10 FE 3 ET 7 FT 11
Topological sort this graph:
Topological sort this graph:
SBADCFET
List every edge of starting vertex as potential longest path:
potential longest paths out of S:
SB 7*
SA 11
Iterate:
* Make the potential longest path to the first vertex w on the list (most expensive to next vertex) permanent
- Update the list of potential longest paths adding edges out of w to longest paths to w and seeing if they create new longest paths
potential longest path out of B:
SBA costs 10
but SA costs 11*
SBD 17
potential longest path out of A:
SAC 21
SAD 17*
and SBD 17*
potential longest path from D:
SADC 21*
SBDC 21*
(and SAC 21*)
SADF 26
SBDF 26
potential longest path from C: SADCE 31 SBDCE 31 SBDCF 26* SADCF 26* or SADF* SBDF*
potential longest path from F: S...FT 37 S...FE 29 But SBDCE 31* SADCE 31* potential longest path from E: S...ET 38* (as S..FT 37 less than 38)
So our longest paths are:
SBDCET
SADCET
SACET
DIAGRAM: of all possible longest edges in paths and including in order
arrows \_\_\_ \_\_\_\_ \_\_\_ / ↘/ ↘/ ↘ S→B→A→D→C→F→E→T \ ↗ \ ↗\ ↗ ------- ------ -----
longest path B-7 A-11 D-17 [from B=10+7, A=11+6] C-21 [from A=11+10, D=17+4] F-26 [from C =21+3, D=17+4] E-31 [from C=21+10, F= 26+3] T-38 [from E=31+7, F=26+11]
marking on diagram edges SB SA DF CE ET bold
AD—
AC—
BD—
all longest path end CET but 3 ways get to C
Add on:
Which edges if made a little longer would make the longest path from S to T longer?
Any edge in the largest paths
increase makes longest path longer
SA SB AC AD DC BD CE ET
Which edges if made a little shorter would make the longest path from S to T shorter?
THE EDGES IN ALL LONGEST PATHS
ET
CE
ie we are considering the longest path still so decrease all our options
The Travelling Salesperson Problem
A Travelling Salesperson wants to start from home, visit every city on their list, then return home for as cheaply as possible.
The Travelling Salesperson Problem, or TSP, is the following. Given a weighted graph G (usually G is the complete graph), what is the Hamiltonian cycle (i.e. closed walk visit every vertex) of cheapest weight
*we don’t tend to solve: only find bounds
TSP and Hamiltonian cycles
Let G be a graph n and vertex set V and edge set E. Suppose we want to determine whether G has a Hamiltonian cycle. Weight a complete graph on the vertex set V as follows:
make every edge in G have weight 1, and every edge not in G have slightly higher weight:
w(uv) =
{1 uv ∈ E
{1 + ε uv / ∈ E
Then, G has a Hamiltonian cycle if and only if there is a solution to the TSP with weight n.
*as we only used the edges in the original graph
upper bound for TSP
find the weight of any Hamiltonian cycle
*One greedy algorithm: nearest neighbour Keeping going to the closest city you haven’t been to
nearest neighbour
only nearest haven’t been to. ie only the nearest to the one we just visited, no going back to consider other choices.
lower bound for TSP
can’t use a cycle
Any Hamiltonian cycle has length greater than solution to TSP
(ie connect every vertex vs return to original vertex)
To find lower bound to TSP * Pick a vertex v ∈ G
- Find a minimum weight spanning tree T on G \v
ie remove this vertex. - Find the two cheapest edges e1 and e2 out of v, the vertex we removed
- w(T) + w(e1) + w(e2) is a lower bound:
ie its cheaper than any Hamiltonian cycle
Prove that w(T) + w(e) + w(f ) ≤ TSP
Suppose that C was the Hamiltonian cycle of minimum weight.
We split the C into two pieces: the two edges f1 and f2 adjacent to v, and the rest, which we’ll call R.
Edges adjacent to v: e1 and e2 were the two cheapest edges next to v, so: w(f1) + w(f2) ≥ w(e1) + w(e2)
The rest:
C a cycle, so R = C \v is a path
Paths are special cases of trees
R visits every vertex of G \v
Hence R a spanning tree and w(R) ≥ w(T)
(If C is a Hamiltonian cycle delete A and the 2 edges in C that visit A.
ill have a path through every vertex except A. C{A_{g,h}} is a spanning tree of G\A and so w(g,h) is bigger than w(t)
For the lower bound to be attainable
ie be a Hamiltonian cycle, we need
1) T has to be in path
2) the 2 cheapest edges out of A have to go to the two ends of the path
)
Example from the 2006 Exam The following table encodes distances between towns in km: [ A] [23][ B] [10 ][ 21][ C] [30][ 39][ 21][ D ] [57][ 45][ 48][ 45][ E] [68][ 63][ 59][ 47][ 24][ F] [75][ 67][ 66][ 54][ 24][ 11][ G]
Find lower and upper bounds to the TSP.
nearest neighbour for upper bound: start at A
AC 10 CB 21 BD 39 DE 45 EF 24 FG 11 GA 75