Phylogenetic tree Flashcards
Phylogenetic tree
All species (all life) descend from a single root. The root is the common ancestor.
Unrooted: the common ancestor is not known.
Given sequences and their genetic distances, we build a tree.
The tree is not unique: there are equivalent trees. Rotating two leaves on the same level gives a tree that is mathematically different but biologically equivalent.
Branches can be weighted or unweighted. When weighted, the branch length represents genetic distance, and adding up the branch lengths along the path between two leaves gives the distance between them.
A tree can be used to investigate the spread of a virus.
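The "add up the branches" idea can be sketched in a few lines of Python. The toy tree, its node names and its branch lengths are invented for illustration, and the parent-pointer representation is only one possible choice.

```python
# Hypothetical toy tree: each node maps to (parent, branch length to parent).
tree = {
    "A": ("X", 2.0),
    "B": ("X", 3.0),
    "X": ("root", 1.0),
    "C": ("root", 4.0),
    "root": (None, 0.0),
}

def path_to_root(tree, node):
    """Return {ancestor: distance from node to that ancestor}, node included."""
    dist = {node: 0.0}
    total = 0.0
    while tree[node][0] is not None:
        parent, length = tree[node]
        total += length
        dist[parent] = total
        node = parent
    return dist

def leaf_distance(tree, a, b):
    """Sum the branch lengths on the path a -> common ancestor -> b."""
    up_a = path_to_root(tree, a)
    up_b = path_to_root(tree, b)
    # With non-negative branch lengths, the shared ancestor giving the
    # shortest total is the most recent common ancestor.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)

print(leaf_distance(tree, "A", "B"))  # 5.0 (2 + 3)
print(leaf_distance(tree, "A", "C"))  # 7.0 (2 + 1 + 4)
```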
Constructing Phylogenetic Trees
- Distance-based methods: UPGMA and neighbor joining
- Parsimony-based method: maximum parsimony. Looks for the simplest tree, the one that needs the fewest changes (parsimony: less is better)
- Character-based methods: maximum likelihood, Bayesian inference (work on features)
Inferring trees
Character: comes from morphology. It is qualitative and can lead to mistakes. Non-numerical: has / has not.
Binary matrix of characters x species showing which species has which character. Distance is calculated from shared characteristics (this does not give the genetic distance).
Distance or similarity: quantitative, compares sequences. Align the sequences and calculate the genetic distance, applying a correction (Jukes-Cantor or Kimura)!
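As an illustration of a character matrix, here is a minimal Python sketch. The species, the characters and the 0/1 values are invented, and counting mismatches is just one simple way to turn shared characteristics into a distance.

```python
# Hypothetical character matrix: rows are species, columns are characters
# (1 = has the character, 0 = does not).
characters = ["backbone", "fur", "feathers", "lays_eggs"]
matrix = {
    "dog":     [1, 1, 0, 0],
    "cat":     [1, 1, 0, 0],
    "sparrow": [1, 0, 1, 1],
    "lizard":  [1, 0, 0, 1],
}

def character_distance(row_a, row_b):
    """Number of characters on which two species differ."""
    return sum(1 for a, b in zip(row_a, row_b) if a != b)

species = list(matrix)
for i, a in enumerate(species):
    for b in species[i + 1:]:
        print(a, b, character_distance(matrix[a], matrix[b]))
```

Here dog and cat come out at distance 0 even though they are different species, which shows why this is not a genetic distance.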
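The two corrections can be sketched as follows, using the standard textbook formulas: Jukes-Cantor d = -3/4 ln(1 - 4p/3) with p the fraction of differing sites, and Kimura two-parameter d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q) with P the fraction of transitions and Q the fraction of transversions. The function names and example sequences are made up, and the sketch assumes already aligned, gap-free, uppercase A/C/G/T sequences.

```python
import math

PURINES = {"A", "G"}  # the other two bases, C and T, are pyrimidines

def site_fractions(seq1, seq2):
    """Return (P, Q): fraction of transitions and of transversions."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
        # is a transition; crossing classes is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n

def jukes_cantor(seq1, seq2):
    P, Q = site_fractions(seq1, seq2)
    p = P + Q  # raw fraction of differing sites
    return -0.75 * math.log(1 - 4 * p / 3)

def kimura_2p(seq1, seq2):
    P, Q = site_fractions(seq1, seq2)
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

s1 = "ACGTACGTAC"
s2 = "ACGTACGTAT"  # one C -> T transition in 10 sites, so p = 0.1
print(round(jukes_cantor(s1, s2), 4), round(kimura_2p(s1, s2), 4))
```

Both corrected distances are slightly larger than the raw 0.1, because the correction accounts for sites that may have changed more than once. For very divergent sequences the logarithm's argument can reach zero or below, in which case `math.log` raises an error.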
Finding Branch Lengths with known structure
Use the 3 point formula
(can be seen in appendix of Neighbor Joining Algorithm)
Use the additive distance property.
Must know the tree structure !
You know the distances between the leaves but not the lengths of the edges to the common ancestors.
Add a node at each intersection.
Write each known distance as a sum of named edges. Then isolate each edge to see which known distances must be added or subtracted to get its length.
Each edge length uses 3 distances, and the result is divided by 2.
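For three leaves A, B, C meeting at one internal node, with edges a, b, c, the known distances are d(A,B) = a + b, d(A,C) = a + c and d(B,C) = b + c. Isolating a gives a = (d(A,B) + d(A,C) - d(B,C)) / 2, and likewise for b and c. A small sketch (the function name and the numbers are invented for illustration):

```python
def three_point(d_ab, d_ac, d_bc):
    """Edge lengths from leaves A, B, C to their shared internal node.

    Assumes additive distances: d_ab = a + b, d_ac = a + c, d_bc = b + c.
    """
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c

a, b, c = three_point(5, 6, 7)
print(a, b, c)  # 2.0 3.0 4.0
# Check: a + b = 5, a + c = 6, b + c = 7
```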
Finding branch length with unknown structure
UPGMA or Neighbor-Joining
UPGMA
Finding branch length with unknown structure
Unweighted pair group method with arithmetic mean. Unweighted: all pairs contribute equally. Pair group: groups are combined two at a time. Arithmetic mean: the distance between groups is the mean of the pairwise distances.
The distance from the root to every leaf is the same: ultrametric.
There can be noise in the distance matrix (the real tree is ultrametric but the measurements are not).
Find the pair with the smallest value in the distance matrix and define a new ancestor between the two (each of the two branches to the ancestor has length distance/2). Compute the distance between this ancestor and every other element by averaging the distances from the two merged elements to it -> new distance matrix.
Then find the closest pair in the new distance matrix and repeat. If a previous pair is joined with a single element, the distance from the new group to any other element is the average of the 3 leaf distances!
The matrices become smaller and smaller as pairs are made.
Distance matrix: elements (or pairs) x elements (or pairs)
Weakness: assumes a molecular clock, i.e. that all species evolve at the same rate, although some evolve faster than others.
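One way to check whether measured distances are ultrametric is to test every triple of leaves: in an ultrametric matrix the two largest of the three pairwise distances are equal. A minimal sketch, where the two small matrices and the `tol` parameter (a hypothetical tolerance for measurement noise) are assumptions for illustration:

```python
from itertools import combinations

def is_ultrametric(names, dist, tol=1e-9):
    """dist maps frozenset({a, b}) to the distance between leaves a and b."""
    for a, b, c in combinations(names, 3):
        d = sorted([dist[frozenset({a, b})],
                    dist[frozenset({a, c})],
                    dist[frozenset({b, c})]])
        # The two largest distances of the triple must match.
        if d[2] - d[1] > tol:
            return False
    return True

clock_like = {frozenset("AB"): 2, frozenset("AC"): 6, frozenset("BC"): 6}
noisy = {frozenset("AB"): 2, frozenset("AC"): 6, frozenset("BC"): 7}

print(is_ultrametric("ABC", clock_like))       # True
print(is_ultrametric("ABC", noisy))            # False
print(is_ultrametric("ABC", noisy, tol=1.5))   # True: within the tolerance
```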
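The UPGMA loop described above can be sketched in Python. This is an illustrative sketch, not a reference implementation: the dictionary-of-pairs matrix, the tuple labels for merged groups and the example distances are all assumptions. Weighting the average by group size is what makes a group of 2 joined with a single element average over 3 leaf distances.

```python
def upgma(names, dist):
    """dist maps frozenset({a, b}) to the distance between leaves a and b.

    Returns the merges in order as (group_1, group_2, ancestor_height),
    where the height is half the distance between the merged groups.
    """
    d = dict(dist)                  # working copy of the distance matrix
    size = {n: 1 for n in names}    # number of leaves in each group
    merges = []
    while len(size) > 1:
        # 1. Closest pair in the current matrix.
        pair = min(d, key=d.get)
        a, b = sorted(pair, key=str)
        height = d[pair] / 2
        new = (a, b)                # label of the new ancestor
        # 2. Distance from the new group to every other group: mean of the
        #    two old distances, weighted by the number of leaves in each.
        for c in size:
            if c not in pair:
                d[frozenset({new, c})] = (
                    size[a] * d[frozenset({a, c})]
                    + size[b] * d[frozenset({b, c})]
                ) / (size[a] + size[b])
        # 3. Drop the rows/columns of a and b: the matrix shrinks.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        size[new] = size.pop(a) + size.pop(b)
        merges.append((a, b, height))
    return merges

names = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 2, frozenset("AC"): 5, frozenset("AD"): 9,
    frozenset("BC"): 7, frozenset("BD"): 11, frozenset("CD"): 13,
}
for merge in upgma(names, dist):
    print(merge)
```

In this example A and B merge first (height 1.0), then C joins them at height 3.0, and the distance from (A, B, C) to D is (9 + 11 + 13) / 3 = 11, so the root sits at height 5.5. The branch from a group to its new ancestor is the ancestor's height minus the group's own height.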
Can use neighbor joining instead !
Neighbor joining
Finding branch length with unknown structure
4 point rule
Guarantees the correct tree if the distances are additive. Can still give a good tree if they are not.
Correct input distance matrix -> correct output tree
Uses the distance matrix: it is symmetric -> one half can be mirrored
See the appendix for the formulas.
For each element in the matrix compute R (check formula)
Then for each pair compute M (check formula: 4 point rule)
Build a matrix with the computed M scores
The pair with the smallest value become neighbors!
Build a new matrix and use the 3 point formula to find the rest (create an ancestor, etc.)
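Whether a matrix is additive can be checked with the standard four-point condition: for any four leaves i, j, k, l, among the three sums d(i,j) + d(k,l), d(i,k) + d(j,l) and d(i,l) + d(j,k), the two largest are equal. This may be written differently from the appendix formula. A minimal sketch with invented distances and a hypothetical `tol` parameter:

```python
from itertools import combinations

def is_additive(names, dist, tol=1e-9):
    """dist maps frozenset({a, b}) to the distance between leaves a and b."""
    def d(x, y):
        return dist[frozenset({x, y})]
    for i, j, k, l in combinations(names, 4):
        sums = sorted([d(i, j) + d(k, l), d(i, k) + d(j, l), d(i, l) + d(j, k)])
        # The two largest sums must match for the distances to fit a tree.
        if sums[2] - sums[1] > tol:
            return False
    return True

# Distances read off a toy tree: leaf edges A=2, B=3, C=4, D=5, with an
# internal edge of length 1 separating (A, B) from (C, D).
tree_like = {
    frozenset("AB"): 5, frozenset("AC"): 7, frozenset("AD"): 8,
    frozenset("BC"): 8, frozenset("BD"): 9, frozenset("CD"): 9,
}
broken = dict(tree_like)
broken[frozenset("CD")] = 20  # no tree can produce this value

print(is_additive("ABCD", tree_like))  # True
print(is_additive("ABCD", broken))     # False
```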
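The steps above can be sketched as follows. The appendix formulas are not reproduced here, so this sketch assumes one common textbook form: R_i = (sum of distances from i to all others) / (n - 2) and M_ij = d_ij - R_i - R_j. The data layout, the tuple labels for new ancestors and the example matrix are also assumptions.

```python
def neighbor_joining(names, dist):
    """dist maps frozenset({a, b}) to the distance between a and b.

    Returns the tree as a list of edges (node, neighbor, branch_length).
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    while len(nodes) > 2:
        n = len(nodes)
        # R: average-like distance of each element to all the others.
        r = {i: sum(d[frozenset({i, k})] for k in nodes if k != i) / (n - 2)
             for i in nodes}
        # M: pair distance corrected by how far both are from the rest.
        m = {frozenset({i, j}): d[frozenset({i, j})] - r[i] - r[j]
             for idx, i in enumerate(nodes) for j in nodes[idx + 1:]}
        pair = min(m, key=m.get)          # smallest M: these are neighbors
        i, j = sorted(pair, key=str)
        d_ij = d[pair]
        u = (i, j)                        # label of the new ancestor
        len_i = d_ij / 2 + (r[i] - r[j]) / 2
        len_j = d_ij - len_i
        edges += [(i, u, len_i), (j, u, len_j)]
        # New matrix: distance from the ancestor to each remaining element,
        # which is the 3 point formula applied to i, j and k.
        for k in nodes:
            if k not in pair:
                d[frozenset({u, k})] = (
                    d[frozenset({i, k})] + d[frozenset({j, k})] - d_ij
                ) / 2
        nodes = [k for k in nodes if k not in pair] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset({a, b})]))  # last remaining edge
    return edges

names = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 5, frozenset("AC"): 7, frozenset("AD"): 8,
    frozenset("BC"): 8, frozenset("BD"): 9, frozenset("CD"): 9,
}
for edge in neighbor_joining(names, dist):
    print(edge)
```

The example matrix is additive (it comes from a tree with leaf edges A=2, B=3, C=4, D=5 and an internal edge of 1), and the sketch recovers exactly those lengths without assuming a molecular clock.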