L1 Phylogenetics Flashcards
Phenetics
Classifying organisms by overall similarity, without regard to evolutionary history
Linnaean Taxonomy
Classifying organisms into hierarchical ranks, e.g. kingdom, phylum, class, order, family, genus, species
Cladistics
See OneNote diagram
Classifying organisms by their evolutionary history, using “shared derived characters”
- members of a group share derived characteristics that the ancestral species did not have
Molecular Phylogenetics
- protein electrophoresis
- DNA:DNA hybridisation
- Sequences
How is phylogeny determined?
- identify homologous characters
- homology: similarity due to inheritance from a common ancestor
Alignments and homology
See OneNote
- Asserting homology based on alignment
- the alignment problem
Taxa
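The groups of organisms being compared (singular: taxon); here usually species, placed at the tips of the tree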
Node
A point where the tree branches (internal node) or the end of a branch (terminal node, or tip)
Clade
A group made up of an ancestor and all of its descendants, e.g. the mammals
Phylogenetic trees
depict the evolutionary history of the taxa
Fundamental properties of Phylogenetic trees
- network without cycles, usually bifurcating
- polytomies and star phylogenies are nodes that have not been resolved into bifurcations: either the data set cannot resolve the branching order, or the lineage really did split into three or more at once
Cladogram
- shows only the topology (which clades exist); branch lengths carry no meaning
Ultrametric tree
- has a root
- terminal nodes align
- time axis included
Phylogram
- branch length proportional to distance/number of changes
Topology
The branching pattern of the tree, i.e. which taxa group together
Trees can be drawn in different ways to represent the same thing as long as the topology is the same
Root
See OneNote diagram
Rooting affects the interpretation of the tree
Deciding where the root should be:
- Use an outgroup: include a species assumed to lie outside the group of interest and place the root on the branch leading to it
- Assert a molecular clock: assume a constant rate of change (e.g. amino acid/nucleotide changes) and place the root midway between the most divergent species
Parsimony Principle
The explanation that takes the fewest steps (character changes) is taken as the most likely real scenario
Are all variable sites equally useful in drawing the tree?
See OneNote
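- parsimony-informative site: favours some tree topologies over others
- singleton site: variable, but at most one state occurs more than once, so it cannot distinguish between topologies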
Homoplasy
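A shared character state that is not inherited from a common ancestor, e.g. a site changing twice to go back to the same state (reversal) or the same change arising independently in two lineages (convergence)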
Small parsimony problem
For a given tree, what is the minimum number of steps required to explain the data?
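One common way to answer the small parsimony problem for a single site is Fitch's algorithm. The sketch below is a minimal illustration, not part of the original notes: it assumes a rooted binary tree written as nested 2-tuples of leaf names, and the function name `fitch_score` and the example data are made up for the demonstration.

```python
def fitch_score(tree, states):
    """Minimum number of changes for one site on a rooted binary tree.

    tree:   nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D"))
    states: dict mapping leaf name -> observed state at this site
    """
    def visit(node):
        if isinstance(node, str):
            # Leaf: the only possible state is the observed one, no cost yet.
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            # The two children can agree on a state: no extra step needed.
            return common, left_cost + right_cost
        # The children disagree: one change is required on this part of the tree.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


site = {"A": "G", "B": "G", "C": "T", "D": "T"}
print(fitch_score((("A", "B"), ("C", "D")), site))  # 1
print(fitch_score((("A", "C"), ("B", "D")), site))  # 2
```

The first tree explains the site with one change and the second needs two, so parsimony prefers the first tree for this site.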
Large parsimony problem
which of all the possible trees has the smallest minimum number of steps?
Possible number of tree topologies
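N = number of taxa
Number of unrooted bifurcating tree topologies = (2N-5)!! = 3 × 5 × 7 × ... × (2N-5)
Number of rooted bifurcating tree topologies = (2N-3)!!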
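To get a feel for how quickly tree space grows, the count of unrooted bifurcating topologies, the double factorial (2N-5)!!, can be computed directly. This is a small sketch; the function name `num_unrooted_trees` is chosen here for illustration.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating topologies for n_taxa >= 3: (2N-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2N-5 (empty product for N = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (3, 4, 5, 10):
    print(n, num_unrooted_trees(n))
# 3 1
# 4 3
# 5 15
# 10 2027025
```

With only 10 taxa there are already over two million topologies, which is why an exhaustive search quickly becomes impractical.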
Searching tree space
- branch and bound = exact method, guaranteed to find the shortest tree
- heuristic method = branch swapping; faster, but not guaranteed to find the shortest tree
Nearest neighbour interchange
See OneNote
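As a rough illustration of nearest neighbour interchange: an internal branch of a bifurcating tree separates four subtrees, two on each side, and swapping one subtree across the branch gives two alternative arrangements. The sketch below assumes the four subtrees are simply passed in as labels; `nni_neighbours` is a hypothetical helper, not a library function.

```python
def nni_neighbours(a, b, c, d):
    """Given an internal branch separating subtrees (a, b) from (c, d),
    return the two alternative groupings reachable by one interchange."""
    return [((a, c), (b, d)), ((a, d), (b, c))]


print(nni_neighbours("A", "B", "C", "D"))
# [(('A', 'C'), ('B', 'D')), (('A', 'D'), ('B', 'C'))]
```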
Subtree pruning and regrafting
See OneNote
- all possible subtree removals and reattachment points are evaluated, but the pruned subtree is always reattached by its cut point
Tree bisection and reconnection
See OneNote
- all possible bisections and reattachment points are evaluated
- cut point is not necessarily the reattachment point
What do you do when you have multiple trees that are equally short?
- build a consensus tree
Strict consensus
See OneNote
Keep only the internal branches that are seen in all of the equally short trees (four in the OneNote example); all other nodes are collapsed to a polytomy to show the uncertainty
70% majority rule consensus
See OneNote
If a particular branch/clade is in at least 70% of the trees then it is represented in the consensus tree
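Both consensus rules come down to counting how often each clade appears across the trees. A minimal sketch, assuming each tree is represented simply as a set of its clades (each clade a frozenset of taxon names); the helper names and the four example trees are invented for illustration.

```python
from collections import Counter


def consensus_clades(trees, threshold=1.0):
    """Return the clades found in at least `threshold` of the trees.

    threshold=1.0 gives the strict consensus, 0.7 the 70% majority rule.
    """
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade for clade, n in counts.items() if n / len(trees) >= threshold}


def clades(*groups):
    # Shorthand: "AB" becomes the clade {A, B}.
    return {frozenset(g) for g in groups}


def labels(clade_set):
    return sorted("".join(sorted(c)) for c in clade_set)


# Three identical trees plus one that disagrees about C, D and E.
trees = [clades("AB", "ABC", "DE")] * 3 + [clades("AB", "CD", "ABCD")]

print(labels(consensus_clades(trees, 1.0)))  # ['AB']
print(labels(consensus_clades(trees, 0.7)))  # ['AB', 'ABC', 'DE']
```

Only the clade AB is in all four trees, so the strict consensus collapses everything else; ABC and DE are each in 75% of the trees, so they survive the 70% rule.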
Distance methods
- information is lost by reducing the data set to a distance matrix
- fast
Distance Matrix
See OneNote
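One simple way to fill a distance matrix is the p-distance: the proportion of aligned sites at which two sequences differ. The sketch below assumes equal-length, gap-free sequences; the function names and the toy alignment are made up for illustration, and real analyses usually apply a correction for multiple changes at the same site.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def distance_matrix(alignment):
    """Pairwise p-distances as a dict keyed by (name1, name2)."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x in alignment
        for y in alignment
    }


alignment = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA"}
matrix = distance_matrix(alignment)
print(matrix[("A", "B")])  # 0.125
print(matrix[("A", "C")])  # 0.375
print(matrix[("B", "C")])  # 0.25
```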
Bootstrap
See OneNote diagram
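- Resampling your data
- Sample alignment columns randomly, with replacement, to create a pseudo-data set of the same length as the original
- build a tree from each pseudo-data set, then generate a consensus of the bootstrap trees; the percentage of trees containing a clade is its bootstrap support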
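The column resampling step can be sketched as follows. This only builds one pseudo-data set; tree building and the consensus step are left out. The alignment is assumed to be a dict of equal-length sequences, and `bootstrap_alignment` is a name chosen for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement into a pseudo-data set."""
    length = len(next(iter(alignment.values())))
    # Draw `length` column indices; the same column may be picked repeatedly.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


alignment = {"A": "ACGTAC", "B": "ACGTTC", "C": "TCGTAC"}
rng = random.Random(42)  # fixed seed so the run is repeatable
replicate = bootstrap_alignment(alignment, rng)
for name, seq in replicate.items():
    print(name, seq)
```

Because whole columns are sampled, every taxon keeps the same set of sites in each replicate; only which sites are included, and how often, changes.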
Assumptions
- sites evolve independently of each other
- all changes are equally likely
BUT in real data transitions (A<->G, C<->T) usually occur more often than transversions (purine<->pyrimidine)
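To check the transition/transversion point on real data, the differences between two aligned sequences can be classified site by site. A minimal sketch, assuming the sequences contain only A, C, G and T; the function names and example sequences are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_change(a, b):
    """Classify a difference between two nucleotides (A, C, G, T only)."""
    if a == b:
        return "none"
    pair = {a, b}
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_changes(seq1, seq2):
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        kind = classify_change(a, b)
        if kind != "none":
            counts[kind] += 1
    return counts


print(count_changes("ACGTACGT", "GCGTATGA"))
# {'transition': 2, 'transversion': 1}
```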
Parsimony Informative Site
A site is parsimony-informative if it contains at least two types of nucleotides (or amino acids), and at least two of them occur with a minimum frequency of two.
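The definition can be applied to one alignment column at a time. The sketch below assumes a column is given as a string of states, one per taxon, and follows the common convention that a variable site which is not parsimony-informative is called a singleton; `classify_site` is a name chosen for illustration.

```python
from collections import Counter


def classify_site(column):
    """Classify one alignment column (one state per taxon)."""
    counts = Counter(column)
    if len(counts) == 1:
        return "constant"
    # At least two states must each occur at least twice.
    states_seen_twice = sum(1 for n in counts.values() if n >= 2)
    if states_seen_twice >= 2:
        return "parsimony-informative"
    return "singleton"


for column in ("AAAA", "AAGG", "AAAG", "ACGT", "AAGGT"):
    print(column, classify_site(column))
# AAAA constant
# AAGG parsimony-informative
# AAAG singleton
# ACGT singleton
# AAGGT parsimony-informative
```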