Phylogenetics Flashcards
what is phylogenetics
the study of the evolutionary history and relationships of species through the construction of trees/phylogenies
what is phylogenetic inference
the process of estimating a tree from data such as aligned sequences
what is taxonomy
the process of biological classification of organisms based on shared characteristics
what is the order of taxonomic ranks
domain, kingdom, phylum, class, order, family, genus, species
what do branch lengths indicate
the amount of genetic change - longer branch = more change/divergence
what do nodes represent
sequences (tips) or hypothetical ancestral sequences (internal nodes) at various points in evolutionary history
what do branches represent
the path of transmission of genetic information from one generation to the next
what is tree topology
the structure of branches, leaves and nodes in the tree
what is a cladogram
shows only the branching order - branch lengths carry no meaning
represents the general relatedness of taxa, not the amount of change or time
what is an ultrametric diagram
branch lengths represent the evolutionary time between the corresponding species - all tips are the same distance from the root
what is a phylogram
branch lengths are proportional to the amount of divergence
what are orthologous genes
homologous genes in different species that diverged through speciation - usually keep the same function
what are paralogous genes
homologous genes within a genome that arose through gene duplication - often evolve different functions
what is a phylogenetic marker
a representative gene providing phylogenetic info about the relatedness among taxa - present in all organisms being compared
what are the best markers to infer species phylogenies
single copy housekeeping and orthologous genes
what is an outgroup
more distantly related organism, known to fall outside the ingroup, that serves as a reference group for rooting
what is an ingroup
the group of organisms under investigation
what are the two methods of rooting a tree
rooting by outgroup - root placed on the branch leading to the outgroup, which falls outside the ingroup
rooting by midpoint distance - root placed at the midway point between the two most distant taxa
what are the 2 tree file formats
Newick and Nexus
what is newick format
standard text format - nested parentheses and commas, optional branch lengths after a colon, ended by a semicolon
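To make the Newick structure concrete, here is a minimal parsing sketch. It assumes plain labels with no quotes or comments, and the names `parse_newick` and `leaf_names` are illustrative only, not from any library.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string ends with a semicolon")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # step over "("
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group
                    break
                else:
                    raise ValueError(f"unexpected character at position {pos}")
        # Optional node label
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ":"
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    root = parse_node()
    if text[pos] != ";":
        raise ValueError("trailing characters before the final semicolon")
    return root


def leaf_names(node):
    """Collect the tip labels of a parsed tree, left to right."""
    name, _, children = node
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaf_names(child)]


tree = parse_newick("((A:0.1,B:0.2):0.05,C:0.3);")
print(leaf_names(tree))
```

In the example string, A and B are grouped by the inner parentheses and the numbers after each colon are branch lengths.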
what are the two main approaches in tree building
distance based methods and character based methods
what are the two distance based methods
Neighbour joining and UPGMA
what are the two character based methods
maximum parsimony and maximum likelihood
what is a distance matrix
a square table containing distances between pairs of elements in a dataset
what does a distance matrix express
dissimilarities between objects - e.g. differences in phenotypic characteristics or numbers of substitutions
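As an illustration of a distance matrix, the sketch below computes the proportion of differing sites (p-distance) between aligned sequences. The choice of p-distance, the function names and the toy sequences are assumptions made for the example; other distance measures are equally possible.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip sites with a gap in either sequence
        compared += 1
        if a != b:
            differences += 1
    return differences / compared if compared else float("nan")


def distance_matrix(alignment):
    """Return taxon names and the square matrix of pairwise p-distances."""
    names = list(alignment)
    matrix = [[p_distance(alignment[i], alignment[j]) for j in names] for i in names]
    return names, matrix


# Hypothetical toy alignment
alignment = {
    "taxon1": "ACGTACGT",
    "taxon2": "ACGTACGA",
    "taxon3": "ACGAACTA",
}
names, matrix = distance_matrix(alignment)
for name, row in zip(names, matrix):
    print(name, row)
```

The matrix is square and symmetric, with zeros on the diagonal because each sequence is identical to itself.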
what is UPGMA
simplest method - repeatedly joins the two closest clusters and assumes the evolutionary rate is the same for all lineages (molecular clock)
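The UPGMA idea can be sketched in a few lines: join the two closest clusters, place the new node at half their distance, and average distances to the remaining clusters weighted by cluster size. The function name `upgma` and the toy matrix are assumptions for illustration.

```python
from itertools import combinations


def upgma(names, matrix):
    """Build a Newick string from a symmetric distance matrix using UPGMA."""
    # cluster id -> (Newick label, number of tips, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): matrix[i][j]
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        i, j = sorted(pair)
        (label_i, size_i, height_i) = clusters[i]
        (label_j, size_j, height_j) = clusters[j]
        height = dist[pair] / 2  # new node sits halfway between the pair
        label = (
            f"({label_i}:{height - height_i:.3f},"
            f"{label_j}:{height - height_j:.3f})"
        )
        # Size-weighted average distance from the merged cluster to the others
        new_dist = {}
        for k in clusters:
            if k in (i, j):
                continue
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            new_dist[frozenset((next_id, k))] = (
                size_i * d_ik + size_j * d_jk
            ) / (size_i + size_j)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(new_dist)
        del clusters[i], clusters[j]
        clusters[next_id] = (label, size_i + size_j, height)
        next_id += 1
    label, _, _ = next(iter(clusters.values()))
    return label + ";"


# Hypothetical clock-like distances: A and B are close, C is distant
names = ["A", "B", "C"]
matrix = [
    [0, 2, 6],
    [2, 0, 6],
    [6, 6, 0],
]
print(upgma(names, matrix))
```

Because every tip ends up at the same distance from the root, the result is only trustworthy when the equal-rate assumption roughly holds.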
what is neighbour joining suited for
datasets comprising lineages with largely varying rates of evolution
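A sketch of why neighbour joining copes with unequal rates: it does not join the pair with the smallest raw distance, but the pair minimising a corrected criterion Q that subtracts each taxon's total distance to all others. The function names and the toy distances (taken from a hypothetical tree with short branches to A and C and long branches to B and D) are assumptions for illustration; only the first joining step is shown.

```python
from itertools import combinations


def nj_first_join(names, d):
    """Return the first pair joined by neighbour joining, with branch lengths."""
    n = len(names)
    row_sums = [sum(row) for row in d]

    def q(i, j):
        # Corrected criterion: raw distance adjusted for each taxon's overall divergence
        return (n - 2) * d[i][j] - row_sums[i] - row_sums[j]

    i, j = min(combinations(range(n), 2), key=lambda pair: q(*pair))
    # Branch lengths from the joined taxa to their new internal node
    length_i = d[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    length_j = d[i][j] - length_i
    return (names[i], length_i), (names[j], length_j)


def closest_raw_pair(names, d):
    """Pair with the smallest raw distance (what a clock-based method joins first)."""
    i, j = min(combinations(range(len(names)), 2), key=lambda pair: d[pair[0]][pair[1]])
    return names[i], names[j]


# Hypothetical additive distances; true neighbours are (A, B) and (C, D)
names = ["A", "B", "C", "D"]
d = [
    [0, 5, 3, 6],
    [5, 0, 6, 9],
    [3, 6, 0, 5],
    [6, 9, 5, 0],
]
print(closest_raw_pair(names, d))
print(nj_first_join(names, d))
```

Here the smallest raw distance is between A and C, which are not neighbours, while the Q criterion selects A and B and recovers their unequal branch lengths.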
what are the advantages of neighbor joining
fast and suited for large datasets
allows lineages with largely different branch lengths
what are the disadvantages to neighbour joining
returns only one possible tree
result depends on the model of evolution used to calculate the distances
what is maximum parsimony
minimises the total number of evolutionary steps required
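Counting evolutionary steps can be sketched with Fitch's small-parsimony procedure, swapped in here as one common way to score a single character on a fixed tree. The tree encoding (nested pairs of tip names), the function name and the toy character states are assumptions for illustration.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a binary tree.

    tree: a tip name (str) or a pair of subtrees; states: tip name -> state.
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}  # a tip has exactly its observed state
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:
            return shared  # children agree: no change needed here
        steps += 1  # children disagree: one change is required
        return left | right

    visit(tree)
    return steps


# Hypothetical single site observed in four taxa
states = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(fitch_score(tree_1, states), fitch_score(tree_2, states))
```

Maximum parsimony would prefer the first tree for this site because it needs fewer steps; a full analysis sums the score over all sites and searches over many candidate trees.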
advantages to maximum parsimony
simple, logical
used on molecular and non molecular data
provides tree and hypothesis of character evolution
what are the disadvantages of maximum parsimony
not statistically consistent
provides reliable results only if data is not affected by homoplasy
advantages of maximum likelihood
consistent and reliable
used on molecular and non molecular data
provides tree and hypothesis of character evolution
disadvantages of maximum likelihood
not simple and intuitive
computationally expensive
reliable results only if the model of evolution fits the data - strong homoplasy can still mislead
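shortcomings of sequence alignment
heuristic methods give only an estimate of the best alignment
the optimal-scoring alignment does not always align homologous positions
alignments often require human intervention
hierarchically (progressively) aligning pairs generates biases - early errors are carried forward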
what are problems associated with assessing tree reliability
long branch attraction and lateral gene transfer
what is multi locus sequence typing
strain typing system that focuses on conserved housekeeping genes
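As a rough sketch of how MLST typing works in practice: each housekeeping gene fragment is matched to an allele number, and the combination of allele numbers (the allelic profile) defines a sequence type. All locus names, sequences, allele numbers and sequence types below are hypothetical, as is the function name.

```python
# Hypothetical allele tables: locus -> {fragment sequence: allele number}
ALLELES = {
    "geneA": {"ACGT": 1, "ACGA": 2},
    "geneB": {"TTGA": 1, "TTGC": 2},
}

# Hypothetical profile table: allele numbers in locus order -> sequence type
SEQUENCE_TYPES = {
    (1, 1): "ST1",
    (2, 1): "ST2",
}


def sequence_type(isolate):
    """Return the allelic profile and sequence type for one isolate.

    isolate: locus -> fragment sequence. Unknown alleles appear as None
    in the profile and yield the label "novel".
    """
    profile = tuple(ALLELES[locus].get(isolate[locus]) for locus in ALLELES)
    return profile, SEQUENCE_TYPES.get(profile, "novel")


print(sequence_type({"geneA": "ACGA", "geneB": "TTGA"}))
print(sequence_type({"geneA": "ACGT", "geneB": "TTGC"}))
```

Isolates with the same sequence type share identical alleles at every locus examined.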