Molecular Phylogenetics Flashcards

1
Q
What do these mean?
Taxa
Clades
Branches
Nodes
Roots
A
Taxa: the entities being compared
Clades: groups of taxa sharing a common ancestor
Branches: lines reflecting evolutionary change
Nodes: points where branches meet
Root: the oldest point on the tree
2
Q

What are the 4 aspects of a tree?

A

Topology (branching order)
Branch lengths (indication of genetic change)
Root (oldest point on tree)
Confidence (bootstraps/probabilities)

3
Q

What models of sequence evolution are there?

A

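Jukes & Cantor model: assumes all nucleotides are equally frequent and all changes equally probable, K = -0.75 ln(1 - 4d/3), where d is the observed proportion of sites that differ
Problem: not all changes are equally likely; some bases are more frequent than others, and different kinds of substitution occur at different rates
Kimura 2-parameter model: allows different rates for transitions and transversions (transitions, e.g. C↔T, typically occur at a higher rate), K = -0.5 ln[(1 - 2p - q) · (1 - 2q)^0.5], where p and q are the proportions of sites differing by a transition and by a transversion
Tamura-Nei model: allows separate rates for purine transitions (A↔G), pyrimidine transitions (C↔T), and transversions, and allows unequal base composition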
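
The two distance corrections above can be tried out with a short Python sketch. The function names and the example sequences are made up for illustration; the code assumes two aligned DNA sequences of equal length and simply skips gaps or ambiguous characters.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def transition_transversion_proportions(seq1, seq2):
    """Return (p, q): proportions of compared sites that differ by a
    transition (p) or by a transversion (q)."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous characters
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared

def jukes_cantor(d):
    """Jukes-Cantor distance from the observed proportion of differing sites d."""
    return -0.75 * math.log(1 - 4 * d / 3)

def kimura_2p(p, q):
    """Kimura 2-parameter distance from transition (p) and transversion (q) proportions."""
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy example: one transition (A/G) and one transversion (A/C) in 10 sites
p, q = transition_transversion_proportions("AAAAAAAAAA", "GAAAAAAAAC")
print(jukes_cantor(p + q), kimura_2p(p, q))
```

Both corrected distances come out larger than the raw proportion of differences (0.2 here), because they allow for multiple substitutions at the same site.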

4
Q

How do rates vary in molecular evolution?

A

Rates vary among genomes - should always use sequences from the same genome to calculate distances
Rates vary among proteins - should always use same gene/protein to calculate distances, should also use same part for all species
Rates vary among lineages - rate constancy assumed by UPGMA not a safe assumption

5
Q

What is the maximum parsimony method?

A

‘Cladistic’ method
Starts from a set of variable character states and aims to find tree with smallest number of character state changes
Only uses ‘informative’ sites
Makes an unrooted tree, and there may be more than one equally parsimonious tree
Does not give good estimates of branch lengths
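
As a rough illustration of the points above, the sketch below flags informative sites and counts the minimum number of changes a tree needs. The counting uses the Fitch algorithm, which is a standard technique but is not named in these notes. The four-taxon alignment is invented. Trees are written as nested pairs purely for convenience; the count does not depend on where the root is placed.

```python
from collections import Counter

def is_informative(column):
    """A site is parsimony-informative if at least two different states
    each occur in at least two taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def fitch(tree, states):
    """Fitch pass for one site.
    tree: taxon name, or a pair (left_subtree, right_subtree).
    states: dict taxon -> character at this site.
    Returns (candidate state set at this node, minimum changes below it)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def tree_length(tree, alignment):
    """Total number of changes over the informative sites only."""
    n_sites = len(next(iter(alignment.values())))
    length = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        if is_informative(site.values()):
            length += fitch(tree, site)[1]
    return length

# Invented alignment: sites 2 and 4 are informative, sites 1 and 3 are not
alignment = {"A": "AAGT", "B": "AAGC", "C": "AGCT", "D": "AGTC"}
for tree in [(("A", "B"), ("C", "D")),
             (("A", "C"), ("B", "D")),
             (("A", "D"), ("B", "C"))]:
    print(tree, tree_length(tree, alignment))  # lengths 3, 3 and 4
```

In this toy data set the first two topologies tie at three changes, which shows how more than one equally parsimonious tree can arise.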

6
Q

What is the UPGMA (unweighted pair-group method with arithmetic means)?

A

‘Phenetic’ method
Starts from a matrix of pairwise distances among taxa
Assumes perfect molecular clock
Proceeds by progressively clustering taxa with shortest distances
Doesn’t evaluate all possible trees
Produces tree rooted at midpoint
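
The clustering steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name, the tuple-based tree representation and the distance values are all assumptions made for the example, and ties are broken arbitrarily.

```python
import itertools

def upgma(names, dist):
    """names: list of taxon labels.
    dist: dict mapping a pair (a, b) to the distance between a and b,
          with every pair of taxa present once.
    Returns the root as a nested tuple (left, right, node_height)."""
    d = {frozenset(pair): value for pair, value in dist.items()}
    # label -> (subtree, number of leaves, height of the cluster's node)
    active = {name: (name, 1, 0.0) for name in names}
    while len(active) > 1:
        # Cluster the two groups with the shortest distance
        a, b = min(itertools.combinations(active, 2),
                   key=lambda pair: d[frozenset(pair)])
        tree_a, size_a, _ = active.pop(a)
        tree_b, size_b, _ = active.pop(b)
        height = d[frozenset((a, b))] / 2  # clock assumption: node sits halfway
        merged = f"({a},{b})"
        for c in active:
            # Arithmetic mean over all leaf pairs, via size-weighted averaging
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        active[merged] = ((tree_a, tree_b, height), size_a + size_b, height)
    tree, _, _ = next(iter(active.values()))
    return tree

# Invented, perfectly clock-like distances
distances = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
             ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
print(upgma(["A", "B", "C", "D"], distances))
```

Because each node is placed at half the distance between the clusters it joins, the result is always rooted and clock-like, whether or not the data really are.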

7
Q

What is the Neighbour-Joining (NJ) method?

A

Starts from pairwise distance matrix
Minimum evolution tree (shortest total branch length)
Evaluate all possible trees or take a short cut
Start from a star tree and try all possible positions for a new branch; each time:
- calculate branch lengths
- sum them to get the total tree branch length
Choose the tree with the smallest total length
Fast - good for large data sets
Good at recovering the true tree
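
The 'short cut' is usually implemented with a Q-matrix criterion, which picks the pair whose joining gives the smallest total tree length without building every tree. The sketch below shows only the first joining step, under that assumption. The Q formula and branch-length formula are the standard ones for NJ, but the function name and the five-taxon distances are illustrative and not taken from these notes.

```python
import itertools

def nj_first_join(names, dist):
    """One neighbour-joining step.
    names: list of taxon labels.
    dist: dict mapping a pair (a, b) to their distance (every pair once).
    Returns ((i, j), (length_i, length_j)): the pair to join and the
    branch lengths from i and j to their new internal node."""
    d = {frozenset(pair): value for pair, value in dist.items()}
    n = len(names)
    # r[x]: sum of distances from x to every other taxon
    r = {x: sum(d[frozenset((x, y))] for y in names if y != x) for x in names}
    # Q criterion: the smallest value marks the pair to join
    q = {(x, y): (n - 2) * d[frozenset((x, y))] - r[x] - r[y]
         for x, y in itertools.combinations(names, 2)}
    i, j = min(q, key=q.get)
    d_ij = d[frozenset((i, j))]
    length_i = d_ij / 2 + (r[i] - r[j]) / (2 * (n - 2))
    length_j = d_ij - length_i
    return (i, j), (length_i, length_j)

# Invented distances with unequal rates among lineages
distances = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
             ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
             ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
print(nj_first_join(["a", "b", "c", "d", "e"], distances))
```

Here the pair joined first is (a, b), even though d and e have the shortest raw distance. Unlike UPGMA, the criterion takes into account how far each taxon is from all the others, so the two branches can get different lengths.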

8
Q

What is the maximum likelihood method?

A

Need model of sequence evolution, need a criterion/set of criteria to choose between alternate trees, evaluate all possible trees
Allows complex models of sequence evolution
Formally evaluates different possible trees
Computer-intensive
For every possible tree, at each site in the alignment, consider the probability of each possible nucleotide character state at the ancestral nodes (summing over these possibilities gives the likelihood of that site)
Take the product of the site likelihoods as the likelihood value for that tree
Choose tree with highest (log) likelihood
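
To make the calculation concrete, here is a minimal sketch that scores one fixed tree under the Jukes & Cantor model. Several things are assumed for illustration: branch lengths are given rather than optimised, base frequencies are equal, and the tree is a nested structure of (child, branch_length) pairs. A real program would also search over trees and branch lengths.

```python
import math

BASES = "ACGT"

def jc_prob(a, b, t):
    """Jukes-Cantor probability of state a becoming state b along a branch
    of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def partial(node, site):
    """For each possible state at `node`, the probability of the observed
    tip states below it. node: taxon name, or a tuple of
    (child, branch_length) pairs. site: dict taxon -> base."""
    if isinstance(node, str):
        return {x: 1.0 if x == site[node] else 0.0 for x in BASES}
    result = {x: 1.0 for x in BASES}
    for child, t in node:
        below = partial(child, site)
        for x in BASES:
            # Sum over every possible state y at the child node
            result[x] *= sum(jc_prob(x, y, t) * below[y] for y in BASES)
    return result

def log_likelihood(tree, alignment):
    """Sum of per-site log likelihoods (the log of the product over sites)."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        root = partial(tree, site)
        total += math.log(sum(0.25 * root[x] for x in BASES))
    return total

# Invented data: A and B are identical, as are C and D
alignment = {"A": "AAAA", "B": "AAAA", "C": "GGGG", "D": "GGGG"}
tree_ab = (((("A", 0.1), ("B", 0.1)), 0.1), ((("C", 0.1), ("D", 0.1)), 0.1))
tree_ac = (((("A", 0.1), ("C", 0.1)), 0.1), ((("B", 0.1), ("D", 0.1)), 0.1))
print(log_likelihood(tree_ab, alignment), log_likelihood(tree_ac, alignment))
```

The tree grouping A with B gets the higher log likelihood here, so it would be the one chosen.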

9
Q

How do you do bootstrapping?

A

Construct a pseudo-replicate alignment:
- randomly sample sites from the real alignment
- sample with replacement
- until same length as real alignment
Make a tree using the same method
Repeat many times
Record how often each partition (= internal branch) occurs across pseudoreplicates
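
The procedure above can be sketched in Python. The tree-building step is left as a placeholder: `build_splits` is a hypothetical function supplied by the caller that builds a tree from an alignment and returns its partitions as a set of frozensets of taxon names. Everything else uses only the standard library.

```python
import random

def pseudo_replicate(alignment, rng):
    """Resample alignment columns with replacement until the replicate
    has the same length as the real alignment."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}

def bootstrap_support(alignment, build_splits, n_reps=100, seed=0):
    """Percentage of pseudo-replicates in which each partition occurs.
    build_splits: hypothetical caller-supplied function, alignment -> set of
    partitions, which should use the same method as for the real tree."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_reps):
        replicate = pseudo_replicate(alignment, rng)
        for split in build_splits(replicate):
            counts[split] = counts.get(split, 0) + 1
    return {split: 100.0 * n / n_reps for split, n in counts.items()}

# One pseudo-replicate of an invented alignment
alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
print(pseudo_replicate(alignment, random.Random(1)))
```

Because whole columns are resampled, every taxon keeps the same set of sites within a replicate; only the mix of sites changes from one replicate to the next.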

10
Q

Why use bootstrapping?

A

Estimate of how consistent the phylogenetic ‘signal’ is along the alignment
Longer branches likely to have higher values
Values around 75% (or higher) generally taken as ‘meaningful’

11
Q

What problems can occur with phylogenetic trees?

A

Long branch attraction

Outgroups

12
Q

What is long branch attraction?

A

Unequal rates of evolution cause rapidly evolving lineages to be inferred as closely related, regardless of their true evolutionary relationships
Usually a problem in maximum parsimony

13
Q

What are some examples of long branch attraction causing problems?

A

Herpesvirus evolution: herpesviruses tend to co-evolve with their hosts, their genes evolve ~10x faster than mammalian genes, and they occasionally acquire extra genes from the host genome
Long branch attraction made it seem that the BoHV-4 Bo17 gene did not originate from buffalo

14
Q

Why are outgroups used?

A

Midpoint rooting - could fail with unequal rates of evolution
Outgroups useful to root trees
(All good phylogenetic methods produce unrooted trees)
An outgroup should be as closely related as possible to the other species, because a distant outgroup may not find their root correctly (long branch attraction, or other problems)
But a very close ‘outgroup’ may not be a true outgroup
