Virus Evolution Flashcards
What are branch lengths an estimate of?
The amount of evolutionary change along each branch, which serves as a proxy for the time since the species diverged
What does an unrooted tree mean?
The species are related, but we don't know where the common ancestors sit
What is the root?
The oldest point in the tree i.e. the common ancestor of all the species in the tree
What are the two ways to root a tree?
- Use a known outgroup - an outgroup is a species or group of species that diverged first
- Midpoint rooting - assume that all species are evolving at the same rate and put the root at the midpoint of the longest path between two tips
What is a node?
Where two branches join
What is the tip?
The end of the branch
What is a Clade?
A group made up of a common ancestor and all of its descendants, e.g. if A, B and D all share a common ancestor, they form a clade
If there are 4 sequences how many unrooted trees are there?
3 unrooted trees
If there are 4 sequences and 3 unrooted trees, how many branches does each tree have, and therefore how many rooted trees are there?
5 branches per unrooted tree, so 3 x 5 = 15 rooted trees. The root could be on any of the branches
If there are 5 sequences how many rooted and unrooted trees are there?
15 unrooted trees
105 rooted trees
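The counts on the cards above follow a standard pattern for bifurcating trees: an unrooted tree with n tips has 2n - 3 branches, there are (2n - 5)!! unrooted trees, and placing the root on any branch gives (2n - 3)!! rooted trees. A minimal sketch, with function names chosen here purely for illustration:

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_tips: int) -> int:
    # (2n - 5)!! bifurcating unrooted trees; assumes n_tips >= 3
    return double_factorial(2 * n_tips - 5)


def count_branches(n_tips: int) -> int:
    # an unrooted bifurcating tree with n tips has 2n - 3 branches
    return 2 * n_tips - 3


def count_rooted_trees(n_tips: int) -> int:
    # the root can sit on any branch of any unrooted tree
    return count_unrooted_trees(n_tips) * count_branches(n_tips)


for n in (4, 5):
    print(n, count_unrooted_trees(n), count_branches(n), count_rooted_trees(n))
# prints:
# 4 3 5 15
# 5 15 7 105
```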
How do distance matrices work?
For each pair of sequences, count the number of sites that differ, then convert to a percentage of the alignment length
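A rough sketch of that counting step, assuming the sequences are already aligned, have equal length, and contain no gaps (the function names and the toy sequences are made up for illustration):

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * differences / len(seq_a)


def distance_matrix(seqs: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise percentage differences, keyed by (name_1, name_2)."""
    return {
        (x, y): percent_difference(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }


print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))  # prints 20.0
```

These raw percentages are uncorrected; they undercount change when the same site has been hit more than once.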
Go through creating a tree from distance matrix methods
What do minimum evolution distance matrix methods use?
Simultaneous equations
What does maximum parsimony use?
Informative sites
What is an informative site?
A site with at least two different nucleotides, each of which is present in at least two of the sequences under study
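The definition can be checked column by column. A minimal sketch, assuming an ungapped alignment given as a list of equal-length strings (the names and the toy alignment are illustrative only):

```python
from collections import Counter
from typing import List, Sequence


def is_informative(column: Sequence[str]) -> bool:
    """True if at least two different nucleotides each occur at least twice."""
    counts = Counter(column)
    return sum(1 for c in counts.values() if c >= 2) >= 2


def informative_sites(alignment: List[str]) -> List[int]:
    """0-based indices of the informative columns in the alignment."""
    return [i for i, col in enumerate(zip(*alignment)) if is_informative(col)]


toy_alignment = [
    "AAGTC",
    "AAGCC",
    "AGATC",
    "AGAAT",
]
print(informative_sites(toy_alignment))  # prints [1, 2]
```

Column 0 is invariant, column 3 has only one nucleotide that occurs twice, and column 4 has a single variant sequence, so none of those three is informative.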
Why aren’t informative sites very accurate?
Because only informative sites are used, evolutionary changes at all other sites are not taken into account
What is bootstrapping?
1. Produce a new sequence alignment of the same length as the real alignment by random sampling, with replacement, from the sites in the real alignment
2. Make a phylogenetic tree from this alignment
3. Repeat steps 1 and 2 many times
4. Record how often each partition in the ‘real’ tree occurs among the bootstrapped trees
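The resampling and counting steps can be sketched as below. Tree building itself is left out: `partitions_of` is a hypothetical, user-supplied function that builds a tree from an alignment and returns its partitions as a set of frozensets; it is an assumption of this sketch, not part of any particular library.

```python
import random
from collections import Counter
from typing import Callable, Dict, FrozenSet, List, Set

Partition = FrozenSet[int]


def bootstrap_alignment(alignment: List[str], rng: random.Random) -> List[str]:
    """Resample alignment columns with replacement, keeping the same length."""
    n_sites = len(alignment[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def bootstrap_support(
    alignment: List[str],
    partitions_of: Callable[[List[str]], Set[Partition]],  # hypothetical tree builder
    replicates: int = 100,
    seed: int = 0,
) -> Dict[Partition, float]:
    """Fraction of bootstrap trees that contain each partition of the real tree."""
    rng = random.Random(seed)
    real_partitions = partitions_of(alignment)
    hits: Counter = Counter()
    for _ in range(replicates):
        found = partitions_of(bootstrap_alignment(alignment, rng))
        for partition in real_partitions:
            if partition in found:
                hits[partition] += 1
    return {p: hits[p] / replicates for p in real_partitions}
```

Sampling whole columns keeps each site's pattern across sequences intact, which is what lets the same tree-building method be rerun on every replicate.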
Why are multiple hits important?
Models of sequence evolution attempt to correct for multiple hits, i.e. more than one substitution occurring at the same site over evolutionary time
If we sample at time 4 and time 1 and the site has changed from A to T, we might say it has changed only once, but at time 3 and time 2 it may have been G and C, so it has actually changed 3 times
Are transitions or transversions more common?
Transitions are usually more frequent than transversions
What is A to G called (purine to purine)?
Transition
What is A to C called (purine to pyrimidine)?
Transversion
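As a quick check of the two definitions, a small sketch that classifies a single nucleotide change, assuming DNA bases A, C, G, T only (the function name is illustrative):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(before: str, after: str) -> str:
    """Return 'transition', 'transversion', or 'none' for a single-base change."""
    before, after = before.upper(), after.upper()
    for base in (before, after):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if before == after:
        return "none"
    # same chemical class (purine <-> purine or pyrimidine <-> pyrimidine)
    same_class = (before in PURINES) == (after in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # prints transition
print(classify_substitution("A", "C"))  # prints transversion
```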
What does the JC69 model assume?
One rate for all substitutions (transitions and transversions alike) and equal base frequencies
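Under JC69 the correction for multiple hits has a closed form: if p is the observed proportion of differing sites, the estimated distance is d = -(3/4) ln(1 - 4p/3) substitutions per site. A minimal sketch (the function name is illustrative):

```python
import math


def jc69_distance(p: float) -> float:
    """JC69-corrected distance from the observed proportion of differing sites."""
    if not 0.0 <= p < 0.75:
        # at p >= 0.75 the sequences look random and the formula is undefined
        raise ValueError("p must be in the range [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# an observed 10% difference corresponds to a corrected distance of about 0.107
print(round(jc69_distance(0.10), 3))
```

The corrected distance is always at least as large as p, and the gap widens as the sequences become more divergent.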
What does the K80 model assume?
Rates of transversions and transitions are different
What is the GTR model?
General time reversible - six classes of substitutions, base frequencies vary
What is the F81 model?
All substitution rates are equal, base frequencies can vary
What is the K2P model?
Transitions and transversions have different substitution rates, base frequencies are assumed equal
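The K2P distance uses the observed proportions of transitional differences (P) and transversional differences (Q) separately: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). A sketch assuming ungapped, equal-length DNA sequences; the function names and toy sequences are illustrative only:

```python
import math

PURINES = {"A", "G"}


def k2p_from_proportions(p_ts: float, q_tv: float) -> float:
    """K2P distance from transition (P) and transversion (Q) proportions."""
    return -0.5 * math.log(1.0 - 2.0 * p_ts - q_tv) - 0.25 * math.log(1.0 - 2.0 * q_tv)


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """K2P distance between two aligned DNA sequences (no gaps assumed)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1
    n = len(seq_a)
    return k2p_from_proportions(transitions / n, transversions / n)


# 2 transitions and 1 transversion over 20 sites: P = 0.10, Q = 0.05
print(round(k2p_distance("A" * 20, "GGC" + "A" * 17), 4))
```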
What is the HKY85 model?
Transitions and transversions have different substitution rates, base frequencies can vary
What is a molecular clock?
The rate of evolution of a given protein remains roughly constant over time
What does a molecular clock not mean?
It does not mean that:
- All proteins evolve at the same rate
- All sites within a protein evolve at the same rate
What is a synonymous change?
A silent change: the nucleotide changes but the encoded amino acid does not
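Whether a codon change is silent can be checked by translating both codons. A minimal sketch, assuming the standard genetic code and DNA codons written with T rather than U (the table is built from the usual TCAG ordering; the names are illustrative):

```python
from itertools import product

BASES = "TCAG"
# amino acids for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); * = stop
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_before: str, codon_after: str) -> bool:
    """True if both codons encode the same amino acid (or are both stops)."""
    return CODON_TABLE[codon_before.upper()] == CODON_TABLE[codon_after.upper()]


print(is_synonymous("GAA", "GAG"))  # prints True  (Glu -> Glu)
print(is_synonymous("GAA", "GAT"))  # prints False (Glu -> Asp)
```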
What is dS?
The rate of synonymous (silent) substitutions
It reflects the mutation rate and is similar for different genes in the same virus
What is dN?
The rate of nonsynonymous (amino acid-replacing) substitutions
Are dN or dS values lower?
dN values are lower because of constraints on proteins. dN values vary between proteins because different proteins are constrained to different extents
How do you work out the substitution rate?
It is the slope of molecular distance plotted against time, i.e. molecular distance / time
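One way to estimate that slope is an ordinary least-squares fit of distance against time. A minimal sketch; the function name is illustrative and the numbers in the usage lines are made-up values, not real data:

```python
from typing import Sequence


def substitution_rate(times: Sequence[float], distances: Sequence[float]) -> float:
    """Least-squares slope of molecular distance against time."""
    if len(times) != len(distances) or len(times) < 2:
        raise ValueError("need at least two (time, distance) pairs")
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    return sxy / sxx


# made-up illustrative values: time in years, distance in substitutions per site
years = [0, 10, 20, 30]
distance = [0.00, 0.02, 0.04, 0.06]
print(substitution_rate(years, distance))  # about 0.002 substitutions/site/year
```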