Phylogeny Flashcards
What does a phylogenetic tree show?
Evolutionary relatedness between species based upon genetic similarities
What is implied when species are clustered together in a phylogenetic tree?
That they have a common ancestor (note: implication, not fact!)
What is the genetic distance?
The proportion of nucleotides that are different between sequences
How is genetic distance represented in a phylogenetic tree?
Horizontal distance
Why does genetic distance between species stemming from a common ancestor increase over time? (4)
- Adaptation to immune system of hosts -> changing of epitopes
- Adaptation to therapies (resistance mechanisms)
- Random changes
- Bottleneck events
What does the bootstrap value represent?
The reliability of the topology of the tree
From which bootstrap value onwards is the reliability of the topology of the tree sufficient for typing?
70% or higher
At which bootstrap value is the reliability of the topology of the tree such that we can speak of perfect clustering?
100%
Which variable do we need to know in order to estimate when a certain variant split off from the most recent common ancestor?
Rate of genetic change over time (often nucleotides/position/year)
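As a rough illustration of why the rate is the missing variable: under the assumption of a constant rate (a strict molecular clock, which the card does not state explicitly), the time since the split is the genetic distance to the most recent common ancestor divided by the rate. The function name and the example numbers below are hypothetical.

```python
def time_since_mrca(distance_to_mrca: float, rate_per_site_per_year: float) -> float:
    """Years since the split, assuming a constant substitution rate.

    distance_to_mrca: substitutions per position on the branch back to the
                      most recent common ancestor (hypothetical input).
    rate_per_site_per_year: substitutions per position per year.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance_to_mrca / rate_per_site_per_year


# Made-up example: 0.02 substitutions/position at 0.004 substitutions/position/year
years = time_since_mrca(0.02, 0.004)
print(round(years, 2))
```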
What is a synonymous mutation?
Mutation that results in the same amino acid being present
What is a non-synonymous mutation?
Mutation that results in another amino acid being present
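A minimal sketch of how the two definitions above could be checked for a single codon, assuming the standard genetic code and DNA codons written with T rather than U. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(before: str, after: str) -> str:
    """Label a codon change as synonymous or non-synonymous."""
    before, after = before.upper(), after.upper()
    if before == after:
        return "no change"
    if CODON_TABLE[before] == CODON_TABLE[after]:
        return "synonymous"
    return "non-synonymous"


print(classify_codon_change("CTT", "CTC"))  # both codons encode leucine
print(classify_codon_change("GAA", "GTA"))  # glutamate -> valine
```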
What is a transition?
A swap from purine->purine (A <–> G) or pyrimidine->pyrimidine (C <–> T)
What is a transversion?
A swap from purine->pyrimidine or pyrimidine->purine
What are the two purines?
A, G
What are the two pyrimidines?
C, T
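The purine/pyrimidine rule from the cards above can be sketched as a small classifier. The function name is hypothetical, and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old: str, new: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    old, new = old.upper(), new.upper()
    valid = PURINES | PYRIMIDINES
    if old not in valid or new not in valid:
        raise ValueError("expected one of A, C, G, T")
    if old == new:
        raise ValueError("the two bases are identical, so nothing was substituted")
    # Same chemical class on both sides means transition, otherwise transversion
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine -> purine
print(classify_substitution("A", "C"))  # purine -> pyrimidine
```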
What is the difference in biological effect between a transversion and a transition?
Transversions result in amino acid substitutions more often than transitions do
Which occur more often: transversions or transitions?
Although there are more possibilities for transversion, the molecular mechanisms that generate transitions occur much more frequently, making them more common
Why is a higher GC-content associated with a higher melting temperature of genetic material?
GC base pairs have three hydrogen bonds, making them more stable than AT base pairs, which have two hydrogen bonds
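For reference, GC-content is simply the fraction of G and C bases in a sequence. A minimal sketch, with a hypothetical function name and made-up example sequences:

```python
def gc_content(sequence: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    return sum(base in "GC" for base in sequence) / len(sequence)


print(gc_content("ATGC"))  # half of the bases are G or C
print(gc_content("GGCC"))  # every base is G or C
```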
What is the p-distance?
The proportion of different nucleotides between two sequences
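A minimal sketch of the p-distance calculation, assuming the two sequences are already aligned and of equal length. Gap characters are treated like any other character here, which is a simplification; the function name and sequences are made up.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences are empty")
    differences = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return differences / len(seq_a)


# Two differences (positions 5 and 10) over 10 positions
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))
```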
What does the horizontal distance in a phylogenetic tree represent? (2)
Because genetic distance is linearly proportional to time passed, it represents both:
1. Genetic distance to most recent common ancestor
2. Time since most recent common ancestor
What are the flaws of using p-distance to measure relatedness of sequences?
Does not recognize the difference in the biological effect of transitions (small) vs. transversions (large) -> it only expresses homology
What are the working assumptions for the Jukes Cantor (JC69) method to calculate genetic distance? (2) Are these assumptions correct?
- All nucleotides occur equally frequently
- Any nucleotide has a probability of 25% to be replaced by another nucleotide
Assumption 1 is incorrect -> in reality, this is never the case
Assumption 2 is incorrect -> the theoretical possibilities of transversions:transitions = 2:1, in reality, transitions occur more frequently
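The cards do not give the formula, but under the two assumptions above the standard JC69 correction turns an observed p-distance p into an estimated distance d = -3/4 * ln(1 - 4p/3). A minimal sketch, with a hypothetical function name:

```python
import math


def jc69_distance(p: float) -> float:
    """JC69-corrected distance (substitutions per position) from a p-distance."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the logarithm is undefined: the sequences are saturated
        raise ValueError("p-distance must be in the range [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


# The corrected distance exceeds the raw p-distance because it accounts
# for multiple substitutions hitting the same position
print(round(jc69_distance(0.2), 4))
```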
What does an unrooted tree show?
The relationship between organisms, without showing the common ancestor
What is the disadvantage of unrooted trees?
No root = no possibility to estimate a most recent common ancestor -> does not allow us to talk about ancestor-descendant relationships
What does a rooted tree show?
The last common ancestor of the groups in the tree -> allows us to talk about ancestor-descendant relationships
What is an outgroup and why is it included in a phylogenetic tree?
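An organism that is related to the group of interest but lies outside it (more distantly related than the group members are to each other) -> it is a way of forcing distinction into the tree and gives a reference point for rooting it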
True or false: genetic distance within a phylogenetic tree is additive
True
What are (examples of) three different methods to construct a phylogenetic tree?
- UPGMA
- Neighbour-joining (NJ)
- Maximum likelihood
What is the flaw of the UPGMA method?
Can only provide a correct result when the rate of evolution is equal in all lineages (so that all tips are equally distant from the root), which is rarely the case
In which cases is neighbour-joining a good method? (3)
Situations in which a quick analysis is required, such as:
1. Preliminary analysis
2. Quick evaluations of contaminations in a lab
3. Typing
What is the advantage of a maximum likelihood tree?
It evaluates all possible tree topologies, which makes it more reliable than NJ trees
What is the disadvantage of maximum likelihood trees?
Takes a long time to be constructed because all possible trees have to be evaluated
Why are some amino acid substitutions more prevalent than others? (2)
- They are advantageous to the virus
- They do not disturb biological function and are not harmful
Conclusion: amino acid substitutions that are disadvantageous are selected against and are thus uncommon
What can BLAST be used for?
To identify biologically related DNA sequences based on comparable biochemical properties
When does BLAST consider sequences of nucleotides/amino acids to be similar?
Based on a similarity score of nucleotides/amino acids
When a threshold for similarity is met, BLAST considers these sequences to be similar
What is the effect of setting the threshold of similarity in BLAST higher/lower?
Threshold too low: slows down search -> BLAST identifies too many similar sequences
Threshold too high: may lead to missing relevant sequences
What does a similar tree topology for different genes of the same virus indicate?
That there is no recombination between genes
What is BEAST software used for?
Analysis of viral sequences
Which information can BEAST estimate when sample dates and sample locations are known? (5)
- Rates of evolution
- Location of evolution
- Date of evolution
- Most recent common ancestor
- Coalescent events = branching events
Do DNA or RNA viruses typically have a higher rate of evolution?
RNA-viruses
How is the rate of evolution expressed?
Nucleotide substitutions/position/year
What is a dead end cluster?
A cluster that does not evolve into another dominant cluster
What are the three methods to detect reassortment in phylogenetic analysis?
- Concatenated trees
- Assigning clades
- Tanglegrams
How do concatenated trees detect reassortment?
By using multiple marker genes of the same virus -> if there is a dissimilar tree topology for multiple genes of the same virus, this indicates reassortment
What is the weakness in using concatenated trees for detecting reassortment?
Segments with more mutation or larger segments affect the tree more
How does the method of assigning clades for detecting reassortment work?
Identifying unbroken lines of evolutionary descent of genes
What do straight vs. crossed lines in tanglegrams indicate?
Straight line = similar tree topology -> no reassortment
Crossed line = dissimilar tree topology -> reassortment
What is a bottleneck event?
Reduction of genetic diversity due to environmental events
What is the effect of bottleneck events on the genetic diversity of segmented viruses?
Loss of genetic diversity in some or all of their segments -> causes a new antigenic cluster to become dominant
What is the effect of bottleneck events on the genetic diversity of non-segmented viruses?
Genome-wide genetic sweeps -> virus will be replaced by another variant of the virus
Which sites in the genome are under an especially high genetic pressure?
Sites that have to do with escaping immunity
The ratio of synonymous vs. non-synonymous substitutions can indicate whether it is advantageous for a virus to have a specific site mutate. What does a ratio of non-synonymous/synonymous of < 1.0 indicate?
Purifying selection -> virus wants to keep the site the same, mutations at the site are disadvantageous
The ratio of synonymous vs. non-synonymous substitutions can indicate whether it is advantageous for a virus to have a specific site mutate. What does a ratio of non-synonymous/synonymous of ~ 1.0 indicate?
Neutral evolution -> mutations at this site are neither advantageous nor disadvantageous
The ratio of synonymous vs. non-synonymous substitutions can indicate whether it is advantageous for a virus to have a specific site mutate. What does a ratio of non-synonymous/synonymous of > 1.0 indicate?
Positive selection -> mutation at this site is beneficial for the virus
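The three cases above can be summarised in a small helper. The tolerance used to decide what counts as "~ 1.0" is a hypothetical choice, not a value from the cards; the function name is also made up.

```python
def interpret_dn_ds(ratio: float, tolerance: float = 0.1) -> str:
    """Interpret a non-synonymous/synonymous substitution ratio.

    tolerance: how close to 1.0 still counts as neutral (assumed value).
    """
    if ratio < 0:
        raise ValueError("ratio cannot be negative")
    if abs(ratio - 1.0) <= tolerance:
        return "neutral evolution"
    if ratio < 1.0:
        return "purifying selection"
    return "positive selection"


for value in (0.2, 1.0, 2.5):
    print(value, "->", interpret_dn_ds(value))
```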
What does a ladder-like tree indicate?
Higher genetic pressures, in which variants constantly get swapped out for newer variants