Measuring Evolution, Patterns and Models of Sequence Change and The Molecular Clock Flashcards
Evolutionary Genetics
Broader study of how molecular and population genetics phenomena bring about long-term evolutionary change, including speciation and adaptation
Population Genetics
Study of the genetic composition of biological populations and changes to this composition
Homology
Similarity due to common descent
Problems relying on morphology for reconstructing evolution 3
- Convergent evolution - independent evolution of shared derived traits that were not present in the LCA
- Character reversal - a species may lose the derived trait and revert back to the ancestral form - fleas lost their wings
- Erratic rates of morphological evolution
When was DNA identified as the molecule of inheritance?
1940-50s
When were tools developed to manipulate and sequence DNA to study genetic variation?
1960-70s
What was compared first at a molecular level
Proteins then DNA
Molecular homology
Molecular similarity due to common descent
How is homology ascertained?
Sequence similarity
Hamming distance or degree of divergence
The proportion of differences (n/N) for two sequences of length N that differ at n sites
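As a minimal sketch of the n/N calculation, assuming the two sequences are already aligned, have equal length and contain no gaps (the function name and the example sequences are hypothetical):

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites (n/N) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # n = number of sites at which the two sequences differ
    n = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return n / len(seq_a)


# Two made-up aligned sequences of length 10 that differ at 2 sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # 0.2
```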
Analogous sequences
Sequences that are similar but not homologous, due to chance or due to reoccurring evolutionary processes
All observable differences between homologous sequences are due to
Mutation
What is the base-substitution rate across the entire genome in humans?
About 1 mutation per 10^8 base pairs per generation
The Neo-Darwinian Model (Panselectionism) 4 - false
- selection - strongest force in evolution and drives substitution events
- Mutation - ultimate source of genetic variation but only plays a minor role in evolution
- Polymorphisms - mainly maintained by balancing selection
- Genetic drift - mostly irrelevant
The Neutral theory of evolution 4
- The majority of new mutations are neutral or deleterious
- Neutral alleles have no impact on an organism's fitness and will change in frequency by genetic drift alone
- Most substitution events observed occurred by drift, not selection
- Negative selection also plays a powerful but silent role, removing deleterious mutations and working to keep the status quo
What is the rate of substitution in a population equal to?
The number of new mutations x probability of fixation
What is the rate of substitution for neutral mutations independent of?
Population size
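A short sketch of why population size cancels out, assuming a diploid population of N individuals and a neutral mutation rate mu per site per generation. Under those assumptions the number of new mutations per generation is 2N x mu and a new neutral mutation fixes with probability 1/(2N), so the product is simply mu. The function name and the example values are hypothetical.

```python
def neutral_substitution_rate(pop_size: int, mu: float) -> float:
    """Substitution rate = number of new mutations x probability of fixation."""
    new_mutations = 2 * pop_size * mu        # 2N gene copies, each mutating at rate mu
    fixation_probability = 1 / (2 * pop_size)  # neutral allele starting at frequency 1/(2N)
    return new_mutations * fixation_probability


# The result stays at (approximately) mu whatever the population size.
for n in (100, 10_000, 1_000_000):
    print(n, neutral_substitution_rate(n, 1e-8))
```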
Jukes and Cantor’s one-parameter model
All substitutions occur with equal probability - there is no bias
3alpha = rate of change of one nucleotide to any other
1-3alpha = probability that nucleotide stays the same
Kimura’s two-parameter model
Assumes like-to-like changes (transitions) are more probable than transversions
Transition = pyrimidine to pyrimidine or purine to purine
Transversion = pyrimidine to purine or purine to pyrimidine
In vertebrates, how many times more often are transitions observed than transversions?
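The two classes can be told apart mechanically. The sketch below assumes aligned, gap-free DNA sequences using only A, C, G and T; the function names and example values are hypothetical. It also includes the standard two-parameter distance, K = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a: str, b: str) -> str:
    """Return 'transition' or 'transversion' for two different nucleotides."""
    a, b = a.upper(), b.upper()
    if a not in PURINES | PYRIMIDINES or b not in PURINES | PYRIMIDINES:
        raise ValueError("only A, C, G and T are supported")
    if a == b:
        raise ValueError("nucleotides are identical")
    # Same chemical class on both sides means a transition.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_counts(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences."""
    ts = tv = 0
    for a, b in zip(seq_a, seq_b):
        if a != b:
            if classify_substitution(a, b) == "transition":
                ts += 1
            else:
                tv += 1
    return ts, tv


def k2p_distance(p: float, q: float) -> float:
    """Two-parameter distance from transition (p) and transversion (q) proportions."""
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


print(ts_tv_counts("AACGT", "GATGA"))  # (2, 1)
print(round(k2p_distance(0.10, 0.05), 3))
```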
Twice
In mtDNA, how many times more often are transitions observed than transversions?
20
What does it probably mean when we observe a high degree of divergence between two sequences
Most likely the same nucleotide has undergone multiple substitutions
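Because repeated hits at the same site are invisible in a pairwise comparison, the observed proportion of differences p underestimates the true number of substitutions per site K. A common correction under the Jukes and Cantor one-parameter model is K = -3/4 ln(1 - 4p/3). The sketch below is an illustration under that model only; the function name and the example values of p are hypothetical.

```python
import math


def jc69_distance(p: float) -> float:
    """Estimated substitutions per site from the observed proportion of differences."""
    # At p = 0.75 the sequences are as different as random ones; the log is undefined.
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - (4 / 3) * p)


# The gap between observed p and corrected K widens as divergence increases.
for p in (0.05, 0.30, 0.60):
    print(f"p = {p:.2f}  ->  K = {jc69_distance(p):.3f}")
```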
The simple models of sequence variation assume the probability of substitution is the same across all sites in the sequence of interest. Name two violations of this assumption
- The probability of mutation occurring can vary across gene or genome regions depending on base pair composition and other factors (chromatin organisation)
- The probability of fixation can vary across gene or genome regions due to differing strengths of purifying selection
Discuss something that causes variable mutation rates within genomes
CpG dinucleotides - the cytosine is susceptible to methylation, and methylcytosine is easily deaminated to give thymine
The rate of transition substitutions will be higher in sequences with a lot of CG dinucleotides
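A small sketch of locating CpG sites and modelling the C to T change, assuming a single DNA strand written 5' to 3'. The function names and the example sequence are hypothetical, and the code does not model which cytosines are actually methylated.

```python
def cpg_positions(seq: str) -> list[int]:
    """0-based indices of every C that is immediately followed by G."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def deaminate(seq: str, pos: int) -> str:
    """Model deamination of a methylcytosine at index pos: C becomes T."""
    if seq[pos].upper() != "C":
        raise ValueError("position does not hold a cytosine")
    return seq[:pos] + "T" + seq[pos + 1:]


dna = "ATCGGCGTA"                 # made-up sequence with two CpG sites
print(cpg_positions(dna))         # [2, 5]
print(deaminate(dna, 2))          # ATTGGCGTA
```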
Discuss the impact of purifying selection
Regions under purifying selection evolve slower. Sequences of functional importance are more resistant to substitution
The slowest evolving regions of the genome are protein coding sequences and their regulatory regions
Non-synonymous mutation
Nucleotide change that alters the AA sequence
Missense mutation
Nucleotide substitution that results in an AA change
Nonsense mutation
Nucleotide substitution that results in a premature stop codon
Name two non-synonymous mutations
Missense and nonsense
Synonymous mutations
Nucleotide change that does not alter amino acid sequence
Are nucleotide changes at the first codon position synonymous?
Sometimes
Are nucleotide changes at the 2nd codon position synonymous
Never - they always result in an AA change, apart from changes between stop codons
Are nucleotide changes at the 3rd codon position synonymous?
Often
Where are substitution rates lowest?
Nondegenerate sites - 2nd codon position
Where are substitution rates intermediate?
Twofold degenerate sites - 1st codon position
Where are substitution rates highest?
Fourfold degenerate sites - 3rd codon position
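The codon positions named above are a rule of thumb; the degeneracy of a specific site depends on the codon it sits in. The sketch below classifies a site by counting how many of the three alternative nucleotides leave the amino acid unchanged, assuming the standard genetic code and DNA codons. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

LABELS = {0: "nondegenerate", 1: "twofold", 2: "threefold", 3: "fourfold"}


def site_degeneracy(codon: str, position: int) -> str:
    """Classify one codon position (1, 2 or 3) by its degeneracy."""
    codon = codon.upper()
    if codon not in CODON_TABLE or position not in (1, 2, 3):
        raise ValueError("expected a DNA codon and a position of 1, 2 or 3")
    i = position - 1
    original = CODON_TABLE[codon]
    # Count the alternative bases at this site that keep the same amino acid.
    synonymous = sum(
        1
        for base in BASES
        if base != codon[i]
        and CODON_TABLE[codon[:i] + base + codon[i + 1:]] == original
    )
    return LABELS[synonymous]


for codon in ("GGT", "CTA", "TTT"):
    print(codon, [site_degeneracy(codon, pos) for pos in (1, 2, 3)])
```

For the leucine codon CTA this reports a twofold first position, a nondegenerate second position and a fourfold third position.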
What is the effect of an AA substitution?
80% = deleterious
20% = neutral and drive the molecular clock
~0% = advantageous
Whether an AA substitution is deleterious depends on two things:
- The biochemical properties of the new AA- how similar is it to the AA it replaced
- The level of functional constraint - how necessary is this specific AA for the function of the protein
What is the biggest AA
Tryptophan
Are replacements by similar AAs observed more in nature?
Yes
Example of a protein with functional constraint - Hae
Haemoglobin - the AA sequence that forms the haem pocket is highly conserved; the remainder of the protein is only constrained to be hydrophilic
Example of a protein with functional constraint -H4
Histone 4 - Two copies of H4 are required in the histone octamer - almost the whole protein is highly conserved. There are 55 DNA differences between humans and wheat but only two AA differences
Example of a protein without functional constraint Fi
Fibrinopeptides - cleaved from fibrinogen to activate blood clot formation. Virtually every AA is acceptable at each position as long as it does not hinder cleavage
Name one of the fastest evolving proteins
Fibrinopeptides
Example of a disease that occurs when polymorphisms are observed at highly constrained positions
Gaucher disease
What should we compare when detecting selection?
Compare Ka to Ks
Ka = number of non-synonymous substitutions per non-synonymous site
Ks = number of synonymous substitutions per synonymous site
Values for detecting selection
Ka/Ks < 1 = Non-synonymous substitutions are rare relative to the background mutation rate - suggesting purifying selection
Ka/Ks > 1 = Non-synonymous substitutions are much more frequent than background mutation rate - suggesting positive selection
Ka/Ks ~ 1 = Non-synonymous substitutions occur at the same rate as the background mutation rate - suggesting neutral evolution
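The three cases can be sketched as a small helper, assuming Ka and Ks have already been estimated for a gene. The function name, the example values and the tolerance used to decide what counts as "about 1" are all hypothetical; in practice a statistical test would be used instead of a fixed cut-off.

```python
def interpret_ka_ks(ka: float, ks: float, tolerance: float = 0.1) -> str:
    """Map a Ka/Ks ratio onto the interpretation it suggests."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    # 'tolerance' is an arbitrary illustrative band around 1, not a standard value.
    if abs(ratio - 1) <= tolerance:
        return "neutral evolution"
    if ratio < 1:
        return "purifying selection"
    return "positive selection"


print(interpret_ka_ks(0.02, 0.20))  # purifying selection
print(interpret_ka_ks(0.30, 0.10))  # positive selection
print(interpret_ka_ks(0.10, 0.10))  # neutral evolution
```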
What is the Ka/Ks for most mammalian genes
Ka/Ks < 1
What can Ka/Ks >1 indicate
Adaptive evolution such as FOXP2 language gene
Evolution of non-coding regions
Slowest - nondegenerate sites
Intermediate - 5’ flanking regions
Fast - Fourfold degenerate sites, introns and the 3’ flanking region
Fastest - Pseudogenes - best proxy for neutral evolution
What is a nondegenerate site
A site where any mutation results in an amino acid substitution
Conserved enhancer region HACNS1
HACNS1 increases reporter gene expression in the human forearm, handplate, anterior wrist and thumb and corresponding leg regions
Morphological changes in the hands and feet were vital in human evolution - human gain-of-function
Who proposed the Molecular Clock hypothesis?
Zuckerkandl and Pauling
What is the Molecular Clock Hypothesis?
For any given protein, the rate of evolution is constant over time and across all lineages, as long as it retains its original function
Name two applications of the Molecular Clock
- Relative divergence times between species
- Evolutionary relationships between species
2 ways to find absolute divergence times used to calibrate the molecular clock
- Radiometric dating methods
- Historical dates
What is the upper limit of aDNA preservation?
2.6 million years - before that the Earth would have been too warm for permafrost, which is the best environment for preserving very old DNA
Calculating the rate of substitution (r)
r = K/2T - K is the number of substitutions per site between two lineages and T is the time since they diverged
T = K/2r
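A worked sketch of both formulas. The factor of 2 reflects that both lineages have been accumulating substitutions for time T since the split. The function names and the numbers (K = 0.1 substitutions per site, T = 50 million years) are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def substitution_rate(k: float, t: float) -> float:
    """r = K / 2T: substitutions per site per unit time along one lineage."""
    if t <= 0:
        raise ValueError("divergence time must be positive")
    return k / (2 * t)


def divergence_time(k: float, r: float) -> float:
    """T = K / 2r: time since two lineages split, given a calibrated rate."""
    if r <= 0:
        raise ValueError("rate must be positive")
    return k / (2 * r)


# Calibrate the clock on a pair with a known split, then date another pair.
r = substitution_rate(k=0.1, t=50e6)   # roughly 1e-9 per site per year
print(r)
print(divergence_time(k=0.04, r=r))    # roughly 20 million years
```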
What internal test of evolutionary rate is used to check whether the molecular clock holds when there is an outlier?
The relative rate test
The relative rate test
- A direct internal test of the clock requiring no external data
- Use a third species (C) you know to have branched off earlier than the A and B split.
- Measure the molecular distance from A to C and B to C (number of substitutions)
- Test if the distance from A to ancestor is the same as B to ancestor
- If so, the molecular clock holds and the rate of substitution is the same
The molecular clock holds if (formula)
Dac-Dbc=0
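A minimal sketch of the Dac - Dbc comparison, assuming three aligned, gap-free sequences and using the raw count of differing sites as the distance. The function names and sequences are hypothetical, and a real analysis would test whether the difference is significantly different from zero instead of checking for exactly zero.

```python
def count_differences(seq_x: str, seq_y: str) -> int:
    """Number of sites at which two aligned sequences differ."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_x, seq_y) if x != y)


def relative_rate_difference(seq_a: str, seq_b: str, outgroup_c: str) -> int:
    """Dac - Dbc; a value of 0 is consistent with equal rates in A and B."""
    d_ac = count_differences(seq_a, outgroup_c)
    d_bc = count_differences(seq_b, outgroup_c)
    return d_ac - d_bc


outgroup = "ACGAACGTGC"  # made-up outgroup sequence (species C)
print(relative_rate_difference("ACGTACGTAC", "ACGTACGTTC", outgroup))  # 0
print(relative_rate_difference("TCGTACCTAC", "ACGTACGTTC", outgroup))  # 2
```

A positive value suggests lineage A has accumulated more substitutions than lineage B since their split; a negative value suggests the opposite.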
Causes of variation in the mutation rate
- Mutagen exposure
- Fidelity of DNA repair and replication
- Generation times
Causes of variation in fixation rate
- Selection - functional constraints can differ across species
- Population size - smaller populations have stronger drift and weaker selection. Larger populations have weaker drift and stronger selection
Metabolic Rate Hypothesis
Smaller-bodied vertebrates generate higher levels of mutagenic oxygen radicals than larger vertebrates - rates are also faster in warm-blooded animals