Genetics of Adaptation Flashcards
What is intra-specific diversity?
Diversity within a species e.g. sexual dimorphism
what is inter-specific diversity?
Diversity between species
The four forces that can drive evolution
- Mutation
- Drift
- Migration
- selection
What is Adaptation?
A characteristic that enhances the survival or reproduction of organisms that bear it, relative to alternative character states.
Hardy-Weinberg equilibrium
p² + 2pq + q² = 1
What does HWE show?
- HWE gives a mathematical baseline of a non-evolving population to which evolving populations can be compared. A null model.
- Describes allele frequencies in a population from one generation to the next
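Worked example (a minimal sketch, assuming one locus with two alleles at frequencies p and q = 1 - p; the function name and the value p = 0.7 are illustrative and not from the original cards):

```python
def hwe_genotype_frequencies(p):
    """Expected genotype frequencies for a two-allele locus under HWE."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must be between 0 and 1")
    q = 1.0 - p
    # p^2 (AA) + 2pq (Aa) + q^2 (aa) = 1
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


freqs = hwe_genotype_frequencies(0.7)  # p = 0.7 is an arbitrary example value
print(freqs)
print(sum(freqs.values()))  # the three frequencies should sum to 1 (up to rounding)
```

Observed genotype frequencies that differ from these expectations suggest that at least one HWE assumption is violated.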
Assumptions of HWE
- Infinite population (no drift)
- No mutations
- No selection
- Mendelian inheritance
- Random mating
- No migration
how is HWE disrupted?
by one of the evolutionary forces
1. mutation
2. genetic drift
3. migration
4. natural selection
Evolutionary force - mutation
Random
Only process that brings new variation
Evolutionary force - Genetic drift
Random changes in unselected allele frequency
Happens more in smaller populations
tends to lower heterozygosity
can cause isolated populations to diverge
Evolutionary force - migration
Counteracts divergence due to drift
brings in new variation from previously isolated populations or rare hybridisation events
Evolutionary force - natural selection
Fitness and adaptation focussed
Differential survival and/or reproduction of [classes of entities] that differ in one or more characteristics
Fitness
Probability of survival x average number of offspring
(combination of survival and reproduction)
Fitness (w)
The fittest: w = 1
Not so fit: w = 0.5
The most unfit: w = 0
The difference between w and 1 = the selection coefficient (s)
How do fitness (w) and the selection coefficient (s) differ?
Fitness and the selection coefficient are complements of one another: s = 1 - w
E.g. if a genotype has a fitness (w) of 0.9 then s would be 0.1
(adds up to 1)
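Worked example (a rough sketch; the genotype names and the survival and offspring numbers are hypothetical, and scaling every genotype to the fittest one is an assumption consistent with "the fittest: w = 1"):

```python
def absolute_fitness(survival_probability, mean_offspring):
    """Fitness as probability of survival x average number of offspring."""
    return survival_probability * mean_offspring


def relative_fitness(fitness_by_genotype):
    """Scale fitnesses so that the fittest genotype has w = 1."""
    best = max(fitness_by_genotype.values())
    return {g: f / best for g, f in fitness_by_genotype.items()}


def selection_coefficient(w):
    """s is the difference between w and 1."""
    return 1.0 - w


# Hypothetical genotypes with made-up survival and offspring values.
absolute = {
    "AA": absolute_fitness(0.80, 5),
    "Aa": absolute_fitness(0.72, 5),
    "aa": absolute_fitness(0.40, 5),
}
for genotype, w in relative_fitness(absolute).items():
    print(genotype, round(w, 3), round(selection_coefficient(w), 3))
```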
How do we know natural selection exists?
- correlations between trait and environment
- Responses to experimental change in the environment
- Correlations between trait and fitness component
- Signatures in the genome
Problems with detecting selection
Is the adaptation just a consequence of physics/chemistry?
Genetic drift can spread traits
Ancestral state (exaptation - something that’s already present due to other reasons but may also have adaptive features).
Selection might not cause any change
Selection might not be working at the individual level
Linkage - linkage disequilibrium (Alleles appearing together more often than you would expect). Hitchhiking allele
what is standing genetic variation?
The number of alternative alleles for a gene at a given locus in the population
measure of variation
‘The diversity of choices’
What maintains genetic diversity?
- Mutation
- sex
- Ploidy
- balancing selection
- Heterozygous advantage
- frequency-dependent selection
Where is the mutation?
Somatic Mutation: at the individual level
Germline mutation: the only mutations that can be heritable
Somatic mutation
At the individual level
Germline mutation
The only mutations that can be heritable
What are point mutations?
Substitution
insertion
deletion
inversion
What is a Synonymous mutation?
Silent mutations - have no effect on the amino acid
What are non-synonymous mutations?
A mutation that changes the amino acid sequence encoded by the DNA
Missense mutation
- Change in a single amino acid within a protein
- Can affect how a protein folds/reacts etc
- Non-conservative mutations can have a huge impact
nonsense mutation
Stop mutation: a change in DNA that causes translation of a protein to terminate earlier than expected
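Worked example (a small sketch for telling synonymous, missense and nonsense substitutions apart; `CODON_TABLE` is a hand-picked subset of the standard genetic code, and the function name is illustrative):

```python
# Hand-picked subset of the standard genetic code (DNA codons), enough for the examples.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",
    "GTG": "Val",
    "TAT": "Tyr", "TAC": "Tyr",
    "TAA": "Stop", "TAG": "Stop",
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon change; assumes ref_codon is not a stop codon."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"  # same amino acid, silent change
    if alt_aa == "Stop":
        return "nonsense"    # premature stop codon
    return "missense"        # a single amino acid is changed


print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
print(classify_substitution("GAG", "GTG"))  # Glu -> Val
print(classify_substitution("TAT", "TAA"))  # Tyr -> Stop
```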
Frame shift
Changes the way the entire DNA sequence is read
Structural mutations
Happens at the scale of a whole region
Changes include deletion, duplication, inversion, substitution, translocation
Inversion mutations
a section of DNA breaks away from a chromosome during the reproductive process and then reattaches to the chromosome in reversed order
Inversions limit recombination
Super gene formation
- genomic regions containing sets of tightly linked loci.
- Cause big polymorphism
- Gene complexes.
- A lot are caused by inversions
Male ruff bird morphs
- Independent male
- Satellite male
- Faeder (female mimic male)
Why are there morphs of male ruff birds?
- Inversion on chromosome 11
- Faeder and satellite males have inversion of ancestral gene
- Accumulation of mutations in this region has resulted in two different morphs
- Divergence between Faeder and satellites
Female morphs in Papilio polytes butterflies
- Inversion in a supergene containing the doublesex gene
- Two female morphs mimic other toxic species
- Sex-limited; with no recombination, the polymorphism is maintained
The doublesex gene
Transcription factor that controls somatic sex differentiation in a range of insects
results in highly differentiated genotypes
Why do rates of mutation vary?
Depends on:
- Type of mutation
- Genome location
- Species
- sex
Why do males have higher mutation rates than females?
Produce more gametes - more cell divisions
mutation rates are often sex-biased
sperm count across species differs
Organophosphate insecticides used to control Culex pipiens (mosquitoes)
Carry West Nile virus
strong selection pressure
for insecticide - be resistant or die
Target site resistance
what is target site resistance? (insecticide example)
mutation in the enzyme targeted by the insecticide
- a single substitution mutation alters shape of binding site
- insecticide can no longer bind
likely the resistance arose once (from a single base pair mutation) and spread via gene flow
Oceanic cricket example (mutation in adaptation)
Parasite fly locates host cricket via chirping sound
Mutation causing flat, soundless wings rose in frequency
Male only mutation (X chromosome)
Happened twice in separate populations (curly & flat silent wings)
convergent evolution
Why are most new mutations deleterious?
Chance
Very few mutations have a positive effect on fitness
how does sex cause variation?
Independent assortment
Random fertilisation
Recombination
How does ploidy cause variation?
Recessive alleles are sheltered from selection (diploidy)
The rarer a recessive allele is, the greater the retention rate
Maintains alleles that are less favourable during current conditions but may be favourable when the environment changes
Balancing selection is?
Selection that maintains polymorphism
Heterozygous advantage (balancing selection)
Heterozygote is fitter than either homozygote
E.g. sickle cell trait and malaria resistance
Sickle cell trait provides malaria resistance
Maintaining variation
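Worked example (a minimal sketch of how heterozygote advantage maintains variation, using the standard one-locus selection recursion under random mating; the fitness values 0.8, 1.0 and 0.6 are hypothetical and not from the original cards):

```python
def next_allele_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection (random mating assumed)."""
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def iterate_selection(p0, generations, w_AA, w_Aa, w_aa):
    """Apply the recursion repeatedly and return the final frequency of A."""
    p = p0
    for _ in range(generations):
        p = next_allele_frequency(p, w_AA, w_Aa, w_aa)
    return p


# Hypothetical fitnesses: the heterozygote is fittest.
for start in (0.01, 0.5, 0.99):
    print(start, round(iterate_selection(start, 500, 0.8, 1.0, 0.6), 4))
```

With these assumed values, neither allele is lost: A settles at an intermediate frequency whether it starts rare or common, which is the sense in which the polymorphism is maintained.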
Frequency-dependent selection (balancing selection)
The fitness of an allele changes depending on how common it is
Positive frequency-dependent selection
Strength in numbers
Fitness and frequencies are positively correlated
runaway selection - get more and more fit
Negative frequency-dependent selection
The advantage of being rare
fitness and frequency are negatively correlated
fitness decreases when allele becomes more common
Example of negative frequency-dependent selection (cichlid cycles)
Lake cichlids have right- or left-sided mouth morphs (determines which side they attack prey from)
Prey become wary of side with most attacks
two morphs are adaptive
keeps population of each morph stable
What are the two phases of evolution?
- Within-species polymorphism - variation within species
- Between-species divergence - some of this variation is fixed which leads to divergence between species
- Two interconnected processes
- Divergence is due to substitutions
- Polymorphism is due to segregating variants
Timescale of divergence is longer than that of polymorphism
the neutral theory
- The fate of most mutations contributing to molecular diversity is determined by drift rather than selection
- i.e., the mutations have no selection cost or it is so weak compared to drift that they behave like neutral variants
Polymorphism
Difference between individuals of the same species
Divergence
Difference between individuals of different species
Molecular evolution
- Evolution is changes in allele frequencies over time
- A chromosome carries one possible allele at any given locus
- Mutation generates a new allele which can be inherited by its carrier’s descendants
- Each new allele starts as a mutation in a single individual
- Frequency of the allele can increase or decrease in each passing generation (due to genetic drift, selection etc)
Allele frequencies
- Changes can be caused by genetic drift
- Even beneficial mutations can be lost when rare
- Stochastic loss
- Polymorphism must happen before any allele can be fixed
- Polymorphism and divergence are linked
What would happen to the neutral theory if selection was happening?
The neutral theory would be rejected
Measuring how much variation within a natural population there is at an average locus
Neutral theory
Measure Single nucleotide polymorphisms (SNPs) between members of a population
The nucleotide diversity (π)
The average number of nucleotide differences per site between two DNA sequences, taken over all possible pairs in the sample population.
The nucleotide diversity (π)
Equation
π = Ndiff / (Npc × L)
The nucleotide diversity (π)
Equation breakdown
- π = Ndiff / (Npc × L)
- Ndiff → the total number of pairwise differences for all possible comparisons
- Npc → Total number of pairwise comparisons Npc = n(n-1)/2 → n = sample size
- L → the length of the sequenced region
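Worked example (a minimal sketch of the π calculation, assuming aligned sequences of equal length; the three sample sequences and the function name are illustrative):

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """pi = Ndiff / (Npc * L) for a list of aligned, equal-length sequences."""
    n = len(sequences)
    length = len(sequences[0])
    if n < 2 or any(len(s) != length for s in sequences):
        raise ValueError("need at least two aligned sequences of equal length")
    # Ndiff: total pairwise differences over all possible comparisons
    n_diff = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pc = n * (n - 1) // 2  # number of pairwise comparisons
    return n_diff / (n_pc * length)


sample = ["AAGTCTTACG", "ATGTCTTGCG", "AAGTCTTGCG"]  # made-up sample
print(nucleotide_diversity(sample))  # Ndiff = 4, Npc = 3, L = 10
```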
What does calculating the nucleotide diversity (π) show?
- π is calculated from neutral sites on a genome to determine what normal variation is when there is only genetic drift. Not under selection. just random variation.
- Species with higher spawning rates may have more variation due to many gametes being produced and lots of scope for mutations to arise.
The Wright-Fisher Model
- Makes explicit, testable predictions about patterns of polymorphism and divergence
- E.g. for detecting selection.
- The standard model of evolution
The Wright-Fisher Model assumptions
Describes the sampling of alleles in a population with:
- No selection
- No mutation
- No migration
- Non-overlapping generations
- Random mating between hermaphrodites.
Assumptions don’t work for most species
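Worked example (a rough simulation sketch of the sampling described above, assuming a diploid population so that each generation draws 2N allele copies at random from the previous one; the function name, seed and parameter values are illustrative):

```python
import random


def wright_fisher(p0, population_size, generations, seed=None):
    """Simulate drift at one locus: no selection, mutation or migration,
    non-overlapping generations. Returns the allele frequency trajectory."""
    rng = random.Random(seed)
    copies = 2 * population_size  # diploid: 2N allele copies
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each copy in the next generation is drawn at random from the current pool.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


small = wright_fisher(0.5, population_size=10, generations=50, seed=1)
large = wright_fisher(0.5, population_size=1000, generations=50, seed=1)
print(small[-1], large[-1])
```

Running this with different seeds should show frequencies wandering much further from 0.5 in the small population, which is the sense in which drift is stronger when the population is small.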
Effective Population Size (Ne)
Number that tells the strength of genetic drift within a population
How does Ne affect diversity levels?
- Loss of genetic variation by drift is faster with a smaller Ne
- Populations with smaller Ne tend to be less polymorphic
- Mutation rates affect diversity levels
- Higher mutations = higher diversity
- Larger Ne = lower loss of diversity
Ne and Drift strength
The Wright-Fisher model predicts that the expected level of diversity at neutral sites is
E(π) = 4Neu
- E → expected value of a statistic
- u → mutation rate per site per generation in the neutral region
Ne = π / (4u) (rearranged from E(π) = 4Neu)
- Ne = effective population size
- π = nucleotide diversity
- u = the mutation rate per site per generation
Calculating effective population size (Ne)
Ne = π / (4u) (rearranged from E(π) = 4Neu)
- Ne = effective population size
- π = nucleotide diversity
- u = the mutation rate per site per generation
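Worked example (a minimal sketch; the mutation rate is the human estimate quoted elsewhere in these cards, while π = 0.001 is a hypothetical value chosen only to illustrate the arithmetic):

```python
def effective_population_size(pi, mu):
    """Ne = pi / (4u), rearranged from E(pi) = 4 * Ne * u."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


pi_example = 0.001   # hypothetical nucleotide diversity at neutral sites
mu_human = 1.18e-8   # per site per generation (value quoted in these cards)
print(round(effective_population_size(pi_example, mu_human)))
```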
Human evolutionary history
- Effect of genetic drift is considerable
- Genetic bottleneck
Genetic bottlenecks
- Rapid reduction in census size and genetic diversity between generations.
- Variation from mutations takes time to accumulate
- Small amount of standing genetic variation after a bottleneck event
Consequences of genetic bottlenecks on African and non-African populations
- African populations haven’t experienced as many bottlenecks and so have a higher Ne (effective population size)
- Non-African populations should have a lower Ne
- Nucleotide diversity (π) is higher in African populations than non-African populations
Deleterious variants
Mostly get purged or remain at low frequencies
Beneficial variants
Mostly increase and remain at high frequencies
Synonymous polymorphisms
Mutations in protein-coding regions that do not affect the amino acid sequence
* Most mutations are probably neutral
* Synonymous diversity (πS)
Non-synonymous polymorphisms
Mutations in protein-coding regions that change the amino acid sequence
- May lead to a change in fitness
- Non-synonymous diversity (πA)
Synonymous vs non-synonymous diversity (prediction)
- Synonymous diversity (πS)
- Non-synonymous diversity (πA)
- Prediction: πA < πS
Synonymous vs non-synonymous diversity
- Most new mutations are deleterious and will be subject to purifying selection
- Non-synonymous mutations are selected against
Human bottleneck consequences
- The further a population is from Africa, the higher the frequency of deleterious alleles
- Purifying selection has been less able to purge them because the effective population size is smaller.
Sex chromosomes have different effective population sizes compared to autosomes, Why?
In a population with equal males and females the effective population size (Ne) of the X & Y chromosomes will be uneven
- NeX = ¾ NeA
- NeY = ¼ NeA
- Drift is stronger on those chromosomes
- Selection is weaker
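Worked example (a small sketch of where the ¾ and ¼ come from, simply counting chromosome copies; it assumes XY males, XX females and that Ne scales with the number of copies, and the function names are illustrative):

```python
from fractions import Fraction


def chromosome_copies(n_males, n_females):
    """Number of copies of an autosome, the X and the Y in the population."""
    autosome = 2 * (n_males + n_females)  # everyone carries two copies
    x = n_males + 2 * n_females           # males carry one X, females two
    y = n_males                           # only males carry a Y
    return autosome, x, y


def ne_ratios(n_males, n_females):
    """Copy numbers of X and Y relative to an autosome."""
    autosome, x, y = chromosome_copies(n_males, n_females)
    return Fraction(x, autosome), Fraction(y, autosome)


print(ne_ratios(50, 50))  # equal numbers of males and females
```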
Faster X effect
- Rate of evolution is faster on the X chromosomes
- Large X effect
- X chromosome is disproportionately involved in speciation
- Weak purifying selection leads to degeneration of the Y chromosome
Two phases of evolution
- Within-species polymorphism - variation within species
- Between-species divergence - some of this variation is fixed which leads to divergence between species
- Two interconnected processes
- Divergence is due to substitutions
Polymorphism is due to segregating variants
Molecular Evolution
- Changes in allele frequencies over time
- Each new allele starts as a mutation on a single chromosome in a single individual
- In a diploid population of size N the initial frequency of a new mutant = 1/(2N)
Fixation probability as a function of the fitness effect of the new mutation
Neutral = 1/(2N)
Beneficial > 1/(2N); more beneficial the bigger the increase
Deleterious < 1/(2N); more deleterious, the bigger the decrease
Rate of mutation should be?
The rate of neutral molecular evolution
- The neutral mutation rate equals the rate of substitution → population size shouldn’t matter
- The rate of accumulation of new substitutions per generation depends entirely on the neutral mutation rate and is independent of the population size.
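Worked example (a minimal sketch of why population size cancels out, assuming a diploid population of size N in which 2Nu new neutral mutations arise per site per generation and each fixes with probability 1/(2N); the function name is illustrative):

```python
def neutral_substitution_rate(population_size, mu):
    """Expected neutral substitutions per site per generation."""
    new_mutations = 2 * population_size * mu           # mutations entering the population
    fixation_probability = 1 / (2 * population_size)   # chance each neutral one fixes
    return new_mutations * fixation_probability


mu = 1.18e-8  # per site per generation (human estimate quoted in these cards)
for n in (100, 10_000, 1_000_000):
    print(n, neutral_substitution_rate(n, mu))
```

Every population size should give back (up to floating-point rounding) the mutation rate itself.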
Neutral theory predicts?
- A linear relationship between T (time in generations) and K (the expected number of substitutions per site between two homologous DNA sequences from two species, i.e. the differences between the two species’ DNA sequences).
- the more genetically distant two populations are from one another the more DNA differences are expected to accumulate
The molecular clock
- The neutral model predicts that the rate of molecular evolution should be constant over time (it depends only on the neutral mutation rate u)
- This implies a molecular clock that can be used to estimate times of divergence of taxa when palaeontological data is absent .
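Worked example (a rough sketch of using the clock to date a split; it assumes the standard neutral expectation K = 2uT, where the factor of 2 reflects substitutions accumulating along both lineages since divergence, and K = 0.01 is a hypothetical value):

```python
def divergence_time_generations(k, mu):
    """Estimate T (in generations) from divergence K, assuming K = 2 * u * T."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return k / (2.0 * mu)


k_example = 0.01  # hypothetical divergence per site between two species
mu = 1.18e-8      # per site per generation (human estimate quoted in these cards)
print(divergence_time_generations(k_example, mu))
```

Converting generations to years would need a generation time, which is a further assumption not made here.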
What does the neutral theory predict between T and K?
- Linear relationship
- T = time in generations
- K = the amount of divergence (amount of genetic differences between populations)
Measuring divergence (K)
Compare two homologous sequences and calculate the proportion of nucleotide sites that are different.
- D = Total number of differences
- L = Total number of sites considered
- K = D/L
Measuring divergence (K) (example)
K = D/L
Species 1: AAGTCTTACG
Species 2: ATGTCTTGCG
- D = 2
- L = 10 (number of nucleotides)
- K = 2/10 = 0.2
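The same calculation as a short sketch (assuming two aligned sequences of equal length; the function name is illustrative):

```python
def divergence(seq1, seq2):
    """Return (D, L, K) for two aligned homologous sequences, with K = D / L."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned and of equal length")
    d = sum(a != b for a, b in zip(seq1, seq2))  # D: number of differing sites
    length = len(seq1)                           # L: number of sites compared
    return d, length, d / length


print(divergence("AAGTCTTACG", "ATGTCTTGCG"))  # (2, 10, 0.2)
```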
Estimated mutation rate of humans
1.18 x 10^-8 per site per generation
Parsimony principle
The explanation that involves the fewest changes (simplest)
Divergence of the mitochondrial DNA
- Rate of divergence of mitochondrial DNA is around 10x higher than that of nuclear DNA
- probably due to high concentrations of mutagens
Estimating mutation rates: Pseudogenes
Investigating neutral divergence:
A pseudogene has been rendered non-functional by mutations that prevent its expression (e.g. premature stop codons)
Reasons for departures from a strict clock: Primate vs Rodent
- The rate of evolution along the primate lineage is ~9% slower than that along the rodent lineage.
- Rodents have many more generations in smaller time-frames than primates - more mutations.
The generation-time effect hypothesis
- Errors in DNA replication in germ-line cells are a major source of mutation
- Hypothesis predicts higher mutation rate per time in species with shorter generation time (e.g. mice relative to humans).
- Species with shorter generation length will undergo more germ-line cell divisions per unit of time and thus accumulate errors at a higher rate
The generation-time effect hypothesis (simple explanation)
- Shorter generation lengths lead to more mutations.
- Higher mutation rates (per unit time) lead to higher rates of evolution.
Why is population genetics important?
A way of telling if variation is due to selection or genetic drift
What does positive selection do to variation within a genome?
Positive selection affects nearby genomic regions
What is genetic hitchhiking?
- When an allele increases in frequency because a nearby allele under selection is experiencing a selective sweep (all surrounding alleles will be pulled along).
- Occurs even if ‘pulled along’ alleles don’t have any fitness benefit
How does strong selection affect genetic variation?
Strong selection reduces genetic variation around gene
What is haplotype homozygosity?
The probability of selecting two identical haplotypes at random from a population
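Worked example (a minimal sketch, assuming haplotypes are given as strings and that the two draws are made with replacement, so the value is the sum of squared haplotype frequencies; both samples are made up):

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Probability that two haplotypes drawn at random (with replacement) are identical."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


before_sweep = ["ACGT", "ACGA", "TCGT", "TCGA"]  # four different haplotypes
after_sweep = ["ACGT", "ACGT", "ACGT", "TCGA"]   # one haplotype at high frequency
print(haplotype_homozygosity(before_sweep))  # 0.25
print(haplotype_homozygosity(after_sweep))   # 0.625
```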
What is increased haplotype homozygosity?
- After a selective sweep every individual in the population will be homozygous for a particular haplotype
- Haplotype will be at high frequency
- Increased linkage disequilibrium
What is linkage disequilibrium?
- Positions occurring together more often than expected by chance
- High correlations between positions on a genome
What are SNPs?
Single Nucleotide Polymorphisms
The Site frequency spectrum (SFS)
- Measures how many SNPs are present in how many individuals
- Always expected to have highest number at singletons (SNP only occurring once).
- Frequency tails off as position is present in more individuals
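Worked example (a small sketch of building an SFS, assuming a 0/1 matrix in which rows are sampled sequences, columns are SNP sites and 1 marks the variant allele; the matrix is made up):

```python
from collections import Counter


def site_frequency_spectrum(genotype_matrix):
    """Return [number of SNPs seen in 1 sample, in 2 samples, ..., in n-1 samples]."""
    n = len(genotype_matrix)
    # Count how many sampled sequences carry the variant at each site.
    variant_counts = [sum(column) for column in zip(*genotype_matrix)]
    # Sites where no sample or every sample carries the variant are not polymorphic.
    counts = Counter(c for c in variant_counts if 0 < c < n)
    return [counts.get(k, 0) for k in range(1, n)]


matrix = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1],
]
print(site_frequency_spectrum(matrix))  # [3, 1, 1]: three singletons, one doubleton, one tripleton
```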
Tajima’s D (D)
- Under neutrality D value will be 0
- Statistically significant departures from 0 suggest the action of other evolutionary forces
- D compares the relative abundance of intermediate and low frequency variants
Outcomes of Tajima’s D
Relative to expected under the neutral model:
* If there is an excess of low-frequency variants then D < 0
* If there is an excess of intermediate-frequency variants then D > 0
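Worked example (a sketch of Tajima's D using the standard constants from Tajima's 1989 formulation, which are not given in these cards; it assumes at least four aligned sequences with at least one segregating site, and uses the mean number of pairwise differences across the whole region rather than per site):

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    # S: number of segregating (polymorphic) sites
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    pairs = list(combinations(sequences, 2))
    # Mean number of pairwise differences (not divided by sequence length)
    mean_diff = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (mean_diff - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Made-up samples: every variant is a singleton vs every variant is at 50%.
low_freq = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]
mid_freq = ["TTTT", "TTTT", "TTTT", "AAAA", "AAAA", "AAAA"]
print(tajimas_d(low_freq))  # negative: excess of low-frequency variants
print(tajimas_d(mid_freq))  # positive: excess of intermediate-frequency variants
```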
Ancestral and derived variants
- Ancestral and derived alleles can be inferred by using data from an outgroup species (lies just outside of the phylogeny of the considered groups).
- Under positive selection there will be an increase in high-frequency (derived) variants
Fay and Wu’s (H)
Under neutral evolution:
- Fay and Wu’s H should be about 0
- H would be negative when there is an excess of high-frequency variants
Why do we need multiple tests for selection in a population?
- Tajima’s D and Fay and Wu’s H are based on a null model.
- A rejection means the null model does not hold
- Other evolutionary forces can lead to departures
- These are false positives with respect to detecting selection