Genetic Drift Flashcards

1
Q

What is the definition of genetic drift?

A
  • Genetic drift is a kind of random sampling of alleles entering the next generation
  • Drift acts as a dispersive force that removes variation
  • Any population of finite size will be subject to genetic drift
  • Can be thought of as ‘accidents of sampling’ - which influence which alleles make it into the next generation
2
Q

Name 3 sources of randomness that contribute to genetic drift

A
  • Which alleles come together in gametes is down to chance - e.g., for a heterozygous individual there is a 50:50 chance as to which allele ends up in the fertilised zygote
  • Variation in chance of survival
  • Variation in chance of reproductive success
3
Q

What effect does drift have on variants?

A
  • Rare variants are easily lost due to chance events
  • Common variants are less sensitive to chance events
4
Q

Give a famous example of drift in humans

A

Blood groups:
- Allele B of ABO, N allele of MN and Rh- allele are absent in Polynesia
- Alleles lost due to founder effects during colonisation of islands across the Pacific by small groups
- Ioannidis et al., 2021 - Nature 597

5
Q

How does the strength of drift change dependent on population size?

A
  • Drift is stronger in small populations (faster loss of variation)
  • Over time, genetic drift leads to a decline in genetic variation due to the fixation or loss of alleles
  • The larger the population, the longer this takes
6
Q

What is the Wright-Fisher model?

A
  • Drift affects all DNA - whether under selection or not
  • Wright-Fisher model describes a population evolving from drift alone (no selection)
  • Is used as a null scenario for testing patterns of genetic variation
7
Q

What assumptions are made for the Wright-Fisher model?

A
  • Non-overlapping generations
  • Constant size population (N individuals, 2N lineages)
  • Random union of gametes (‘random mating’) - each child has 2 parents
  • Sexual reproduction with all individuals hermaphrodite and able to self fertilise
  • Poisson distribution for reproductive success
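
The assumptions above can be turned into a very small simulation. This is a minimal sketch, not a reference implementation: the function name `wright_fisher` and its parameters are hypothetical, and it assumes a single locus with two alleles, no mutation and no selection. Each generation, all 2N allele copies are drawn at random from the previous generation.

```python
import random

def wright_fisher(p0, n_individuals, generations, seed=None):
    """Track one allele's frequency under drift alone (assumed: no selection, no mutation)."""
    rng = random.Random(seed)
    copies = 2 * n_individuals            # 2N allele copies in a diploid population
    freq = p0
    trajectory = [freq]
    for _ in range(generations):
        # Random union of gametes: each copy is sampled from the current frequency
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies             # constant population size, non-overlapping generations
        trajectory.append(freq)
    return trajectory

traj = wright_fisher(p0=0.5, n_individuals=20, generations=50, seed=1)
print(traj[-1])   # frequency after 50 generations of drift
```

Re-running with different seeds gives different trajectories, which is the 'accidents of sampling' idea. Once the frequency hits 0 or 1 it stays there.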
8
Q

What is the effective population size (Ne)?

A

Ne is the size of the Wright-Fisher population equivalent to the real population being studied
- It is unlikely that a real population will conform exactly to the assumptions of the Wright-Fisher model
- However, real populations behave in a similar way to Wright-Fisher populations of reduced size
- Ne is usually smaller than the real (census) population size, N

9
Q

How does N vs Ne change dependent on pop size?

A
  • Drift is stronger in small populations - so genetic variation and Ne are lower in fluctuating populations than in constant-size populations with the same maximum size (N)
  • Implies individuals from populations with smaller Ne are more likely to share a common ancestor in the recent past
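
One way to see why fluctuating populations have a low Ne is the standard textbook approximation that long-term Ne is the harmonic mean of the per-generation sizes. That formula is not stated on the card, so treat this as an illustrative sketch. The function name and the population sizes are made up.

```python
from statistics import harmonic_mean

def effective_size_fluctuating(sizes):
    """Approximate long-term Ne as the harmonic mean of per-generation sizes (assumed model)."""
    return harmonic_mean(sizes)

# Hypothetical population: mostly 1000 individuals, with one crash to 10
sizes = [1000, 1000, 10, 1000, 1000]
ne = effective_size_fluctuating(sizes)
print(round(ne, 1), max(sizes))
```

The single small generation dominates the result, so Ne ends up far closer to 10 than to 1000.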
10
Q

Give an example of how variation in mating systems can cause deviation from Wright-Fisher assumptions

A

E.g., elephant seals
- Have highly polygynous mating systems - small number of males monopolise matings with a large number of females
- This leads to a larger variation in reproductive success between individuals
- So deviates from Wright-Fisher - can have consequences for the expected amount of genetic variation in the population
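
The effect of a skewed mating system on Ne can be illustrated with the standard formula for unequal numbers of breeding males and females, Ne = 4·Nm·Nf / (Nm + Nf). This formula is not on the card, and the numbers below are hypothetical rather than real elephant seal data.

```python
def effective_size_sex_ratio(breeding_males, breeding_females):
    """Ne when the numbers of breeding males and females differ (assumed standard formula)."""
    return 4 * breeding_males * breeding_females / (breeding_males + breeding_females)

# 100 breeders with an even sex ratio vs. 100 breeders where 5 males do all the mating
print(effective_size_sex_ratio(50, 50))   # 100.0
print(effective_size_sex_ratio(5, 95))    # 19.0
```

With the same number of breeding adults, the polygynous case behaves like a much smaller Wright-Fisher population.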

11
Q

What are the 4 key features of genetic drift?

A
  • Random - unpredictable changes in allele frequencies between generations
  • Dispersive force - reduces variation within populations and causes allele frequencies to diverge between populations
  • Neutral - all alleles influenced in same way
  • Related inversely to Ne - drift stronger in small populations
12
Q

What are the probabilities of fixation?

A
  • 1/(2N) for a specific allele copy (e.g., a new neutral mutation)
  • Equal to its current frequency in the population for a particular allelic variant
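
The claim that a neutral variant fixes with probability equal to its current frequency can be checked by simulation. This is a rough sketch under Wright-Fisher assumptions (drift only, two alleles). The function name, parameters and replicate count are hypothetical choices.

```python
import random

def fixation_fraction(p0, n_individuals, replicates, seed=0):
    """Fraction of replicate populations in which the focal allele fixes (drift only)."""
    rng = random.Random(seed)
    copies = 2 * n_individuals
    fixed = 0
    for _ in range(replicates):
        count = round(p0 * copies)
        # Keep resampling until the allele is lost (0 copies) or fixed (2N copies)
        while 0 < count < copies:
            freq = count / copies
            count = sum(1 for _ in range(copies) if rng.random() < freq)
        if count == copies:
            fixed += 1
    return fixed / replicates

estimate = fixation_fraction(p0=0.3, n_individuals=10, replicates=2000)
print(estimate)   # should land close to the starting frequency of 0.3
```

Setting `p0` to 1/(2N) (a single copy) gives the new-mutation case from the first bullet.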
13
Q

In what different ways can you predict the expected amount of genetic variation in neutrally evolving populations (drift in constant-sized populations)?

A
  • Wright-Fisher model and ‘forward in time’ perspective of genetic drift
  • Neutral theory and infinite alleles model
  • Mutation-drift equilibrium
  • Molecular clocks
  • Coalescent theory - ‘backwards in time’ perspective of genetic drift
  • Gene genealogies
14
Q

What is the decay of heterozygosity?

A
  • Heterozygosity tending to 0 over time
  • Tends to 0 faster with a smaller N (pop size)
  • Ht = H0 (1 - 1/(2N))^t
  • Decay is geometric
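
The decay formula is easy to evaluate directly. The sketch below uses a hypothetical helper name and made-up population sizes to compare how much heterozygosity is left after 100 generations.

```python
def heterozygosity(h0, n_individuals, t):
    """Expected heterozygosity after t generations: Ht = H0 * (1 - 1/(2N))^t."""
    return h0 * (1 - 1 / (2 * n_individuals)) ** t

# Same starting heterozygosity, three different population sizes
for n in (10, 100, 1000):
    print(n, round(heterozygosity(0.5, n, 100), 4))
```

The smallest population loses almost all of its heterozygosity over this period, while the largest loses only a small fraction.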
15
Q

What is a population bottleneck and what effect does it have on genetic variation?

A

A sharp reduction in population size due to an event - e.g., an earthquake/flood
- Pop size and genetic variation drops
- Pop size recovers fast
- Genetic variation recovers more slowly than population size - as only way to gain variation is through mutation

16
Q

What is the infinite alleles (or sites - when referring to sequence variation) model?

A
  • Is the case where each mutation is to a novel state
  • In this case, under neutrality, a large number of alleles can be maintained in large populations
  • However, when heterozygotes are the fittest genotype, a ‘genetic load’ is created due to the existence of homozygotes for less fit alleles
  • Crow and Kimura - showed this creates an upper limit for the number of alleles - since selective advantage of fitter alleles is balanced out by the genetic load
  • This limit appeared inconsistent with high levels of variation seen in protein variation of Drosophila - led to Kimura suggesting most mutations had to be neutral
17
Q

What is the Nearly Neutral theory?

A

The idea that many mutations are slightly deleterious, so their fate depends on population size: in large populations with short generation times, noncoding DNA evolves faster while protein evolution is slowed by selection, which outweighs drift in large populations
- Proposed by Tomoko Ohta

18
Q

Explain the Mutation-Drift balance?

A
  • Mutation inputs new alleles into population
  • Drift removes alleles from population
  • Therefore, in neutrally evolving population, the amount of diversity will move to an equilibrium value - the magnitude of which depends on the balance of the 2 processes
  • Larger populations receive more new mutations each generation and are less sensitive to drift - so should have greater equilibrium levels of variation than small populations in the neutral case
18
Q

What is theta?

A

Population mutation parameter:
- Key parameter needed to estimate the level of genetic variation under neutral model
- Theta = 4Nu

18
Q

What can you use to predict the amount of genetic variation that should be present in a population?

A

Mutation-Drift balance in the Wright-Fisher model
- Drift - decreases diversity (1/2N)
- Mutation increases diversity (2Nu) - u = mutation rate
- From infinite alleles model use 4Nu
- 4Nu = theta
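
As a worked example of what theta predicts: under the infinite alleles model, the standard result for expected equilibrium heterozygosity is H = theta / (1 + theta). That formula is not written on the card, so this is an illustrative sketch. The function name and parameter values are hypothetical.

```python
def equilibrium_heterozygosity(n_individuals, mutation_rate):
    """Expected heterozygosity at mutation-drift balance (infinite alleles model, assumed)."""
    theta = 4 * n_individuals * mutation_rate   # theta = 4Nu
    return theta / (1 + theta)

# Same mutation rate, two population sizes
small = equilibrium_heterozygosity(1_000, 1e-5)
large = equilibrium_heterozygosity(10_000, 1e-5)
print(round(small, 4), round(large, 4))
```

The larger population settles at a higher level of variation, as mutation-drift balance predicts.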

19
Q

What is the neutral theory of evolution?

A
  • Mutation = new allele
  • What is the probability that this new allele will become fixed?
  • Mutation rate = mu (u) - expected number of new alleles arising per generation = 2Nu
  • Probability of fixation of a new neutral allele = 1/(2N)
  • Rate at which new alleles arise and go on to fix = 2Nu x 1/(2N) = u
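
The arithmetic on this card can be written out in one line. This sketch uses the card's own symbols (N for population size, u for mutation rate) and assumes a diploid population with 2N allele copies and strictly neutral mutations:

```latex
\[
\underbrace{2Nu}_{\text{new alleles per generation}}
\;\times\;
\underbrace{\frac{1}{2N}}_{\text{chance each one fixes}}
\;=\; u
\]
```

The population size cancels, so under these assumptions the rate at which neutral mutations fix does not depend on N.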
20
Q

Describe the molecular clock with its parameters

A

The hypothesis that DNA and protein sequences evolve at a constant rate over time and in different organisms
- p = rate of evolution (accumulation of mutations fixed between species)
- p = u - since we saw that, for neutral mutations, the rate at which new alleles fix is equal to the mutation rate
- For T1 in Species A - mutations are not substitutions but polymorphisms within species (transient entities)
- The number of mutations fixed between two species along one branch: T2u
- i.e. in neutral case, the expected number of mutations /genetic diversity along a branch is proportional to the time that separates them - so implies genetic variation is accumulating in a clock-like way where the ticks on the clock relate to the magnitude of the mutation rate

21
Q

What is coalescent theory and how does it differ from the Wright-Fisher model?

A
  • Alternative way of looking at drift - looking backwards in time
  • Works out the time to the most recent common ancestor (TMRCA)
  • Follow haplotypes back in time - seeing them ‘merge’ as they lose unique mutations
  • Wright-Fisher model looks forward in time to predict variation in the future - but has limitations: under genetic drift, we cannot predict which lineage/allele will persist in future, and it is difficult to understand what may have happened in the past. And, from an empirical standpoint, we can only collect samples from the present - so we are often interested in projecting histories back in time
22
Q

Why is the coalescent important?

A
  • Real world data - we only have access to contemporary sequences or alleles, which form the tips of the genealogy - can't see the full genealogy for every generation in the past
  • Coalescent allows us to reconstruct the history of surviving lineages and make inferences about the evolutionary processes which influenced them
  • The coalescent provides important framework for working with sequence and other genetic data
23
Q

Define the coalescent and what can you calculate?

A

The probability of any haplotype pair coalescing in the next (previous) generation

Can calculate:
- The probability of coalescence at a generation t in the past
- Mean time for a coalescence event to occur
- Time for all haplotypes to coalesce into a single lineage - Time to Most Recent Common Ancestor (TMRCA)
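
For a single pair of haplotypes, the first two quantities can be sketched directly. This assumes a Wright-Fisher population with 2N allele copies, so a pair coalesces in any given generation with probability 1/(2N). The function names are hypothetical.

```python
def prob_coalesce_at(t, n_individuals):
    """Probability that a pair first coalesces exactly t generations back."""
    p = 1 / (2 * n_individuals)
    # No coalescence for t - 1 generations, then coalescence in generation t
    return (1 - p) ** (t - 1) * p

def mean_pairwise_time(n_individuals):
    """Mean waiting time for a pair to coalesce: 2N generations."""
    return 2 * n_individuals

n = 50
print(prob_coalesce_at(1, n))        # 1/(2N) = 0.01
print(mean_pairwise_time(n))         # 100
```

Summing `t * prob_coalesce_at(t, n)` over many generations recovers the same mean of 2N.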

24
Q

What assumptions need to be made for deriving a neutral coalescent model?

A
  • Lineages coalesce independently
  • Coalescence is rare - no more than a single coalescent event per generation
25
Q

How can you draw a coalescent genealogy?

A
  • Start with k = number of sequences sampled; go back T generations, combine two lineages at random, decrease k by 1, and repeat until k = 1
  • If k=1, then all the lineages meet back at a common ancestor
  • T(MRCA) = 4N(1 - 1/k)
  • If there are a large number of lineages (k is high), then the coalescence time is ~4N - same as WF model
  • However, if there are only two lineages, the average coalescence time is 2N - i.e. half of the total coalescence time is taken up by the last coalescence event
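
The recipe above can be sketched as a short simulation. It uses the usual continuous-time approximation, which is an assumption on top of the card: with k lineages there are k(k-1)/2 pairs, each coalescing at rate 1/(2N) per generation, so the waiting time T is drawn from an exponential distribution. All names are hypothetical.

```python
import random

def expected_tmrca(k, n_individuals):
    """Expected time to the most recent common ancestor: 4N(1 - 1/k)."""
    return 4 * n_individuals * (1 - 1 / k)

def simulate_tmrca(k, n_individuals, rng):
    """Draw one TMRCA by merging random pairs of lineages until one remains."""
    total = 0.0
    while k > 1:
        pairs = k * (k - 1) / 2
        rate = pairs / (2 * n_individuals)   # each pair coalesces at rate 1/(2N)
        total += rng.expovariate(rate)       # waiting time T until the next merge
        k -= 1                               # two lineages become one
    return total

rng = random.Random(42)
n = 100
runs = [simulate_tmrca(10, n, rng) for _ in range(5000)]
mean_sim = sum(runs) / len(runs)
print(round(mean_sim, 1), expected_tmrca(10, n))
```

The simulated mean should sit near 4N(1 - 1/k), while individual runs vary a lot because the final merge of the last two lineages is long and highly variable.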
26
Q

What are the features of neutral coalescent trees?

A
  • Very variable in shape
  • Easy to simulate (not computationally intensive)
  • Amenable to statistical modelling via Likelihood and Bayesian analysis
27
Q

Describe the features of the neutral coalescent

A
  • Lineages coalesce very rapidly at start
  • A small sample will have high probability of containing the deepest MRCA
  • Adding another sequence usually adds a short branch
  • Adding a new sequence changes the total length of the tree only slightly - the expected increase is proportional to 1/k
  • Implies that inferences can be improved by getting lots of independent trees (from different genes), rather than having very large samples for a single gene
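
The diminishing return from adding sequences can be seen in the expected total branch length of a neutral coalescent tree. For k sequences, the standard result is 4N times the sum of 1/i for i from 1 to k-1. This formula is not on the card, and the function name is hypothetical.

```python
def expected_tree_length(k, n_individuals):
    """Expected total branch length (in generations) of a neutral coalescent tree."""
    return 4 * n_individuals * sum(1 / i for i in range(1, k))

n = 100
for k in (2, 10, 11, 100, 101):
    print(k, round(expected_tree_length(k, n), 1))
```

Going from k to k + 1 sequences adds 4N/k generations of branch length on average, so each extra sequence contributes less than the one before.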
28
Q

What are some uses of the coalescent?

A
  • Mathematical modelling describing diversity in observed data - derivation of parameters for describing genetic diversity, estimates of tree shapes and population sequence parameters in the neutral case
  • Simulation tool for hypothesis testing - tests for selection, changes in demography, migration and gene flow
  • Rosenberg and Nordborg 2002
29
Q

What is the F statistic?

A

Derived from Wright-Fisher model: A measure of the amount of shared co-ancestry between alleles within a population / or the probability that two alleles are identical by descent
- E.g., At generation 0, all allele copies are independent so none of them are identical by descent - so F=0
- This model also predicts that the average time to fixation of a lineage is approximately 4N generations
- So F increases over time - similar to homozygosity
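
The increase in F over time can be sketched with the standard Wright-Fisher recursion F(t) = 1/(2N) + (1 - 1/(2N)) F(t-1). The recursion is not written on the card, and the function name and example values are hypothetical.

```python
def inbreeding_coefficient(t, n_individuals, f0=0.0):
    """F after t generations of drift, starting from f0 (F = 0 at generation 0)."""
    f = f0
    for _ in range(t):
        # Two copies are identical by descent if they share a parent copy (1/(2N)),
        # or if they do not but their parent copies were already identical by descent
        f = 1 / (2 * n_individuals) + (1 - 1 / (2 * n_individuals)) * f
    return f

for t in (0, 10, 50, 200):
    print(t, round(inbreeding_coefficient(t, 10), 4))
```

Starting from F = 0 this matches 1 - (1 - 1/(2N))^t, the mirror image of the decay in heterozygosity.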