Molecular phylogenies 3: coalescent theory Flashcards
What is coalescent theory and how do we use it?
The coalescent is genetic drift run in reverse, tracing sampled individuals back to a common ancestor (assumes neutral evolution).
It is a mathematical model that relates population processes to the shape of phylogenies obtained from a sample of that population.
-> The coalescent assumption is used to build many different trees from genealogical data.
-> Many trees with different topologies, each assuming different parameter values, are created.
-> MCMC is used to explore these trees and parameters (tweak a parameter or the tree, better fit to data?). It is a sampling algorithm rather than pure hill climbing: worse-fitting proposals are sometimes accepted, so the output is a sample of well-fitting trees, not a single best tree.
-> Carried out by software like BEAST
-> BEAST can be used to estimate species abundance, extinction rates etc.
-> Allows understanding of population genetics (population size, speciation events, migration, gene flow)
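
The MCMC step above can be illustrated with a toy Metropolis sampler. This is only a sketch under stated assumptions, not what BEAST does internally: the tree is held fixed, only a constant Ne is sampled, a 1/Ne prior is assumed (so the multiplicative proposal needs no extra correction term), and the coalescent intervals are hypothetical values chosen for illustration.

```python
import math
import random
import statistics

def log_likelihood(n_e, intervals):
    """Log-likelihood of coalescent intervals under a constant size n_e.

    intervals: (k, t_k) pairs, where t_k is the time in generations during
    which the sample had k lineages (haploid Wright-Fisher scaling assumed).
    """
    total = 0.0
    for k, t in intervals:
        rate = math.comb(k, 2) / n_e          # coalescence rate with k lineages
        total += math.log(rate) - rate * t    # exponential waiting-time density
    return total

def metropolis_ne(intervals, n_steps=20000, start=100.0, step=0.5, seed=1):
    """Sample Ne with a random walk on log(Ne); assumes a 1/Ne prior."""
    rng = random.Random(seed)
    current = start
    current_ll = log_likelihood(current, intervals)
    samples = []
    for _ in range(n_steps):
        proposal = current * math.exp(rng.gauss(0.0, step))
        proposal_ll = log_likelihood(proposal, intervals)
        log_ratio = proposal_ll - current_ll
        # Better fits are always accepted; worse fits are accepted sometimes.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current, current_ll = proposal, proposal_ll
        samples.append(current)
    return samples

# Hypothetical intervals for 10 lineages, set to their expected lengths at Ne = 1000.
intervals = [(k, 1000.0 / math.comb(k, 2)) for k in range(10, 1, -1)]
samples = metropolis_ne(intervals)
posterior_median = statistics.median(samples[2000:])  # discard burn-in
print(round(posterior_median))
```

Because worse-fitting values are sometimes accepted, the chain returns a spread of plausible Ne values rather than a single optimum.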
No extra parameters: null model -> Wright-Fisher
Example parameter: rate of coalescence
- Letting the rate of coalescence vary through time allows Ne to be estimated at different time points
- The rate of coalescence is inversely related to population size: for a pair of lineages Lambda = 1/Ne, so Ne = 1/Lambda
- Bayesian skyline model
- If you know how Ne is changing through time for a pathogen, you can infer how transmission rates have varied.
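
The link between coalescence rate and Ne can be sketched with the classic skyline estimator, a simpler non-Bayesian relative of the Bayesian skyline model. With k lineages the expected waiting time to the next coalescence is Ne / (k(k-1)/2), so each observed interval gives a rough estimate Ne = t_k x k(k-1)/2. The interval lengths below are hypothetical, and haploid scaling in generations is assumed.

```python
from math import comb

def classic_skyline(intervals):
    """Estimate Ne for each coalescent interval.

    intervals: (k, t_k) pairs ordered from the present backwards, where t_k
    is the time in generations during which the tree had k lineages.
    """
    return [(k, t * comb(k, 2)) for k, t in intervals]

# Hypothetical intervals from a 4-tip tree (most recent interval first).
estimates = classic_skyline([(4, 400.0), (3, 300.0), (2, 250.0)])
for k, n_e in estimates:
    print(k, n_e)
```

In this made-up example the estimates get larger towards the present, which is the pattern a growing population would produce.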
Example parameter: migration
- Structured coalescent model
Overall:
- Coalescent simulations are carried out under different demographic scenarios to assess the likelihood of the observed data under specific parameters. This provides insights into the demographic processes going on within populations.
Uses of coalescent theory
It is a powerful way of studying the genetics of populations.
Example:
- anthropology -> origins of humans
- Association mapping -> linking human genetic variants and disease
- Epidemiology -> tracking spread of infectious disease
Coalescence calculations
Sample i individuals from a population and track them back to their common ancestor (can also use species instead of individuals)
r (probability of a coalescence in the previous generation) = (1/N) x i(i-1)/2
= (probability that a given pair of sampled lineages shares the same parent) x (the number of possible pairs of sampled lineages)
r decreases after each coalescence event, as the number of remaining lineages i decreases.
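
A minimal sketch of the calculation above, assuming a haploid population of constant size N with i much smaller than N. The function names are illustrative only.

```python
from math import comb

def coalescence_prob(i, n_pop):
    """Per-generation probability that some pair among i lineages coalesces."""
    return comb(i, 2) / n_pop   # (number of pairs) x (1/N)

def expected_waiting_times(i, n_pop):
    """Expected generations spent with k lineages, for k = i down to 2."""
    return {k: n_pop / comb(k, 2) for k in range(i, 1, -1)}

def expected_tmrca(i, n_pop):
    """Expected time back to the most recent common ancestor of the sample."""
    return sum(expected_waiting_times(i, n_pop).values())

r = coalescence_prob(5, 1000)
tmrca = expected_tmrca(5, 1000)
print(r, tmrca)
```

With 5 lineages and N = 1000 there are 10 possible pairs, so r = 0.01. The waiting times lengthen as lineages merge, and the last pair alone accounts for 1000 of the roughly 1600 expected generations.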
Null model: Wright-Fisher
- All individuals are equally likely to reproduce
- N is constant
- Non overlapping generations
Alternative model -> change model to include different parameters and see whether this fits data better
The effect of changing population size on the coalescent
Increasing population size:
- Long terminal branches (coalescence is less likely in the recent past, when the population is large, so most coalescences occur near the root)
Decreasing population size:
- Short terminal branches (coalescence is more likely in the recent past, when the population is small)
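
The branch-length pattern above can be checked with a small simulation. This is a sketch under simplifying assumptions: a two-epoch model in which the population has size n_now up to t_change generations ago and size n_past before that, haploid scaling, and arbitrary illustrative sizes. It reports the share of total tree length that lies on terminal branches.

```python
import random
from math import comb

def terminal_fraction(n, n_now, n_past, t_change, rng):
    """Simulate one genealogy; return terminal length / total tree length."""
    is_tip = [True] * n              # True while a lineage is an unmerged tip
    t = terminal = total = 0.0
    while len(is_tip) > 1:
        k = len(is_tip)
        size = n_now if t < t_change else n_past
        wait = rng.expovariate(comb(k, 2) / size)
        if t < t_change and t + wait > t_change:
            # No coalescence before the size change: advance to the change
            # point and redraw the waiting time under the older size.
            elapsed = t_change - t
            total += k * elapsed
            terminal += sum(is_tip) * elapsed
            t = t_change
            continue
        total += k * wait
        terminal += sum(is_tip) * wait
        t += wait
        a, b = rng.sample(range(k), 2)   # pick the pair that coalesces
        is_tip = [x for j, x in enumerate(is_tip) if j not in (a, b)] + [False]
    return terminal / total

def mean_terminal_fraction(n_now, n_past, reps=200, seed=1):
    rng = random.Random(seed)
    runs = [terminal_fraction(10, n_now, n_past, 100.0, rng) for _ in range(reps)]
    return sum(runs) / reps

growing = mean_terminal_fraction(n_now=10000, n_past=100)    # small in the past
shrinking = mean_terminal_fraction(n_now=100, n_past=10000)  # large in the past
print(growing, shrinking)
```

The growing scenario should give the larger terminal fraction, matching the long terminal branches described above.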
Sequence diversity
Theta
The sequence diversity of the population can be measured.
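Sequence diversity (Theta) = 2 x effective population size x mutation rate (haploid; 4 x Ne x mutation rate for diploids). The number of sampled individuals does not change Theta itself, but it does affect how many variable sites are expected in the sample.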
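
A short sketch of how Theta relates to what is observed in a sample, assuming the standard neutral, constant-size, infinite-sites model. Under that model the expected number of segregating (variable) sites among n sequences is Theta x (1 + 1/2 + ... + 1/(n-1)), and dividing an observed count by the same sum gives Watterson's estimate of Theta. The parameter values are illustrative, with the mutation rate taken per locus per generation.

```python
def theta(n_e, mu, ploidy=1):
    """Theta = 2 * Ne * mu for haploids, 4 * Ne * mu for diploids."""
    return 2 * ploidy * n_e * mu

def harmonic(n):
    """Sum of 1/k for k = 1 .. n-1, where n is the sample size."""
    return sum(1.0 / k for k in range(1, n))

def expected_segregating_sites(n_e, mu, n, ploidy=1):
    """Expected number of variable sites in a sample of n sequences."""
    return theta(n_e, mu, ploidy) * harmonic(n)

def watterson_theta(seg_sites, n):
    """Estimate Theta from an observed number of segregating sites."""
    return seg_sites / harmonic(n)

expected_s = expected_segregating_sites(1000, 0.001, n=5)
print(theta(1000, 0.001), expected_s, watterson_theta(expected_s, 5))
```

Adding more sampled sequences increases the expected number of segregating sites only slowly, because the harmonic sum grows roughly with the logarithm of n.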
Uses of coalescent theory
We can gain insights into population genetics through coalescent theory.
Mutations:
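Tajima's D - statistic that compares two estimates of Theta (one from the average number of pairwise differences, one from the number of segregating sites) to see whether mutations occur at unexpectedly low or high frequencies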
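
A minimal sketch of Tajima's D for a small alignment, using the standard constants from Tajima (1989). It assumes equal-length sequences with no gaps or missing data and at least 4 sequences; the two toy alignments are made up for illustration.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(seqs[0])
    # S: number of sites at which the sequences are not all identical.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    # pi: average number of differences between pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)

rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]   # every mutation is a singleton
mid_frequency = ["AA", "AA", "TT", "TT"]           # every mutation is at 50%
d_rare = tajimas_d(rare_variants)
d_mid = tajimas_d(mid_frequency)
print(d_rare, d_mid)
```

An excess of rare variants gives a negative D, and an excess of mid-frequency variants gives a positive D.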
Population size:
Skyline plots - estimation of past population sizes directly from phylogenies using coalescent methods.
e.g.) Skyline plots for bison showed that the population dropped following the Last Glacial Maximum, so the extinction was attributed to global warming, not to the arrival of humans.
e.g.) The effective population size of HIV was constant and then grew rapidly (growing population in Western Africa? Urban population growth?)
Migration events:
- Do models that include migration parameters fit the data better? -> comparable to Fst
Speciation:
- Predicting speciation from coalescent models
- When predicting speciation rates from trees modelled with coalescent theory, be careful with incomplete lineage sorting -> in large populations, the coalescence of gene lineages can predate speciation events.
Association studies:
Coalescent theory is used to interpret large-scale human genomics data sets to avoid artificial associations.
- Accounting for evolutionary history and population structure.
Structured vs unstructured populations
Coalescence in small populations vs large populations
Small: coalescence is more frequent and coalescence times are shorter
Large: coalescence is less frequent and coalescence times are longer. Long branches near the root lead to mid-frequency polymorphisms.