Lecture 7: Coalescent theory Flashcards
what does coalescent theory refer to
modelling how the alleles sampled from a population trace back in time to a common ancestor
limitations of this method
can only be used when selection is weak; it assumes all alleles are equally likely to be passed on (neutrality)
application
anthropology, epidemiology, association mapping (genetic variation vs human disease), cancer biology
‘null model’ conditions
haploid population of size N
no strong selection
generations are non-overlapping
population size is constant
wright-fisher model
model of genetic drift
how does the wright-fisher model work (roughly)
follows an allele (e.g. a new mutation) through the generations, e.g. to find the probability that it becomes fixed or goes extinct; can be run forwards from the past or backwards from the present to trace ancestry
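A minimal sketch of the forward-in-time version, assuming a haploid population of constant size with no selection. The function name, population size, starting count and seed are illustrative choices, not from the lecture.

```python
import random

def wright_fisher(n_pop, n_copies, rng, max_generations=100_000):
    """Follow the count of one neutral allele in a haploid population of
    constant size n_pop until it is lost (0) or fixed (n_pop)."""
    generations = 0
    while 0 < n_copies < n_pop and generations < max_generations:
        freq = n_copies / n_pop
        # Each of the n_pop offspring picks a parent at random, so the new
        # count is a binomial draw with success probability freq.
        n_copies = sum(rng.random() < freq for _ in range(n_pop))
        generations += 1
    return n_copies, generations

rng = random.Random(1)  # fixed seed so the run is repeatable
runs = [wright_fisher(20, 4, rng)[0] for _ in range(1000)]
fixed_fraction = sum(count == 20 for count in runs) / len(runs)
print(fixed_fraction)  # should land near the starting frequency 4/20
```

Under pure drift the chance of fixation equals the allele's starting frequency, which is what the replicate runs approximate.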
what is a coalescent event
the point, going back in time, where two lineages merge into a shared ancestor
equation for probability of coalescence
(1/N) * i(i-1)/2 = i(i-1)/(2N) per generation (approximation for i much smaller than N)
i = number of sampled lineages
1/N = probability that 2 given lineages share a parent; i(i-1)/2 = number of possible pairs of sampled lineages
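The formula can be checked with a few lines of Python. This is only a sketch: the function name and the example values of i and N are made up for illustration.

```python
from math import comb

def coalescence_prob(i, n_pop):
    """Per-generation probability that some pair among i lineages coalesces
    in a haploid population of size n_pop (valid when i is much smaller
    than n_pop)."""
    pairs = comb(i, 2)      # i(i-1)/2 possible pairs
    return pairs / n_pop    # each pair shares a parent with probability 1/N

print(coalescence_prob(2, 1000))   # 0.001
print(coalescence_prob(10, 1000))  # 0.045
```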
how can you work out time to MRCA from N and i
expected time = 2N(i-1)/i generations (approaches 2N as i gets large)
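One way to see where this comes from: while k lineages remain, the expected wait for the next coalescence is 2N/(k(k-1)) generations (one over the coalescence probability), and the time to the MRCA is the sum of those waits from k = i down to k = 2. A sketch with illustrative function names:

```python
def expected_tmrca(i, n_pop):
    """Sum the expected waiting times while k = i, i-1, ..., 2 lineages remain."""
    total = 0.0
    for k in range(2, i + 1):
        total += 2 * n_pop / (k * (k - 1))
    return total

def closed_form(i, n_pop):
    """The flashcard formula 2N(i-1)/i."""
    return 2 * n_pop * (i - 1) / i

# The two agree up to floating-point rounding.
for i in (2, 5, 50):
    print(i, expected_tmrca(i, 1000), closed_form(i, 1000))
```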
how do you account for time
to get the coalescence rate r at a specific time t, plug in the population size at that time: r(t) = i(i-1)/(2N(t))
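A sketch of the time-dependent rate. The exponential growth model for N(t), and its parameter values, are hypothetical and only there to give N(t) a concrete shape; t is measured in generations before the present.

```python
import math

def pop_size(t, n_present=10_000, growth=0.01):
    """Hypothetical N(t): a population that has been growing exponentially,
    so it gets smaller the further back in time (larger t) we look."""
    return n_present * math.exp(-growth * t)

def coalescence_rate(i, t):
    """r(t) = i(i-1) / (2 N(t)) for i lineages at time t."""
    return i * (i - 1) / (2 * pop_size(t))

print(coalescence_rate(5, 0))    # rate at the present, N = 10,000
print(coalescence_rate(5, 100))  # higher rate further back, when N was smaller
```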
relationship of coalescent events and pop size
smaller population = higher rate of coalescence (events happen closer together in time)
sequence diversity relationship to mutation rate
expected pairwise sequence diversity = 2N * mutation rate (haploid population of size N, mutation rate per generation)
how does pop size impact sequence diversity
larger population = higher diversity, because lineages take longer to coalesce, leaving more time for mutations to accumulate
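The 2N * mutation rate relationship can be illustrated with a small simulation, assuming a haploid population: two lineages coalesce after N generations on average, and mutations accumulate on both lineages over that time. The function name, parameter values and seed are illustrative.

```python
import random

def mean_pairwise_diversity(n_pop, mu, replicates, rng):
    """Average expected number of differences between two sampled sequences."""
    total = 0.0
    for _ in range(replicates):
        # Go back one generation at a time until the pair shares a parent,
        # which happens with probability 1/N per generation.
        t = 1
        while rng.random() >= 1 / n_pop:
            t += 1
        # Both lineages pick up mutations at rate mu for t generations.
        total += 2 * mu * t
    return total / replicates

rng = random.Random(7)  # fixed seed so the run is repeatable
estimate = mean_pairwise_diversity(100, 1e-3, 5000, rng)
print(estimate, "vs 2N*mu =", 2 * 100 * 1e-3)
```

Doubling n_pop in this sketch roughly doubles the estimate, which is the population size effect described in the card above.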
methods of estimating population sizes from phylogenies
Tajima’s D, skyline plots
Tajima’s D
summarises the frequencies of the mutations in a sample (low, intermediate or high frequency), which can be used to infer the history of a population, e.g. fewer low-frequency mutations after a recent bottleneck
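A sketch of the calculation using the standard Tajima (1989) formula, which compares the mean number of pairwise differences (pi) with the number of variable sites (S). The function name and the toy alignments are made up for illustration; it assumes at least 4 aligned sequences of equal length and counts a site with more than two alleles as a single variable site.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (n >= 4)."""
    n = len(seqs)
    # S: number of segregating (variable) sites
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        raise ValueError("D is undefined without segregating sites")
    # pi: mean number of differences over all pairs of sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / k for k in range(1, n))
    a2 = sum(1 / k**2 for k in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Toy data: every mutation is carried by one sequence (all low frequency)
rare = ["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT", "AAAAA"]
# Toy data: every mutation is carried by half the sample (intermediate frequency)
common = ["AAAA", "AAAA", "AAAA", "TTTT", "TTTT", "TTTT"]
d_rare = tajimas_d(rare)      # negative: excess of low-frequency mutations
d_common = tajimas_d(common)  # positive: deficit of low-frequency mutations
print(d_rare, d_common)
```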
skyline plots
use the mathematical relationship between the rate of coalescence and population size to estimate population size through time from the coalescence times in a phylogeny
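A sketch of the simplest ("classic") form of this idea, assuming the haploid rate i(i-1)/(2N) from the earlier cards: if an interval with k lineages lasted t generations, rearranging the expected wait 2N/(k(k-1)) gives the estimate N = t * k(k-1)/2 for that interval. The function name and the interval lengths are hypothetical, not real data.

```python
def classic_skyline(intervals):
    """intervals: list of (k, duration) pairs, where duration is the time in
    generations during which the tree had k lineages.
    Returns one (k, estimated N) pair per interval."""
    return [(k, duration * k * (k - 1) / 2) for k, duration in intervals]

# Hypothetical interval lengths read off a tree with 4 tips
intervals = [(4, 150.0), (3, 400.0), (2, 900.0)]
estimates = classic_skyline(intervals)
print(estimates)  # [(4, 900.0), (3, 1200.0), (2, 900.0)]
```

Plotting the estimates against time gives the stepwise skyline; each step rests on a single interval, so individual values are noisy.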
example of a skyline plot being used to infer the history of a population size
HIV samples from the 1950s compared with those from the 1990s: there was already diversity in the early samples, suggesting a decent-sized viral population by that point
how can coalescent theory be used to track migration
look for coalescence events between lineages sampled from different populations, which suggest movement between them; from these, relationships between subpopulations can be inferred
issues that can arise when studying speciation using coalescent theory
incomplete lineage sorting: the order in which gene lineages coalesce (the gene tree) does not always match the order of speciation (the species tree)
what are association studies
studies that aim to associate genetic variants with specific diseases within human populations; coalescent theory is useful for interpreting these large-scale datasets