W4L1 Molecular Population Genetics and more tests of Neutrality M Flashcards
What is the advantage of neutral theory
it gives us expectations about molecular evolution in the absence of selection
Ways of identifying nucleotide variation
-Measure every polymorphic site
-Measure the minor allele frequency at each site and count the total number of singletons using the site frequency spectrum
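A minimal sketch of the two steps above. The sample `seqs` and the function names are made up for illustration, and the code assumes aligned sequences of equal length with at most two alleles per site.

```python
from collections import Counter

def minor_allele_counts(seqs):
    """Return {site_index: minor allele count} for every polymorphic site."""
    result = {}
    for site, column in enumerate(zip(*seqs)):
        counts = Counter(column)
        if len(counts) > 1:  # more than one allele: a polymorphic site
            result[site] = min(counts.values())
    return result

def folded_sfs(seqs):
    """Folded frequency spectrum: minor allele count -> number of sites."""
    return dict(Counter(minor_allele_counts(seqs).values()))

# Hypothetical sample of four aligned sequences (not real data).
seqs = ["ATGCA", "ATGCT", "ACGCA", "ATGCA"]

macs = minor_allele_counts(seqs)
mafs = {site: count / len(seqs) for site, count in macs.items()}
sfs = folded_sfs(seqs)
singletons = sfs.get(1, 0)  # sites where the minor allele is seen once

print("polymorphic sites:", sorted(macs))
print("minor allele frequencies:", mafs)
print("singletons:", singletons)
```

Here both polymorphic sites carry a minor allele seen in only one sequence, so both are singletons.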
Measuring nucleotide diversity
-create a table comparing the number of differences between each pair of alleles across the polymorphic sites
-average the pairwise differences to get pi (π)
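The table-and-average procedure can be sketched as below. The sequences are hypothetical, and pi is reported here per sequence (divide by the sequence length for a per-site value).

```python
from itertools import combinations

def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def nucleotide_diversity(seqs):
    """Average number of differences over all pairs of sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / len(pairs)

# Hypothetical sample of four aligned sequences (not real data).
seqs = ["ATGCA", "ATGCT", "ACGCA", "ATGCA"]

# The "table": one entry per pair of sequences.
for i, j in combinations(range(len(seqs)), 2):
    print(i, j, pairwise_differences(seqs[i], seqs[j]))

pi = nucleotide_diversity(seqs)
print("pi =", pi)
```

With four sequences there are six pairs; their differences sum to 6, so pi is 1.0 in this toy sample.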
Descriptive metrics
- S (the number of segregating sites)
- p (the frequency of a variant)
- MAF (minor allele frequency)
- Haplotypes (a particular combination of alleles)
- π (pi, the average pairwise divergence)
- Hobs (the observed frequency of heterozygotes)
Some theoretical metrics to give expectation
- Ne : The effective population size
- Hexp : the expected frequency of heterozygotes
- θ: ‘theta’. The expected nucleotide diversity; under neutrality we expect θ = π
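A small sketch comparing Hobs with Hexp at one locus. The genotypes are invented, and Hexp is computed as 1 minus the sum of squared allele frequencies, which assumes random mating (Hardy-Weinberg proportions).

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    hets = sum(1 for a, b in genotypes if a != b)
    return hets / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2), using allele frequencies from the sample."""
    alleles = Counter(allele for g in genotypes for allele in g)
    total = sum(alleles.values())
    return 1.0 - sum((c / total) ** 2 for c in alleles.values())

# Hypothetical diploid genotypes at a single site (not real data).
genotypes = [("A", "A"), ("A", "A"), ("A", "T"), ("T", "T")]

h_obs = observed_heterozygosity(genotypes)
h_exp = expected_heterozygosity(genotypes)
print("Hobs =", h_obs, "Hexp =", h_exp)
```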
Calculating theta by Watterson
Watterson (1975) showed that θ = 4Neµ and that it can be estimated from S: θ = S / a, where a = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences.
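Watterson's estimator as a short sketch. The function names are illustrative, and the inputs (S = 2 segregating sites in n = 4 sequences) are a made-up example.

```python
def harmonic_sum(n):
    """a = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))

def watterson_theta(num_segregating_sites, n):
    """Estimate theta from the number of segregating sites S."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return num_segregating_sites / harmonic_sum(n)

# Hypothetical example: S = 2 segregating sites in n = 4 sequences.
theta_w = watterson_theta(2, 4)
print("theta_W =", round(theta_w, 4))
```

The estimate is per sequence; divide by the sequence length for a per-site value that can be compared with per-site pi.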
Calculating theta under the neutral model
Tajima's D test
compares θ (neutral expectation, estimated from S) to π (observed) to detect deviations in the frequency spectrum
Tajima's D = (π - θ) / sqrt(Var(π - θ))
-if the value is negative, there is an excess of rare variants (singletons)
-if the value is positive, there is a deficit of rare variants (singletons)
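A sketch of the calculation, using the variance constants from Tajima (1989). The function name and the example inputs are illustrative; pi must be the per-sequence average pairwise difference so that it is on the same scale as S.

```python
import math

def tajimas_d(pi, num_segregating_sites, n):
    """Tajima's D from pi, the number of segregating sites S and sample size n."""
    s = num_segregating_sites
    if s == 0 or n < 4:
        raise ValueError("D is undefined without segregating sites or for n < 4")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_w = s / a1                         # Watterson's estimate
    variance = e1 * s + e2 * s * (s - 1)     # estimated Var(pi - theta_W)
    return (pi - theta_w) / math.sqrt(variance)

# Hypothetical example: pi = 1.0, S = 2, n = 4 (all variants are singletons).
d = tajimas_d(1.0, 2, 4)
print("Tajima's D =", round(d, 3))
```

In this toy case every variant is a singleton, so pi falls below the Watterson estimate and D comes out negative.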
What could cause π and θ to deviate from each other
-population size has not stayed constant
coalescence
Looking back in time the lineages of all contemporary alleles will eventually “coalesce” to a single ancestor
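There are n-1 coalescence events for a sample of size n
On average, it takes about 2Ne generations to go from a large sample down to two lineages and another 2Ne generations for the last two lineages to coalesce, so the expected total time to the common ancestor is about 4Ne generations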
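The 2Ne and 4Ne figures can be checked with a short sketch. It assumes the standard neutral coalescent for a diploid population of constant size Ne, where the expected time spent with k lineages is 4Ne / (k(k-1)) generations; the function names and the values of n and Ne are made up for illustration.

```python
def expected_interval(k, ne):
    """Expected generations during which exactly k lineages remain."""
    return 4.0 * ne / (k * (k - 1))

def expected_time_to_two_lineages(n, ne):
    """Expected generations to go from n lineages down to two."""
    return sum(expected_interval(k, ne) for k in range(3, n + 1))

def expected_tmrca(n, ne):
    """Expected generations until all n lineages share one ancestor."""
    return sum(expected_interval(k, ne) for k in range(2, n + 1))

ne = 1000  # hypothetical effective population size
for n in (10, 100):
    print(n,
          round(expected_time_to_two_lineages(n, ne)),
          round(expected_tmrca(n, ne)))
```

As n grows, the time down to two lineages approaches 2Ne and the total approaches 4Ne, with the last pair alone accounting for 2Ne.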
The effect of a bottleneck on the coalescent
-positive Tajima's D (fewer rare mutations than expected) if the bottleneck was recent
-negative Tajima's D (more rare mutations than expected) if the bottleneck was long ago
Coalescence with expanding population size
-a negative Tajima's D can indicate a population expansion
What can Tajima's D tell us
-whether population size has fluctuated in the past
Tajima's D and cotton bollworm
Genomic study of 141 individuals of cotton bollworm, which were collected from 13 locations in three cotton-producing regions of China, namely the Yellow River Region (YRR), the Changjiang River Region (CRR) and the Northwestern Region (NR)
* 5,227,071 high-quality SNPs
* Mean Tajima’s D values of the different populations ranged from −1.22 to −0.67, suggesting a recent population expansion
Linkage Disequilibrium
the non-random association of alleles at different loci in a population
What is a haplotype
a combination of allelic states at linked sites that is inherited together
How is a haplotype created
-Start with 3 linked sites. When a new mutation arises, drift may bring the mutant-bearing haplotype to higher frequency over a few generations, creating linkage disequilibrium with the pre-existing polymorphisms. Recombination can erode this LD over time
Loss of haplotype over time
Let’s call the recombination rate between two polymorphisms c
The probability that the haplotype is NOT broken down in one generation
P=1-c
The probability the haplotype is not broken down in G generations is:
P=(1-c)^G
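A quick numerical sketch of P=(1-c)^G. The recombination rate c = 0.01 and the function names are made-up illustrations, not values from the lecture.

```python
import math

def prob_haplotype_intact(c, generations):
    """Probability the haplotype is not broken down after G generations."""
    return (1.0 - c) ** generations

def generations_until(c, target_prob):
    """Generations needed for the intact probability to fall to target_prob."""
    return math.log(target_prob) / math.log(1.0 - c)

c = 0.01  # hypothetical recombination rate between the two polymorphisms
for g in (1, 10, 100, 1000):
    print(g, round(prob_haplotype_intact(c, g), 4))

half_life = generations_until(c, 0.5)
print("generations until P = 0.5:", round(half_life, 1))
```

With c = 0.01 the haplotype has roughly a 37% chance of surviving 100 generations intact, and the probability drops to one half after about 69 generations.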
Homo sapiens and Neanderthal interbreeding
Admixture: the mixing of two genetically differentiated populations
Initial linkage disequilibrium is proportional to the allele frequency differences between the two populations and is unrelated to the distance between the markers
Extended LD: dependent on migration rate and recombination rate
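A sketch of why initial admixture LD depends only on allele frequency differences. It assumes a single admixture pulse, no LD inside either source population, and random mating afterwards; the mixing proportion m, the allele frequencies and the function names are all invented for illustration.

```python
def admixture_ld(m, pa1, pb1, pa2, pb2):
    """LD coefficient D = P(AB) - P(A)P(B) right after mixing.

    m is the fraction of the mixed population drawn from population 1.
    Assumes loci A and B are independent within each source population.
    """
    p_ab = m * pa1 * pb1 + (1 - m) * pa2 * pb2
    p_a = m * pa1 + (1 - m) * pa2
    p_b = m * pb1 + (1 - m) * pb2
    return p_ab - p_a * p_b

def admixture_ld_closed_form(m, pa1, pb1, pa2, pb2):
    """Same quantity written as m(1-m) times the two frequency differences."""
    return m * (1 - m) * (pa1 - pa2) * (pb1 - pb2)

def ld_after(d0, c, generations):
    """LD remaining after recombination at rate c for some generations."""
    return d0 * (1 - c) ** generations

d0 = admixture_ld(0.5, 0.9, 0.8, 0.1, 0.2)
print("initial D:", round(d0, 4))
print("D after 50 generations, c = 0.01:", round(ld_after(d0, 0.01, 50), 4))
print("D after 50 generations, c = 0.10:", round(ld_after(d0, 0.10, 50), 4))
```

The recombination rate c does not appear in the initial value, only in how quickly it decays, which is why LD created by admixture can extend across distant markers at first.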
Factors affecting LD
-Mutation, drift, limited recombination
-Demographic effects: admixture; bottlenecks (e.g. founder effects)
-Selective sweeps
Selective sweep and LD
Signatures left in the patterns of nucleotide polymorphism
* Locally reduced variation
* Skewed frequency spectrum (negative Tajima's D)
* Increased LD/haplotypes
FOXP2 selective sweep example
- mutations associated with severe articulation problems, linguistic and grammatical impairment
-it is a regulator of many genes
-only 1 amino acid difference between mouse and chimp but 2 differences between human and chimp
-but the apparent sweep later turned out to be due to sampling bias