Lecture 6 Flashcards
Jukes and Cantor
2
- Corrects for multiple hits
- Assumes all nucleotides are equal (equal base frequencies and equal substitution rates)
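A minimal sketch of the multiple-hits correction, assuming the standard Jukes-Cantor formula d = -3/4 ln(1 - 4p/3), where p is the observed proportion of differing sites. The function name and the example value of p are hypothetical, chosen only for illustration.

```python
import math


def jukes_cantor_distance(p_observed: float) -> float:
    """Jukes-Cantor corrected distance (substitutions per site).

    p_observed is the proportion of sites that differ between two
    aligned sequences. The correction is undefined for p >= 0.75.
    """
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("p_observed must be in the range [0, 0.75)")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p_observed)


# Hypothetical example: 30% of aligned sites differ.
observed = 0.30
corrected = jukes_cantor_distance(observed)
print(f"observed p = {observed:.2f}, corrected d = {corrected:.4f}")
```

The corrected distance is larger than the observed proportion because some sites have changed more than once.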
Fst:
A statistic derived from the Hardy-Weinberg equation
Hardy-Weinberg Equilibrium:
3
- Relates genotype frequencies to allele frequencies
- The frequencies of AA, Aa and aa are p^2, 2pq and q^2
- If the allele frequency (p) of A = 80/100 = 0.8, then q = 0.2 and the expected genotype frequencies are
(0.8 x 0.8) = 0.64 AA,
(2 x 0.8 x 0.2) = 0.32 Aa
and (0.2 x 0.2) = 0.04 aa
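The arithmetic on this card can be checked with a short sketch. The function name is hypothetical; it assumes one locus with two alleles, A at frequency p and a at frequency q = 1 - p.

```python
def hwe_genotype_frequencies(p: float) -> dict:
    """Expected genotype frequencies under Hardy-Weinberg equilibrium."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Allele frequency of A from the card: 80/100.
freqs = hwe_genotype_frequencies(0.8)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.2f}")
```

The three frequencies always sum to 1, which is a quick sanity check on any hand calculation.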
What is the heterozygosity of mice trapped from the East (aa) and West (AA) populations, with cats patrolling the middle?
(3)
- Assume we catch as many east mice as west mice, so p = q = 0.5 in the pooled sample
- Expected heterozygosity (Hexp) = 2pq = 2 x 0.5 x 0.5 = 0.5
- Observed heterozygosity (Hobs) = 0
Wright's fixation index (F):
5
- F = (Hexp - Hobs)/Hexp
- The deviation between expected and observed normalized by expected
- F = 0: the population is in HWE, no population structure
- F = 1: there are no heterozygotes, complete population structure
- The closer F is to 1 the more structure there is in the population
- Tells us how much the population is out of HWE
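A minimal sketch of the formula on this card, applied to the east/west mice example (Hexp = 0.5, no heterozygotes observed). The function and variable names are hypothetical.

```python
def fixation_index(h_exp: float, h_obs: float) -> float:
    """Wright's F: heterozygote deficit normalised by the expectation."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return (h_exp - h_obs) / h_exp


# Pooled mouse sample: equal numbers of AA (west) and aa (east) mice.
p_pooled = 0.5
h_exp = 2 * p_pooled * (1 - p_pooled)  # 2pq = 0.5
h_obs = 0.0                            # no Aa mice were trapped
f_mice = fixation_index(h_exp, h_obs)
print(f"F = {f_mice}")
```

With no heterozygotes at all, F takes its maximum value of 1; a sample in HWE (Hobs equal to Hexp) gives F = 0.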
A deficit in heterozygotes can arise due to...
Cryptic population structure
When chi-squared gives P<0.001
- Far fewer heterozygotes than we would expect
- Deficit of heterozygotes
- Wahlund effect
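One way such a chi-squared test could be run is sketched below, assuming one locus with two alleles, so the test has 1 degree of freedom (3 genotype classes, minus 1, minus 1 estimated allele frequency). The function name and the genotype counts are hypothetical. For 1 degree of freedom the P value can be written with the complementary error function, so only the standard library is needed.

```python
import math


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int):
    """Chi-squared test of observed genotype counts against HWE (1 df)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # allele frequency of A in the sample
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("test is undefined when only one allele is present")
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-squared variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical sample: 50 AA, 0 Aa, 50 aa (a complete heterozygote deficit).
chi2, p_value = hwe_chi_square(50, 0, 50)
print(f"chi-squared = {chi2:.1f}, P < 0.001: {p_value < 0.001}")
```

Here the expected counts are 25, 50 and 25, so the statistic is 25 + 50 + 25 = 100, far beyond the P < 0.001 threshold.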
Wahlund effect:
- When a sample treated as one population is actually drawn from two (or more) subpopulations with different allele frequencies
- The pooled sample shows a deficit of heterozygotes
Linanthus parryae, the ‘desert snow’, helps us study...
- The scale of population structure (regional, global), using the F indices
HT=
Total heterozygosity
HS=
Subpopulation heterozygosity
HR=
Regional heterozygosity
Using Wright's fixation indices (F) we can answer these questions:
- How much of the deviation from HWE is due to:
  - Sub-population structure
  - Regional population structure
  - Population-wide deviations
  - Fewer heterozygotes at the individual level
FSR =
The decrease in He among subpopulations within regions normalised by He in regions
FRT =
- The decrease in He among regions within whole populations normalised by He in whole populations
- A measure of differences of heterozygosity between geographical regions
FST =
- The decrease in He among subpopulations within whole populations normalised by He in whole populations
- The F statistic calculated from subpopulation data versus the total population
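The three definitions above (FSR, FRT, FST) can be sketched as follows, assuming each index is the drop in expected heterozygosity between two levels divided by the heterozygosity at the higher level. The function name and the HS, HR and HT values are hypothetical and are not taken from any dataset.

```python
def hierarchical_f(h_s: float, h_r: float, h_t: float):
    """Return (FSR, FRT, FST) from subpopulation, regional and total He."""
    if h_r <= 0 or h_t <= 0:
        raise ValueError("HR and HT must be positive")
    f_sr = (h_r - h_s) / h_r  # subpopulations within regions
    f_rt = (h_t - h_r) / h_t  # regions within the total population
    f_st = (h_t - h_s) / h_t  # subpopulations within the total population
    return f_sr, f_rt, f_st


# Hypothetical heterozygosities at the three levels.
f_sr, f_rt, f_st = hierarchical_f(h_s=0.30, h_r=0.36, h_t=0.40)
print(f"FSR = {f_sr:.3f}, FRT = {f_rt:.3f}, FST = {f_st:.3f}")
```

Under these definitions the levels combine multiplicatively: (1 - FST) = (1 - FSR) x (1 - FRT), which is a useful check on the arithmetic.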
FIT =
The individual vs the total population. Values range from -1 to 1 (negative values are possible because all individuals may be heterozygotes)
Fst is the most commonly used because
3
- We just need gene frequencies from ‘subpopulations’
- It is the most informative way to relate subpopulation samples to the total population
- It measures the proportion of the total heterozygosity in the population that is due to differences in allele frequencies among subpopulations
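A minimal sketch of how Fst could be computed from subpopulation allele frequencies alone, assuming one locus with two alleles and equal sample sizes per subpopulation (a simplification; unequal samples would need weighting). The function name is hypothetical.

```python
def fst_from_allele_frequencies(subpop_p: list) -> float:
    """Fst = (HT - HS) / HT from the frequency of allele A per subpopulation."""
    n = len(subpop_p)
    if n == 0:
        raise ValueError("need at least one subpopulation")
    p_bar = sum(subpop_p) / n
    h_t = 2 * p_bar * (1 - p_bar)                       # total expected He
    h_s = sum(2 * p * (1 - p) for p in subpop_p) / n    # mean within-subpop He
    if h_t == 0:
        raise ValueError("Fst is undefined when the locus is not variable")
    return (h_t - h_s) / h_t


# East (aa) and west (AA) mice: frequency of A is 0 and 1.
fst_mice = fst_from_allele_frequencies([1.0, 0.0])
# Two subpopulations with identical allele frequencies.
fst_same = fst_from_allele_frequencies([0.5, 0.5])
print(f"mice: Fst = {fst_mice}, identical subpopulations: Fst = {fst_same}")
```

The mice give Fst = 1 because all of the total heterozygosity comes from the difference between the two subpopulations; identical subpopulations give Fst = 0.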
Fst = 0 - 0.05
Low genetic differentiation, very low population structure
Fst = 0.05 - 0.15
Moderate genetic differentiation, moderate population structure
Fst = 0.15 - 0.25
Great genetic differentiation, great population structure
Fst = 0.25 - 1
Very great genetic differentiation, very great population structure, very little gene flow between them
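The four ranges above can be turned into a small lookup, sketched here. The function name is hypothetical, and because the ranges on the cards share their end points, the sketch assumes each boundary value belongs to the higher category.

```python
def describe_fst(fst: float) -> str:
    """Map an Fst value to the qualitative categories on these cards."""
    if not 0.0 <= fst <= 1.0:
        raise ValueError("Fst is expected to lie between 0 and 1")
    if fst < 0.05:
        return "low"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "great"
    return "very great"


for value in (0.02, 0.11, 0.20, 0.60):
    print(f"Fst = {value:.2f}: {describe_fst(value)} genetic differentiation")
```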
HapMap project in humans
- Samples from four human populations: Central European, Yoruba (Nigeria), Chinese and Japanese
- Figure out how much population structure there is between human populations
The HapMap project found:
- Fst across over 2.8 million SNPs is 0.11 (moderate structure); some sites show structure, others don’t
- Of the total genetic variation observed across these four groups, only around 11% is due to genetic differences among groups
- i.e. about 89% of the variation is found within groups and is common to all of them
- We observed the same variants in all populations.
Why would Fst differ across the genome?
- In neutral sequences Fst is determined by drift, demography (range expansions and bottlenecks) and chance
- Local positive selection increases Fst
- Balancing and negative selection decrease Fst
Which variants showed the greatest differentiation between human populations?
- The lactase gene (LCT)
- For many people the ability to digest lactose disappears in childhood (ages 3 - 4)
- Lactase persistence is at high frequency in populations with histories of cattle/dairy farming
- This is due to natural selection favouring the lactase persistence allele