Lecture 8 Flashcards
Measures of nucleotide diversity
2
- Tally the number of segregating sites (s)
- Calculate the average pairwise divergence between alleles (pi)
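A minimal sketch of both measures on a toy alignment. The sequences and function names are hypothetical and only illustrate the two definitions on this card; pi is reported per locus (average number of differences), not per site.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns in which more than one nucleotide appears."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def pairwise_diversity(seqs):
    """Average number of differences over all pairs of sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)

# Hypothetical aligned sample: 4 alleles, 8 sites
alignment = [
    "ATGCATGC",
    "ATGCATGA",
    "ATGTATGC",
    "ACGTATGC",
]

S = segregating_sites(alignment)    # sites 2, 4 and 8 vary, so S = 3
pi = pairwise_diversity(alignment)  # 10 differences over 6 pairs
print(S, round(pi, 3))
```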
Descriptive metrics include:
7
- Describe the DNA sequence in different ways
- S
- p
- MAF
- Haplotypes
- Pi
- H
S:
2
- Number of segregating sites
- Can calculate theta from S
p:
The frequency of a variant
MAF:
Minor allele frequency
Haplotypes:
A particular combination of alleles across sites
Pi:
2
- Average pairwise divergence
- Under the neutral theory we expect pi = theta
H:
The frequency of heterozygotes
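A small sketch of three of the descriptive metrics above (variant frequency, minor allele frequency, and observed heterozygosity) at a single biallelic site. It assumes diploid genotypes; the genotype data and function names are made up for illustration.

```python
def allele_frequency(genotypes, allele):
    """Frequency of `allele` among all gene copies in a diploid sample."""
    copies = [a for genotype in genotypes for a in genotype]
    return copies.count(allele) / len(copies)

def minor_allele_frequency(p):
    """Frequency of the rarer allele at a biallelic site."""
    return min(p, 1 - p)

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)

# Hypothetical sample of 5 diploid individuals at one A/G site
genotypes = [("A", "A"), ("A", "G"), ("A", "G"), ("G", "G"), ("A", "A")]

p = allele_frequency(genotypes, "A")    # 6 of 10 copies are A
maf = minor_allele_frequency(p)         # G is the minor allele
H = observed_heterozygosity(genotypes)  # 2 of 5 individuals are A/G
print(p, maf, H)
```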
Metrics estimated with some uncertainty include:
2
- Mu (μ)
- Fst
Mu (μ):
Mutation rate
Fst:
A measure of population structure (genetic differentiation among populations)
Theoretical metrics:
3
- Ne
- Hexp
- Theta
Ne:
Effective population size
Hexp:
4
- The expected frequency of heterozygotes
- 2pq
- 4Neμ / (1 + 4Neμ)
- theta / (1 + theta)
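The formulas on this card can be evaluated directly. This is a sketch with hypothetical parameter values; the function names are illustrative only.

```python
def hexp_from_allele_freq(p):
    """Expected heterozygosity at a biallelic site: 2pq with q = 1 - p."""
    q = 1 - p
    return 2 * p * q

def hexp_from_theta(ne, mu):
    """Expected heterozygosity theta / (1 + theta), with theta = 4 * Ne * mu."""
    theta = 4 * ne * mu
    return theta / (1 + theta)

# Hypothetical values chosen so the arithmetic is easy to follow
print(hexp_from_allele_freq(0.6))       # 2 * 0.6 * 0.4
print(hexp_from_theta(10_000, 2.5e-5))  # theta = 1, so Hexp = 1 / 2
```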
Theta:
4
- The expected nucleotide diversity
- 4Neμ, where μ is the mutation rate
- Under the neutral theory we expect theta = pi
- Can also calculate theta from S
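The card does not say how theta is obtained from S. A common choice, assumed here, is Watterson's estimator: S divided by the harmonic sum a1 = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences. The function name and input values are illustrative.

```python
def watterson_theta(S, n):
    """Watterson's estimator of theta from S segregating sites in n sequences."""
    if n < 2:
        raise ValueError("need at least two sequences")
    a1 = sum(1 / i for i in range(1, n))  # 1 + 1/2 + ... + 1/(n-1)
    return S / a1

# Hypothetical sample: 3 segregating sites among 4 sequences
theta_w = watterson_theta(3, 4)  # 3 / (11/6) = 18/11
print(round(theta_w, 3))
```

Under neutrality this estimate and pi are expected to agree on average, which is the comparison that Tajima's D formalizes.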
Tajima's D test:
- Compares pi (average pairwise divergence) with theta estimated from S (the neutral expectation of molecular diversity) to detect deviations in the allele frequency spectrum.
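A sketch of the statistic, assuming the standard form from Tajima (1989): the difference between pi and S/a1, divided by an estimate of its standard deviation. The variance constants are not given in these notes and are supplied here from that standard formulation; input values are hypothetical.

```python
from math import sqrt

def tajimas_d(pi, S, n):
    """Tajima's D from pi (per locus), S segregating sites and n sequences."""
    if n < 4 or S == 0:
        # The variance term is zero for n < 4, and D is undefined when S = 0
        raise ValueError("Tajima's D needs n >= 4 and S > 0")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: pi minus the S-based estimate of theta
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# pi above S/a1: intermediate-frequency variants dominate, D > 0
d_pos = tajimas_d(10 / 6, 3, 4)
# pi below S/a1: rare variants dominate (here every variant is a singleton), D < 0
d_neg = tajimas_d(1.5, 3, 4)
print(d_pos > 0, d_neg < 0)
```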
Coalescence:
3
- Looking back in time the lineages of all contemporary alleles will eventually coalesce to a single ancestor
- There are n-1 coalescence events for a sample of n alleles
- On average, it will take 2Ne generations to go from a large sample down to 2 lineages
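The 2Ne figure can be checked numerically. This sketch assumes the standard neutral coalescent for a diploid population, in which the expected time with k lineages is 4Ne / (k(k-1)) generations; that formula is not stated on the card, and Ne and n are hypothetical values.

```python
def expected_time_with_k_lineages(k, ne):
    """Expected generations during which exactly k lineages remain."""
    return 4 * ne / (k * (k - 1))

def expected_time_down_to(n, ne, stop=1):
    """Expected generations to go from n lineages down to `stop` lineages."""
    return sum(expected_time_with_k_lineages(k, ne) for k in range(stop + 1, n + 1))

ne, n = 1000, 100  # hypothetical effective population size and sample size

to_two = expected_time_down_to(n, ne, stop=2)     # about 2Ne: 4Ne * (1/2 - 1/n)
last_pair = expected_time_with_k_lineages(2, ne)  # another 2Ne for the final event
events = n - 1                                    # one coalescence per lineage lost
print(round(to_two), round(last_pair), events)
```

Under these assumptions the final coalescence of the last two lineages takes about as long, on average, as all earlier events combined.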
The effect of bottleneck on the coalescent:
4
- Population goes through the bottleneck and then expands again
- This will give us a coalescence tree with two arms separated by really long branch lengths.
- We will find most mutations occur in the two long arms
- Increased number of intermediate-frequency variants, giving a positive Tajima's D.
Positive Tajima's D:
3
- Increased number of intermediate-frequency variants
- A bottleneck is one explanation for a positive Tajima's D
- Deficit of rare variants
Negative Tajima's D:
2
- Excess of rare variants
- Consistent with coalescence in an expanding population
Do polymorphisms in nearby sites evolve independently?
2
- No! Not always
- Linkage disequilibrium is a measure of this
Linkage Disequilibrium (LD):
2
- Non-random association of alleles at different loci in a population
- Correlation between different sites that can be near each other, or on different chromosomes
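One way to put a number on this non-random association. The sketch assumes the common two-locus definitions, which the card does not give: D = p(AB) - p(A)p(B), and r^2 = D^2 / (p(A)(1-p(A))p(B)(1-p(B))). Haplotypes are two-character strings, and the sample is hypothetical.

```python
def linkage_disequilibrium(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, r^2) for two biallelic sites from a list of haplotypes.

    Assumes both sites are polymorphic; otherwise r^2 is undefined.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # zero when alleles associate at random
    r2 = d**2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

# Hypothetical sample in which AB and ab are over-represented
sample = ["AB"] * 4 + ["ab"] * 4 + ["Ab", "aB"]
D, r2 = linkage_disequilibrium(sample)  # p(AB) = 0.4 vs 0.25 expected
print(round(D, 3), round(r2, 3))
```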
Haplotype:
2
- One combination of allelic states that is inherited together
- Defined by an arbitrary number of sites
How does linkage disequilibrium get established in the first place?
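4
- Imagine 3 linked sites in linkage equilibrium (all 8 possible haplotypes present)
- A new mutation arises on one haplotype and increases in frequency by drift or selection
- The LD between the mutation and the alleles on that haplotype is maintained as it spreads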
- Recombination erodes LD over time
The probability that the haplotype is not broken down in G generations:
2
- P = (1 - c)^G
- Where c is the recombination rate between two polymorphisms
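A quick numerical illustration of the formula on this card, with hypothetical values for c and G; the function name is illustrative.

```python
def prob_haplotype_intact(c, generations):
    """Probability that no recombination separates two sites in G generations."""
    return (1 - c) ** generations

# Hypothetical recombination rates over 100 generations
tight = prob_haplotype_intact(0.001, 100)  # closely linked sites
loose = prob_haplotype_intact(0.01, 100)   # about 0.37
print(round(tight, 3), round(loose, 3))
```

The larger the recombination rate between the two polymorphisms, the faster the haplotype (and the LD it carries) decays.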
Factors affecting LD include
6
- Mutation
- Drift
- Limited recombination
- Admixture
- Bottlenecks
- Selective sweeps
Admixture:
2
- The mixing of two genetically differentiated populations
- LD is established between adjacent sites and between sites on different chromosomes
Extended LD:
Dependent on migration rate and recombination rate