Recombination Flashcards

1
Q

What is a haplotype?

A

A set of DNA variants that tend to be inherited together
- A chromosome has an enormous number of variable sites
- So different chromosomes carry different combinations of alleles at those sites (loci)

2
Q

How do haplotypes work in diploids?

A
  • A typing technique gives the genotype at each locus, but not which alleles sit together on each chromosome
  • E.g., an individual can be heterozygous at locus A (A1/A2) and heterozygous at locus B (B1/B2), so the haplotypes could be A1B1 + A2B2 or A1B2 + A2B1
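The phase ambiguity in this card can be made concrete by enumerating every pair of haplotypes consistent with a set of unphased genotypes. This is a minimal sketch in Python; the function name `possible_phases` and the allele labels are illustrative assumptions, not part of any real typing method.

```python
from itertools import product

def possible_phases(genotypes):
    """Enumerate haplotype pairs consistent with unphased genotypes.

    genotypes: list of (allele, allele) tuples, one per locus.
    Returns a set of frozensets, each holding the two haplotypes (tuples).
    """
    phases = set()
    # For each locus, choose which of the two alleles goes on haplotype 1;
    # the other allele goes on haplotype 2.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = tuple(g[c] for g, c in zip(genotypes, choice))
        hap2 = tuple(g[1 - c] for g, c in zip(genotypes, choice))
        # A frozenset makes (hap1, hap2) and (hap2, hap1) count once.
        phases.add(frozenset([hap1, hap2]))
    return phases

# Double heterozygote: A1/A2 at locus A, B1/B2 at locus B.
phases = possible_phases([("A1", "A2"), ("B1", "B2")])
for pair in sorted(sorted(p) for p in phases):
    print(pair)
```

For the double heterozygote the sketch finds two candidate phases (A1B1 with A2B2, or A1B2 with A2B1), which genotype data alone cannot tell apart.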
3
Q

What are ‘phases’ of haplotypes?

A

The haplotypic combinations actually present in an individual - i.e., which alleles at different loci lie on the same chromosome

4
Q

What methods can we use to find out the haplotype phases?

A

It is challenging:
1. Allele-specific PCR and next-generation sequencing (NGS)
- ‘Long-read’ NGS can give sequences for whole chromosome arms
- Alleles yield PCR products of different sizes
- The correct combination of allele-specific primers amplifies each haplotype
- Algorithmic analysis of sequences

2. Somatic cell hybrids
- A more experimental approach
- Fusion of mouse and human cells
- Because the hybrid cells carry chromosomes from mixed species, chromosomes are ejected as the cell lines are propagated
- Progressive loss of chromosomes during cell division
- Can select cells that contain only a single human chromosome
4
Q

What is recombination?

A

The shuffling of chromosome segments to generate new haplotype combinations

4
Q

What is the generation of new haplotypes caused by?

A
  • Most often - Recombination
  • Mutation - less common
5
Q

When / how does recombination occur?

A

During meiosis - when homologous maternal and paternal chromosomes align and cross over
- Humans - recombination rate ~ 1 to 10 events per chromosome
- Occurs throughout the genome - but there are ‘recombination hotspots’ as well as ‘cold regions’
- Can take the form of gene conversion (non-reciprocal recombination) or conventional crossing over (reciprocal recombination)
- The further apart two sequences are - the higher the probability of recombination between them

6
Q

Why is recombination important?

A

Modulates / influences each of the 4 evolutionary forces discussed in these lectures
- Fundamental part of sexual reproduction
- Creates novel combinations of genes
- Purges genome of deleterious mutations (removes them)
- Increases efficiency of natural selection - reduces interference between loci under different selection regimes
- Responsible for different sequences having different ancestral histories - increases the information available about the past but also increases the complexity of the analysis
- Can be exploited to infer population history - (e.g., new selection tests, admixture)
- Can be exploited to help locate genes of interest - (e.g., disease loci in humans)

7
Q

Why is there variation in recombination in different parts of the chromosomes?

A
  • Due to recombinogenic motifs that are found across the human genome
  • These motifs mark recombination hotspots
  • Motifs are often found in transposable element sequences
  • Rate of recombination drops off rapidly away from motif
8
Q

How do population processes and recombination interact with the strength of linkage on chromosomes?

A
  • On one hand - population processes, e.g., demographic changes and positive selection - increase linkage / linkage disequilibrium
  • On the other hand - recombination reduces linkage - so a high recombination rate decreases linkage
  • Genetic drift - increases linkage between sites on a chromosome
9
Q

What is linkage disequilibrium?

A

LD is the non-random association of alleles at different polymorphic sites (loci) in a population
- E.g., when loci are in LD, we can predict the variant at one site if we know the variant at another site
- Sites close together are less likely to have recombination between them - so are more likely to be in LD
- E.g., if no LD exists (linkage equilibrium) - the frequency of each haplotype equals the product of its allele frequencies
- Recombination decreases LD in each generation
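The product rule and the per-generation decay can be written down directly. This is a minimal sketch in Python, assuming two biallelic loci, random mating, and a constant recombination fraction `r` between them. The function names are hypothetical, and the decay formula D_t = D_0 (1 - r)^t is the standard textbook expectation rather than something stated on the card.

```python
def ld_coefficient(p_ab, p_a, p_b):
    """D: observed haplotype frequency minus the product of allele frequencies.

    D = 0 corresponds to linkage equilibrium.
    """
    return p_ab - p_a * p_b

def ld_after(d0, r, generations):
    """Expected D after a number of generations of random mating.

    Assumes each generation removes a fraction r of the remaining D.
    """
    return d0 * (1 - r) ** generations

# Illustrative (made-up) frequencies: haplotype AB at 0.4, alleles A and B at 0.5.
d0 = ld_coefficient(0.4, 0.5, 0.5)
for t in (0, 10, 50):
    print(t, ld_after(d0, r=0.05, generations=t))
```

Under these assumptions, tightly linked sites (small `r`) keep their LD for many generations, while loosely linked sites lose it quickly.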

10
Q

What is linkage disequilibrium important in studying?

A
  • Important in studying selection, demographic scenarios (like admixture) and in association studies
  • LD can tell us that a demographic event may have occurred
11
Q

What processes increase LD?

A
  • Positive selection (selective sweeps)
  • Drift in small populations
  • Population bottlenecks (rapid population growth, by contrast, tends to reduce LD by weakening drift)
  • Population structure and admixture
12
Q

How can you measure LD - quantify strength of LD?

A
  • Lewontin's D and D’ - based on frequencies of alleles and haplotypes in populations
  • r^2 - the squared correlation between alleles at two sites, often plotted against chromosomal distance
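All three measures can be computed from one haplotype frequency and two allele frequencies. This is a minimal sketch in Python, assuming two biallelic loci; the function name `ld_measures` is hypothetical, and the formulas are the usual textbook definitions (D’ = D / D_max, r^2 = D^2 / (pA(1 - pA)pB(1 - pB))) rather than anything given on the card.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D_max is the largest |D| possible given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0  # keeps the sign of D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Complete association: A is always found with B.
print(ld_measures(0.5, 0.5, 0.5))
# Linkage equilibrium: haplotype frequency equals the product.
print(ld_measures(0.25, 0.5, 0.5))
```

D’ rescales D by its maximum possible value, so it is less sensitive to allele frequencies than raw D. Some texts report |D’| rather than the signed value used here.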
13
Q

How do linkage equilibrium/disequilibrium differ?

A
  • LE: alleles at different loci are associated in proportion to their allele frequencies (haplotype frequency = product of allele frequencies)
  • LD: alleles at different loci show association or dissociation relative to their allele frequencies
14
Q

For Lewontin's D - what do the D values indicate about LD?

A

D measures the deviation of the observed haplotype frequencies from those expected under random assortment and free recombination
- D = 0 - linkage equilibrium
- D > 0 - Association between alleles - occur together more often than expected - coupling phase
- D < 0 - Dissociation between alleles - occur together less often than expected - repulsion phase
- D ranges between -0.25 and 0.25

15
Q

For linked markers, how does LD change with chromosomal distance?

A
  • LD decreases as a function of chromosomal distance due to recombination
  • The size of the chromosomal blocks over which LD extends can be informative - forms part of hypothesis testing around demographic/selection processes
16
Q

What is admixture?

A

Two genetically distinct (isolated) populations coming together

17
Q

How does LD work in admixture?

A
  • In an admixed population - variable genes from the two source populations start off in LD
  • Since LD in random-mating populations decreases over time - recently admixed populations have long-range LD whereas anciently admixed ones have short-range or little LD
  • Can use this information (e.g., alongside association studies) to date the admixture - determine when the two populations mixed together
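The dating idea can be sketched by inverting the expected decay of LD. This is a minimal illustration in Python, assuming a single admixture pulse, random mating afterwards, a known recombination fraction `r`, and a known starting value of D. The function name is hypothetical, and real dating methods fit decay curves across many marker pairs rather than using one value.

```python
import math

def generations_since_admixture(d_now, d_initial, r):
    """Solve D_now = D_initial * (1 - r)**t for t.

    Assumes both D values have the same sign and 0 < r < 1.
    """
    ratio = d_now / d_initial
    if not 0 < ratio <= 1:
        raise ValueError("d_now must be a same-sign fraction of d_initial")
    if not 0 < r < 1:
        raise ValueError("r must lie strictly between 0 and 1")
    return math.log(ratio) / math.log(1 - r)

# Illustrative (made-up) numbers: D fell from 0.20 to 0.05 between
# markers with a recombination fraction of 0.01.
print(generations_since_admixture(0.05, 0.20, 0.01))
```

Little remaining LD at a given distance implies an old admixture event; long-range LD that is still strong implies a recent one.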
18
Q

What does the amount of LD in a population depend on?

A

Amount of LD in populations depends on recombination rate + when an admixture event may have occurred in the past
- Recombination high = LD reduces rapidly
- Recent admixture = high LD, ancient admixture = lower LD

19
Q

Give an example of an admixture event in humans

A
  • Lemba - Bantu-speaking Africans who claim Jewish ancestry - confirmed by Y-chromosome studies
  • Can compare observed Lemba LD with predictions from computer simulations of mixing Bantu with Jewish (Ashkenazi) populations
  • LD persists much longer in the Lemba - matching what the simulations predict for an admixed population
20
Q

How can you think about the coalescent with recombination?

A
  • Recombination in a coalescent tree (a) means that different parts of the sequence have different trees (b)
  • If we combine them (c) - we no longer have a strictly branching tree - but a network
  • This is the ‘ancestral recombination graph’ : the basis for modelling the coalescent with recombination
21
Q

How do selection forces affect diversity at linked loci and give an example?

A

Forces at one position affect the surrounding areas of the genome:
- Purifying selection: reduction in diversity at linked loci (background selection)
- Positive selection: hitchhiking effect leads to selective sweep
- e.g., selection on the tb1 gene during domestication of maize - loss of nucleotide diversity (π) in the 5’ upstream regulatory region (caused by selective breeding) - a localized loss of diversity is an important indicator of where positive selection may have occurred

22
Q

Explain the process of hitchhiking and selective sweeps?

A
  • Target allele is pushed up to high frequency by positive selection (selective sweep)
  • Other alleles at linked loci on the same genetic background are also pushed up to high frequency - hitchhiking
  • Recombination continually breaks up LD, whether it is caused by a sweep or by, e.g., drift
23
Q

How is the amount of LD determined?

A

Depends on recombination rate and intensity of selection pressure:
- Mild selection pressure + high recombination rate = low LD - the selective sweep is minor and confined to a small area
- Strong selection pressure + low recombination rate = high LD - the selective sweep is intense over a much larger area of the chromosome - so even very distant alleles are swept up to high frequency

24
Q

How would you use hitchhiking and selective sweeps for testing?

A
  • The effect of a selective sweep is strongest in sequences adjacent to the selected site and decreases with distance - so LD decreases with distance too
  • Can test for selection
  • But need a good estimate of recombination rate parameter
25
Q

Give an example of a test that can be used to detect and quantify LD

A

Long-Range Haplotype (LRH) test:
- Measures relationship between allele frequency and extent of LD
- Long-range LD develops when an advantageous allele rises in frequency faster than recombination breaks down LD on its haplotype
- Extended Haplotype Homozygosity (EHH) - probability that two randomly chosen sequences carrying the core haplotype/SNP are identical by descent (similar to homozygosity)
- EHH decreases to 0 at increasing distances from the core
- Positive selection is detected as unusually high EHH
- Generate EHH scores for alleles at different loci along the genome: lower score = more recombinants - look for sites where EHH is higher than normal
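EHH can be sketched as the fraction of pairs of core-carrying haplotypes that are identical across the interval from the core to a given marker. This is a minimal illustration in Python, assuming phased haplotypes written as equal-length strings and using identity by state as a stand-in for identity by descent. The function name `ehh` and the toy data are hypothetical.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core, core_allele, end):
    """EHH for carriers of core_allele, from index core out to index end.

    Counts pairs of carriers whose sequences match over the whole interval,
    divided by the total number of carrier pairs.
    """
    carriers = [h for h in haplotypes if h[core] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0
    lo, hi = min(core, end), max(core, end)
    # Group carriers by their extended haplotype over the interval.
    classes = Counter(h[lo:hi + 1] for h in carriers)
    return sum(comb(c, 2) for c in classes.values()) / comb(n, 2)

# Toy data: position 0 is the core SNP; '1' is the allele of interest.
haplotypes = ["1000", "1000", "1001", "1010", "0111"]
for end in range(4):
    print(end, ehh(haplotypes, core=0, core_allele="1", end=end))
```

In the toy data EHH starts at 1 at the core and falls as the interval grows, because each recombinant or mutant haplotype splits the carriers into more classes.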

26
Q

Give an example of this LRH/EHH process?

A

Lactase gene haplotype on chromosome 2 - is longer in Europeans - suggesting strong selection
- More than 250 candidate genes under selection identified - relating to pathogen resistance, metabolism (diet) and brain development
- Indicates recent selective sweeps in European populations - due to development of farming/agriculture - less strong selection in African populations

27
Q

Give an example of how you can do genome-wide detection of positive selection

A

Using XP-EHH - detects selective sweeps where the selected allele rises to high frequency in one population but remains polymorphic in the human population as a whole
- e.g., SLC24A5 - in natural skin color variation - Europeans vs West Africans
- Derived Ala111Thr allele at SLC24A5 gene influences light skin tone of Europeans - polymorphism may account for 25-40% of difference between Europeans and West Africans

28
Q

What patterns would be expected for deleterious background selection?

A
  • Deleterious alleles (that reduce fitness) are constantly being generated in population
  • Purifying selection will remove deleterious alleles
  • Linked neutral variation will be removed along with deleterious alleles
  • However, this results in the loss of only the specific haplotype(s) on which the deleterious mutation arose - leaving other variation unaffected
  • Loss of neutral variation: Expect a slight reduction in amount of diversity relative to the expectation for a case with no selection at all - but less strong than a strong selective sweep
  • Is the dominant form of selection in the human genome
29
Q

What might this deleterious background selection look like on a gene genealogy?

A

Is visualised by losing just a few terminal twigs from the gene genealogy

30
Q

How does this deleterious background selection compare with the positive selection scenario and what happens to the gene genealogy?

A
  • +ve selection scenario drives a haplotype to high frequency - substantially reducing linked variation - lots of branches and twigs lost from gene genealogy
  • Deleterious background selection - only specific haplotypes are lost - where deleterious mutation arose - so only a few terminal twigs are lost from gene genealogy
31
Q

Directly compare features of selective sweep vs background selection
(type of selection, effect on Ne, excess of alleles and genealogy tree effect)

A

Selective sweep:
- Caused by positive selection (rare)
- Reduction in neutral diversity (Ne)
- Excess of low frequency alleles - singletons
- Short internal branches in coalescent tree

Background Selection:
- Caused by purifying selection (common and continual)
- Reduction in neutral diversity (Ne) - removes individual haplotypes from tree
- Frequency spectrum similar to Neutral expectation - because only removing individual haplotypes
- Phylogenetic tree will appear the same as neutral coalescent tree

32
Q

How does linkage disequilibrium differ depending on population size?

A
  • Small population: high drift = lineages share a common ancestor in the recent past - so sites will be tightly linked (lots of linkage) = high LD
  • Large population: lower drift = common ancestors of lineages in distant past = lots of time for recombination to reduce LD even between adjacent sites
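The dependence on population size can be illustrated with a classic approximation for the expected r^2 at drift-recombination equilibrium, E[r^2] ≈ 1 / (1 + 4·Ne·c), where Ne is the effective population size and c the recombination fraction between the sites. This is a minimal sketch in Python; the approximation is brought in from outside the card, ignores mutation and sampling error, and the function name is hypothetical.

```python
def expected_r2(ne, c):
    """Approximate equilibrium E[r^2] for effective size ne and recombination fraction c."""
    return 1.0 / (1.0 + 4.0 * ne * c)

# Same pair of sites (c = 0.001) in a small and a large population.
for ne in (100, 10_000):
    print(ne, expected_r2(ne, 0.001))
```

Under this approximation the small population keeps substantial LD between the two sites, while the large population retains very little.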