Lecture 6 Flashcards
Examples of local adaptation in other organisms
- Rock pocket mice (mice in the southwestern USA that live on different desert substrates. Their coat colour has adapted to match the substrate they live on)
- Three-spined sticklebacks: after the ice age, some were left in glacial lakes with no predators and so evolved lighter armour
- Darwin's finches in the Galápagos Islands evolved differently shaped beaks depending on their food resource
What's the challenge of trying to detect local adaptation in humans?
We often don’t know which traits are adaptive, so instead we can look at the genetic diversity (or lack of genetic diversity) within a population
What was used for the first attempts at finding selective sweep regions
The extended haplotype homozygosity (EHH) test was developed, which uses the concept of core haplotypes
The approach was tested on two genes known to be associated with resistance to malaria; allele 202A at G6PD confers a 50% reduced risk
What are core haplotypes
A region of the genome: a series of SNPs that are close to one another, with particular alleles in linkage disequilibrium with one another
How does the EHH work
- Define the core haplotypes in a gene
- For each haplotype, examine all the individuals that carry it and “walk” to the next SNP, one direction at a time
- Ask whether all those individuals are the same at that new SNP. If yes, keep walking; if no, stop (this is where the extended homozygosity comes in)
- Measure how far each haplotype extends. Long haplotypes provide evidence of selective sweeps (extended haplotype homozygosity)
- Why? Because over time recombination moves alleles between haplotypes, so haplotypes that extend a long way must have reached a high frequency recently. Usually this happens because they are under positive selection
Recent positive selection should result in these extended homozygous haplotypes
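Worked sketch (not from the lecture): one way to put a number on the walk described above. EHH at a given distance is taken here as the probability that two randomly chosen chromosomes carrying the core haplotype are identical over the whole stretch from the core to that distance. The function name `ehh` and the toy sequences are invented for illustration.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, distance):
    """EHH across the SNPs from core_index out to core_index + distance.

    haplotypes: equal-length strings, one per chromosome, all carrying the
    same core allele. A negative distance walks in the other direction.
    Returns the share of chromosome pairs that are identical over the interval.
    """
    lo, hi = sorted((core_index, core_index + distance))
    counts = Counter(h[lo:hi + 1] for h in haplotypes)
    total_pairs = comb(len(haplotypes), 2)
    same_pairs = sum(comb(n, 2) for n in counts.values())
    return same_pairs / total_pairs


# Toy data (invented): six chromosomes that all carry "A" at the core SNP (index 3).
swept = ["CCTAGGT", "CCTAGGT", "CCTAGGT", "CCTAGGT", "CCTAGGA", "GCTAGGT"]
neutral = ["CCTAGGT", "GATACGA", "CCAAGCT", "GCTACGT", "CATAGGA", "CCTATCT"]

# EHH starts at 1 at the core and decays as the walk moves outward.
swept_decay = [ehh(swept, 3, d) for d in range(4)]
neutral_decay = [ehh(neutral, 3, d) for d in range(4)]
print(swept_decay)
print(neutral_decay)
```

In the toy data the "swept" set stays near 1 for longer than the "neutral" set, which is the pattern the test looks for.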
Showing EHH results graphically
Filled in circle- core haplotype.
Different branches coming away from it. Every time a branch splits, some of those individuals have different alleles to others.
The thickness of the branch indicates what proportion of people have that particular haplotype
Haplotype 8 looks like it has the best evidence for selective sweeps
A look at G6PD, core haplotype 8
- Ran many simulations of genes evolving neutrally to predict what EHH looks like without selection
- They plotted the core haplotypes from those simulations alongside the real data
- Simulations determine the typical EHH
- Each dot represents a different simulated gene/ haplotype.
Haplotype 8 is a massive outlier, so something non-neutral is happening
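Worked sketch (not from the lecture): one simple way to express "massive outlier" is an empirical p-value, the share of neutral simulations that look at least as extreme as the observed haplotype. The simulated values below are invented, and the +1 correction is just a common convention, not necessarily what the original study did.

```python
def empirical_p(observed, simulated):
    """Share of neutral simulations with a value at least as large as observed.

    The +1 in numerator and denominator stops the estimate from being exactly 0.
    """
    as_extreme = sum(1 for s in simulated if s >= observed)
    return (as_extreme + 1) / (len(simulated) + 1)


# Invented EHH values from nine neutral simulations.
neutral_ehh = [0.05, 0.10, 0.08, 0.20, 0.15, 0.12, 0.07, 0.30, 0.18]

p_outlier = empirical_p(0.90, neutral_ehh)  # no simulation reaches 0.90
p_typical = empirical_p(0.12, neutral_ehh)  # five simulations are >= 0.12
print(p_outlier, p_typical)
```

With only nine simulations the smallest possible p-value is 0.1, which is why real studies run very large numbers of simulations.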
HapMap phase 2 data
This dataset had 3 million SNPs typed in 210 people from 3 continents (420 chromosomes, because each person carries two copies of the genome)
What’s the cross population extended haplotype homozygosity test (XP-EHH)
Compares two populations to identify which one selection has happened in, by looking for an allele that has reached fixation in one population but is still polymorphic in the other
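Worked sketch (not from the lecture): the core of XP-EHH is usually described as the log of the ratio of the areas under the EHH decay curves in the two populations. The published test also standardises scores across the genome, which is left out here. Function names, positions and EHH values are invented for illustration.

```python
from math import log


def integrated_ehh(positions, ehh_values):
    """Area under an EHH decay curve, using the trapezoid rule."""
    area = 0.0
    for i in range(1, len(positions)):
        width = positions[i] - positions[i - 1]
        area += width * (ehh_values[i] + ehh_values[i - 1]) / 2
    return area


def unstandardised_xp_ehh(positions, ehh_a, ehh_b):
    """Log ratio of integrated EHH: positive suggests a sweep in population A."""
    return log(integrated_ehh(positions, ehh_a) / integrated_ehh(positions, ehh_b))


# Invented data: distance from the core SNP (kb) and EHH at each distance.
positions = [0, 10, 20, 30]
ehh_pop_a = [1.0, 0.9, 0.8, 0.7]  # slow decay, as expected after a sweep
ehh_pop_b = [1.0, 0.5, 0.2, 0.1]  # fast decay, as expected under neutrality

score = unstandardised_xp_ehh(positions, ehh_pop_a, ehh_pop_b)
print(score)
```

A positive score points to population A as the one where the haplotype is unusually long; swapping the two populations flips the sign.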
How was selection detected in cross population EHH test
Used pairwise comparisons, e.g. European-African and Asian-African, which showed greater differences than European-Asian comparisons
Bordered and filled symbols are SNPs of likely functional importance. 233 SNPs had the derived allele at high frequency
Of those, 39 were highly different between populations
and of those, 5 were at low frequency in the non-selected populations. Only 1 is filled in and functionally important: the SNP in SLC24A5, which results in an amino acid change from alanine to threonine
SLC24A5 (solute carrier family 24 member 5)- distribution and what does it do
This derived allele is at high frequency in Europeans and low frequency elsewhere
This gene is known to be involved in pigmentation: it plays a role in the synthesis of melanosomes, which are found in skin cells
The derived allele is associated with lighter skin
Light skin is possibly required for adequate vitamin D synthesis at higher latitudes
What's the vitamin D synthesis hypothesis
Lighter skin has fewer, smaller, paler melanosomes (vesicles within melanocytes)
Dark skin protects against the effects of ultraviolet radiation (UVR); without that protection, UVR leads to sunburn (short term), cancer, nutrient degradation and neural tube defects (NTDs)
But UVR is required for vitamin D synthesis
In low sunlight conditions, vitamin D synthesis may be inadequate unless skin is pale (i.e. the derived SLC24A5 allele will be favoured)
Humans only evolved lighter skin in Europe and Asia relatively recently
How common was/is this selection? (of SLC24A5)
- 300 candidate regions
- 22 of them exceeded a threshold never seen in 10Gb of simulated data
- These included the pigmentation gene, as well as genes associated with lactase persistence, resistance to Lassa fever and hair colour/density
Tibetan adaptations compared to other humans
Tibetan people live at >3000m without suffering from the effects of hypoxia (insufficient oxygen).
Most of us would be at risk of altitude sickness at altitudes >2500m
Tibetan plateau has been populated for more than 10,000 years
What were the rationale, predictions and approach behind the Tibetan studies
- Rationale: any regions involved in high altitude adaptation should be different between Tibetans and a closely related lowland population
- Prediction: greater differentiation at relevant genes than elsewhere in the genome
- Approach: compared 500k SNPs between Tibetans (n=35) and Han Chinese (n=84)
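Worked sketch (not from the lecture): the prediction above can be checked per SNP by measuring how differentiated the two populations are and ranking the SNPs. The measure used here is a simple textbook F_ST from allele frequencies with equal population weights, swapped in for illustration; it is not the significance test plotted in the paper. SNP names and frequencies are invented.

```python
def fst(p1, p2):
    """F_ST for one biallelic SNP from allele frequencies in two populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # allele fixed or absent in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Invented allele frequencies: (highland population, lowland population).
snps = {
    "snp_a": (0.52, 0.50),
    "snp_b": (0.30, 0.35),
    "snp_c": (0.90, 0.10),  # strongly differentiated, a candidate for selection
}

ranked = sorted(snps, key=lambda name: fst(*snps[name]), reverse=True)
print(ranked)
```

SNPs under local selection should sit at the top of the ranking, well above the genome-wide background.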
What were the results from the Beall et al paper
- Looked at the genetic variation between the two populations and made a Manhattan plot to look for signatures in genetic variation
Y-axis is the statistical significance of the difference in allele frequencies
SNPs around EPAS1 on chromosome 2 are highly divergent
What is EPAS1
Most people produce more haemoglobin/red blood cells (known as erythrocytosis) at high altitudes. Tibetans do not. Thickening of blood causes chronic mountain sickness
EPAS1 encodes a transcription factor called hypoxia inducible factor 2 alpha / HIF2a
HIF2a regulates erythropoietin (EPO) which stimulates red blood cell production
(EPO is a drug that some endurance athletes use to cheat)
What does the Simonson paper discuss
Used gene ontology databases to identify 247 functional candidate genes
Typed 31 Tibetans at 1 million SNPs and compared them to 45 HapMap Chinese/ Japanese samples
Cross population extended haplotype homozygosity test to compare the populations
Integrated haplotype score (iHS) test to find regions that have undergone sweeps within a population
Genes of interest - Simonson paper
Looked at genes within small regions around those SNPs: EPAS1, EGLN1 (significant).
EPAS1 is significant under both tests
Yi paper
Sequenced the exomes of 50 people from Tibetan villages
Sequenced them at 18x coverage across 20,000 genes (34 million base pairs)
About 1% of our genome- efficient way of sequencing what’s potentially the most intricate part of the genome
They compared the sequences with 40 genomes of Han Chinese from the 1000 Genomes Project
Population branch statistic- what did they find
To estimate which population natural selection had occurred in
Two closely related populations - compare them both to a more distantly related population and work out how many differences there are between the populations - measure that as a branch length
EPAS1 was number one in the whole analysis
EPAS1 showed associations with erythrocyte count and haemoglobin levels
In both cases “Tibetan allele associated with lower values”
Han and Tibetans only diverged about 2500-3000 years ago - fastest known example of divergence between human populations due to natural selection at a single gene
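Worked sketch (not from the lecture): the branch-length idea is commonly written as PBS = (T_focal-sister + T_focal-outgroup - T_sister-outgroup) / 2, where each T = -ln(1 - F_ST) for that pair of populations. The function and argument names and the F_ST values below are invented for illustration, and every F_ST must be below 1.

```python
from math import log


def branch_time(fst):
    """Turn a pairwise F_ST into an additive divergence measure."""
    return -log(1 - fst)


def pbs(fst_focal_sister, fst_focal_outgroup, fst_sister_outgroup):
    """Length of the branch leading to the focal population."""
    t_fs = branch_time(fst_focal_sister)
    t_fo = branch_time(fst_focal_outgroup)
    t_so = branch_time(fst_sister_outgroup)
    return (t_fs + t_fo - t_so) / 2


# Invented F_ST values for one gene: focal = Tibetan, sister = Han,
# outgroup = the more distantly related population.
selected_gene = pbs(0.60, 0.65, 0.10)  # change concentrated on the focal branch
typical_gene = pbs(0.02, 0.10, 0.10)   # little change on the focal branch
print(selected_gene, typical_gene)
```

A long focal branch means most of the difference between the two close populations arose in the focal one, which is how the statistic points to the population where selection acted.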
Summary of papers
- 2 genes repeatedly shown as being targets of selection
- Both genes are part of the hypoxia-inducible factor (HIF) pathway
Follow up paper exploring region around EPAS1 in more detail
Sequenced the region around EPAS1 in 40 Tibetans and 40 Han Chinese
477 SNPs with especially high Fst in 32.7 Kbp region
Inferred EPAS1 Haplotypes
They defined all the different haplotypes in different human populations
- Core haplotypes based on 5 SNPs
- Most human populations have haplotype GAAGG at high frequency
- Some African populations show more diversity, which is typical for most of the genome. The population that stands out is the Tibetan one
- Tibetans have AGGAA which is also seen in Denisovans but only rarely in other humans
What did the neighbour-joining tree for Tibetan EPAS1 haplotypes show
Tibetan haplotypes share common ancestry with the Denisovan haplotype (more closely related to Denisovans than to any other human population)
This suggests that when humans and Denisovans mixed, that haplotype was introduced into humans from Denisovans and is found at high frequency in Tibetans because it was advantageous once humans moved to high altitudes
What was revealed on 1 May 2019
Denisova Cave (Siberia) is a long way from Tibet, and there was no evidence that Denisovans had ever lived in Tibet
Anatomically modern humans (AMHs) have been in Tibet for ~30-40KY
- Subsequently, a jaw bone was found on the Tibetan plateau
- Fossil was dated to ~160KYA (by uranium-series dating; this is far too old for carbon dating)
- No DNA retrievable from Tibetan fossil, but protein sequence obtained from a tooth
- Fossil was a Denisovan – presumably admixture within Tibet