L31 Functional Human Genetics 1 Flashcards
The case of the missing heritability
see onenote
- traits are more complicated than we thought
- deciphering how genotypes impact phenotypes is hard
What do we mean by heritable?
see onenote
Heritable traits
- E.g. height, eye colour
- Proportion of phenotypic variation due to genetic variation in a population
- Broad sense heritability H^2
- Narrow sense heritability h^2
- Trait needs to be variable for it to be heritable (due to the way we define heritability)
How much of a trait is due to genetics?
- how do we measure it?
see onenote
- Partition phenotypic variation into genetic and environmental variation
- Twin studies
- Monozygotic twins share the same genotype, so Vg = 0 between them and any differences would be due to the environment
- Dizygotic twins don’t have the same genotype but presumably they have the same environment
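- Comparison between parent and offspring (regress offspring phenotype on parent phenotype)
- Higher heritability (more additive variance) = steeper slope
- E.g. h^2 = 0.8 means 80% of the variation in height in that population is due to additive genetic variation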
- Heritability of trait changes as environment changes, heritability is not a fixed value through time and space
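A minimal sketch of the parent-offspring comparison, assuming offspring phenotype is regressed on the mid-parent value (the mean of the two parents), in which case the slope is an estimate of h^2. The function name and the height values are made up for illustration only.

```python
def regression_slope(xs, ys):
    """Least-squares slope of ys on xs: cov(x, y) / var(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


# Hypothetical heights in cm: mid-parent value and offspring value per family.
midparent = [160, 170, 180, 190]
offspring = [164, 172, 180, 188]

# Slope of offspring on mid-parent is taken as the h^2 estimate.
h2_estimate = regression_slope(midparent, offspring)
print(h2_estimate)
```

A steeper slope means offspring resemble their parents more closely, i.e. more of the variation is additive genetic.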
Broad-sense and narrow-sense heritability
see onenote
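Vg = Va + Vd + Vi (Va = additive variance, Vd = dominance variance, Vi = epistatic variance)
Vp = Vg + Ve (total phenotypic variance = genetic + environmental variance)
broad sense heritability
H^2 = Vg/Vp
narrow sense heritability
h^2 = Va/Vp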
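A small worked example of the variance partition, using the standard definitions H^2 = Vg/Vp and h^2 = Va/Vp with Vp = Vg + Ve. The function name and the variance values are hypothetical, and gene-environment covariance and interaction terms are assumed to be zero.

```python
def heritabilities(va, vd, vi, ve):
    """Return (H2, h2) from additive, dominance, epistatic and
    environmental variance components."""
    vg = va + vd + vi   # total genetic variance
    vp = vg + ve        # total phenotypic variance
    broad = vg / vp     # H^2: all genetic variance
    narrow = va / vp    # h^2: additive variance only
    return broad, narrow


# Hypothetical variance components (arbitrary units).
H2, h2 = heritabilities(va=40, vd=10, vi=5, ve=45)
print(H2, h2)
```

Because Va is only one part of Vg, h^2 can never exceed H^2.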
Heritability in a changing environment
see onenote
Heritability estimates for some complex traits
see onenote
Simple (mendelian): 1 locus
Complex
- oligogenic: 2-10 loci
- multigenic: 10-100 loci
- polygenic: 100+ loci
Means of going from phenotype to genotype
see onenote
linkage mapping
- Use a segregating pedigree to construct a linkage map
- Requires extensive pedigree to reach significance
- Causal variants are difficult to identify for traits with complex architecture
candidate gene association mapping
- test one or a handful of pre-selected loci for association with the trait in cases and controls
GWAS
- Test for association with many markers e.g. SNPs
- A significant SNP is either the disease-causing variant or is in LD with the disease-causing variant
- The region in LD with the significant SNP is implicated in the trait
- Cryptic carriers in the control group will decrease power, e.g. individuals who don’t have diabetes yet but will soon develop it
- Super controls = individuals with low insulin resistance values (not resistant to insulin) who probably won’t get diabetes
Pitfalls of linkage and association mapping
see onenote
linkage mapping requires pedigrees to reach significance
- suited to monogenic disorders
candidate gene mapping relies on a priori assumptions about trait aetiology and causality
- meaningless if the locus is misidentified
- much contributing variation can be missed
GWAS for human traits
- allows us to test millions of polymorphisms for association with our chosen trait/disease
- we should be able to identify statistically significant associations between variants and trait in the sampled population
crucial assumptions
- significant SNPs are either the disease-causing variant (rare) or in LD with the disease causing variant (common)
The HapMap empowered GWAS
see onenote
- to successfully conduct GWAS we need a precise map of the LD structure of the genome to map traits back to genotypes
- by including trios (father-mother-child), HapMap enabled us to generate one
Inferring haplotypes and linkage blocks
see onenote slides
- phasing is the act of deducing haplotype structure from genotype data
- once we have identified haplotypes empirically using our trios, we can use probability to phase unrelated samples
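A minimal sketch of how a trio lets us phase a child, assuming biallelic sites coded 0/1 and no genotyping errors. At each site we ask which allele the child could have received from each parent. If only one assignment is possible, the site is phased. All names and genotypes here are hypothetical, and real phasing software is far more sophisticated.

```python
def phase_site(father, mother, child):
    """Return (paternal_allele, maternal_allele) for the child at one site,
    or None if the trio genotypes leave the phase ambiguous."""
    options = {
        (p, m)
        for p in set(father)
        for m in set(mother)
        if sorted((p, m)) == sorted(child)
    }
    return options.pop() if len(options) == 1 else None


def phase_child(father, mother, child):
    """Build the child's paternal and maternal haplotypes site by site.
    Sites that cannot be resolved are marked '?'."""
    paternal, maternal = [], []
    for f, m, c in zip(father, mother, child):
        phased = phase_site(f, m, c)
        p, q = phased if phased is not None else ("?", "?")
        paternal.append(p)
        maternal.append(q)
    return paternal, maternal


# Hypothetical genotypes at three sites; each genotype is a pair of alleles.
father = [(0, 0), (1, 1), (0, 1)]
mother = [(1, 1), (0, 1), (0, 1)]
child = [(0, 1), (0, 1), (0, 1)]

paternal_hap, maternal_hap = phase_child(father, mother, child)
print(paternal_hap, maternal_hap)
```

The third site stays unresolved because all three individuals are heterozygous; sites like this are where probabilistic phasing against known haplotypes is needed.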
From haplotypes to LD
see onenote slides
- haplotype blocks can be formalised into LD blocks
- strength of LD is captured by r^2
r^2 = the square of the correlation coefficient between alleles at two loci
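A short sketch of computing r^2 from phased haplotype frequencies, assuming two biallelic loci with alleles A/a and B/b. It uses the standard formula D = p(AB) - p(A)p(B) and r^2 = D^2 / (p(A)(1 - p(A))p(B)(1 - p(B))). The function name and frequencies are made up for illustration.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Perfect LD: A always travels with B.
print(ld_r_squared(p_ab=0.5, p_a=0.5, p_b=0.5))

# No LD: the AB haplotype is as common as expected by chance.
print(ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5))
```

r^2 = 1 means one SNP perfectly predicts the other, which is what makes it usable as a tag SNP.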
Why is accurate phasing important?
see onenote
- GWAS rely on tag SNPs
- tag SNP = a SNP that summarises variation within an LD block
Case-control GWAS - the classic design
see onenote
- compare cases and controls sampled from the same population
- cryptic carriers in control group will also decrease power e.g. pre-symptomatic individuals
Association testing
see onenote
The success of GWAS depends on:
- well differentiated cases and controls drawn from the same population
- sufficient statistical power to detect significant associations
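One common way to test a single SNP in a case-control design is an allelic chi-square test on a 2x2 table of allele counts. The sketch below assumes that test (the notes do not say which test was used), with 1 degree of freedom and made-up counts. The p-value uses the identity P(chi2 > x) = erfc(sqrt(x / 2)) for 1 degree of freedom.

```python
import math


def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 df) on a 2x2 table of allele counts.
    Returns (chi2 statistic, p-value)."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: the risk allele is more common in cases.
chi2, p = allelic_chi_square(case_risk=60, case_other=40,
                             control_risk=40, control_other=60)
print(chi2, p)
```

With only 100 alleles per group this difference is nowhere near genome-wide significance, which illustrates why power depends on sample size.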
Matters of power and significance
see onenote
- the community threshold for genome-wide significance is now p < 5x10^-8
- should lower the rate of false positives to almost zero
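The 5x10^-8 threshold is usually explained as a Bonferroni correction, assuming roughly one million independent common-variant tests across the genome at a family-wise error rate of 0.05. That one-million figure is an assumption stated here for illustration, not something given in the notes.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise
    error rate at alpha across n_tests independent tests."""
    return alpha / n_tests


# Assumed: about 1,000,000 independent tests genome-wide.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(threshold)
print(math.isclose(threshold, 5e-8))
```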
Manhattan plots
see onenote
- provide an easy way to visualise the results
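A Manhattan plot puts genomic position on the x-axis and -log10(p) on the y-axis, so smaller p-values appear as taller points. The sketch below only computes the y-values and picks out SNPs above the genome-wide line; the SNP names, positions and p-values are invented.

```python
import math

GENOME_WIDE_P = 5e-8


def manhattan_y(p_value):
    """Height of a SNP on a Manhattan plot: -log10 of its p-value."""
    return -math.log10(p_value)


def genome_wide_hits(results):
    """results: list of (snp_id, chromosome, position, p_value).
    Returns the ids of SNPs above the genome-wide significance line."""
    line = manhattan_y(GENOME_WIDE_P)
    return [snp for snp, _chrom, _pos, p in results if manhattan_y(p) > line]


# Hypothetical association results.
results = [
    ("snp_a", 1, 1000, 1e-9),
    ("snp_b", 2, 2000, 0.03),
]
print(genome_wide_hits(results))
```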
Early GWAS was promising
- myocardial infarction
- age-related macular degeneration
- identified a small number of loci with large effects on disease risk
- these loci could explain a substantial fraction of the h^2 estimates
Was the human genome project worth it?
see onenote
Wellcome Trust Case Control Consortium Phase 1
see onenote slides
14000 cases across seven diseases, plus 3000 shared healthy controls
The GIANT consortium
see onenote
Defining the role of common variation in genomic and biological architecture of adult human height
- subsequent GWAS failed to replicate early successes in the field, even as sample sizes increased
GWAS catalog
- publicly available curated resource of all published GWAS and association results
The heritability of most studied traits remains poorly explained by GWAS hits
see onenote
some possible explanations:
- rare variants of large effect
- low penetrance
- epistasis
- etc.
GWAS often gives little insight into the biological mechanism underlying that association
A matter of power
see onenote
- early GWAS were underpowered to detect most associations
strength of association between trait and genotype depends on:
- effect size of variant
- penetrance of variant
- frequency of variant
- quality of case/control separation
all of these interact in complex ways
- a rare variant of large effect will not be detected in a GWAS, unless the sample size is very large
- nor will a common variant of large effect but low penetrance
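One way to see how frequency and effect size interact: under an additive model with Hardy-Weinberg proportions, a biallelic SNP with allele frequency p and per-allele effect beta explains a phenotypic variance of 2p(1 - p)beta^2. The sketch below assumes that model, with beta in phenotype standard deviation units; the numbers are illustrative only.

```python
def variance_explained(p, beta):
    """Phenotypic variance explained by one biallelic SNP,
    assuming an additive model and Hardy-Weinberg proportions."""
    return 2 * p * (1 - p) * beta ** 2


# Same per-allele effect, very different allele frequencies.
common = variance_explained(p=0.5, beta=0.2)
rare = variance_explained(p=0.01, beta=0.2)
print(common, rare)
```

With the same effect size, the rare variant explains only a small fraction of the variance that the common one does, so a much larger sample is needed to detect it.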
A closer look at height
see onenote
- many of the SNPs and loci were in genes that had been previously implicated in skeletal disorders