Multifactorial Disease Genomics - week 8 Flashcards
what are single gene disorders called?
monogenic and Mendelian
features of single gene disorders?
rare, specific pattern of inheritance in family, disease caused by pathogenic variant
example of single gene disorder
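Cystic fibrosis: caused by pathogenic variants in a single gene (CFTR) and inherited in an autosomal recessive pattern. Single gene disorders in general can be recessive or dominant, and the variant has a largely predictable phenotypic effect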
The majority of morbidity and mortality in 21st Century Britain is the result of
genetic and environmental factors
Multifactorial diseases are caused by a combination of both
genes and non-genetic (environmental) factors that determine disease risk
o Examples are non-communicable diseases (type 2 diabetes, heart disease and cancers)
How do we know common traits have a genomic component? (twin studies + family studies)
GOLD STANDARD: TWIN STUDIES
Do both twins show the same characteristic or trait? Comparing MZ/DZ twins can give evidence for genetic and/or environmental influences
MZ twins share
all their genes and environment (genetically identical)
o Much more physically similar than DZ twins
DZ twins share
on average 50% of their genes, and their environment
By looking at various traits, e.g. body fat distribution, height and weight, and by comparing DZ and MZ twins you can see how
concordant the two are.
Concordance = For a trait like height monozygotic twins are
95% concordant compared to 52% for dizygotic twins, so height is a highly heritable trait, estimated to be about 80% heritable
Concordance = BMI is slightly
less heritable than height, but still heritable, as concordance is higher in monozygotic than in dizygotic twins
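One common way to turn twin data into a rough heritability figure is Falconer's formula, h² = 2 × (r_MZ − r_DZ). The sketch below only illustrates the arithmetic and is not necessarily how the 80% estimate above was derived. The formula is defined for twin correlations, so plugging in the concordance figures from the height card is an assumption, and the function name is hypothetical.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Rough heritability estimate from twin similarity: h^2 = 2 * (r_MZ - r_DZ)."""
    h2 = 2.0 * (r_mz - r_dz)
    # Keep the estimate inside the meaningful range 0..1.
    return max(0.0, min(1.0, h2))


# Height figures from the card above, treated as if they were twin correlations
# (an assumption made for illustration only).
height_h2 = falconer_heritability(r_mz=0.95, r_dz=0.52)
print(f"Approximate heritability of height: {height_h2:.2f}")
```

With these inputs the estimate is about 0.86, in the same range as the roughly 80% quoted for height.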
Parent offspring correlations
Another way of measuring heritability is looking at correlation between parents and offspring for a particular trait
How do you conduct parent offspring correlations?
Plot the average of the two parents’ phenotypes (mid-parent value) against the offspring’s phenotype
If there is no genetic effect on this trait then you’d expect no relationship between the parents’ and offspring’s phenotypes (a random scatter)
If there is a correlation then it gives you a sloped line, an indication that the trait is heritable
Look at the correlation using the “slope” of the line
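The slope mentioned above can be computed with an ordinary least-squares fit of offspring phenotype on the mid-parent value; in quantitative genetics this slope is commonly read as an estimate of heritability. A minimal sketch, using made-up heights (not real data) and a hypothetical helper name:

```python
def regression_slope(x: list[float], y: list[float]) -> float:
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance_x = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance_x


# Made-up heights in cm, for illustration only.
midparent_height = [160.0, 165.0, 170.0, 175.0, 180.0]
offspring_height = [164.0, 167.0, 171.0, 174.0, 178.0]

slope = regression_slope(midparent_height, offspring_height)
print(f"Slope of offspring on mid-parent value: {slope:.2f}")
```

A slope near 0 would correspond to the random scatter expected with no genetic effect; a slope near 1 would mean offspring closely track their parents.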
The Genetic basis of common polygenic disorders
(What is the difference between single gene disorders and common disease with partial genetic component)
How do we measure genetic basis of common polygenic disorders? background
These common polygenic diseases are likely to have many DNA changes, each predisposing to disease (type 2 diabetes, cardiovascular disease, schizophrenia, …) (unlike monogenic diseases – you can say that a particular variant is causative)
How do we measure genetic basis of common polygenic disorders?
You could use a case-control study: compare how often a particular variant occurs in cases and in controls. If it is present in 33% of cases and only 16% of controls, that is an indication that this particular variant is associated with the disease.
For complex disease traits a gene variant is present more
frequently in cases than controls
Variant status (genotype) provides
probability of disease status
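The 33% versus 16% example can be turned into an odds ratio and a chi-square test on a 2×2 table. The sketch below assumes 100 cases and 100 controls so that the percentages become counts; that sample size is hypothetical, as are the function names.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds of carrying the variant in cases divided by the odds in controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Assumed sample: 100 cases (33 carriers) and 100 controls (16 carriers).
cases = (33, 67)
controls = (16, 84)

variant_or = odds_ratio(*cases, *controls)
chi2 = chi_square_2x2(*cases, *controls)
p_value = p_value_1df(chi2)
print(f"Odds ratio = {variant_or:.2f}, chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

An odds ratio above 1 means the variant is more frequent in cases, which is what "associated with disease" means here; it raises the probability of disease but does not determine it.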
How do we measure genetic basis of common polygenic disorders? another way of measuring
Common polygenic quantitative traits have many DNA changes, each influencing the level of a trait (height, BMI, blood pressure, CRP, LDL, HDL, triglycerides, fasting glucose,…), so there is no case-control split.
For quantitative traits, compare the mean trait value (mean BMI) between the three genotype groups. If the 3 groups are different then that’s an indication also that there’s an association with a particular locus.
Before 2007
Candidate Gene studies: Before 2007 you could do association studies but gene by gene in a candidate gene method
o e.g. you suspect that gene Z on chromosome 12 is involved in height, so you test it for an association; if there is no association then you have to test the other 24,999 genes
2007: Genome-wide association studies
Genotype and test hundreds of thousands of SNPs across the genome (hypothesis-free approach: no prior knowledge of the biology of the trait is needed)
“Genome-wide significance” example BMI
BMI is the quantitative trait and there are 3 genotype groups; you can test whether mean BMI differs significantly across the 3 genotype groups using linear regression.
Association statistics derived from regression analyses
“Genome-wide significance” example BMI Linear regression looking at
deviation from a linear relationship
“Genome-wide significance” example BMI: vertical black line
average change in BMI per BMI-increasing allele (the slope of the regression)
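The BMI example can be sketched by coding each genotype as the number of BMI-increasing alleles (0, 1 or 2), comparing mean BMI across the three groups, and taking the regression slope as the per-allele effect. The data below are invented for illustration and the helper names are hypothetical.

```python
from statistics import mean

# Invented data: number of BMI-increasing alleles and BMI for nine people.
allele_count = [0, 0, 0, 1, 1, 1, 2, 2, 2]
bmi = [24.0, 25.0, 26.0, 25.0, 26.0, 27.0, 26.0, 27.0, 28.0]


def mean_by_genotype(genotypes, values):
    """Mean trait value within each genotype group."""
    groups = sorted(set(genotypes))
    return {g: mean([v for gi, v in zip(genotypes, values) if gi == g]) for g in groups}


def per_allele_effect(genotypes, values):
    """Least-squares slope of the trait on allele count."""
    mean_g = mean(genotypes)
    mean_v = mean(values)
    covariance = sum((g - mean_g) * (v - mean_v) for g, v in zip(genotypes, values))
    variance_g = sum((g - mean_g) ** 2 for g in genotypes)
    return covariance / variance_g


group_means = mean_by_genotype(allele_count, bmi)
effect = per_allele_effect(allele_count, bmi)
print(f"Mean BMI by genotype: {group_means}")
print(f"Average change in BMI per allele: {effect:.2f}")
```

A real analysis would also report a standard error and P-value for the slope and adjust for covariates such as age and sex; those steps are left out of this sketch.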
“Genome-wide significance” example BMI: a P-value is returned for each SNP tested:
probability of observing an association at least this strong by chance when no association actually exists
“Genome-wide significance” example BMI: common significance threshold?
Common to use P=0.05 as significance threshold to be classified as “significant”
5% probability (1 in 20) of seeing the observation by chance
“Genome-wide significance” example BMI: p = 0.05
There are around 1M independent common SNPs in genome after accounting for linkage disequilibrium
P=0.05 would mean we would expect 50,000 associations at that level of significance by chance!!!
Solution: correct for the number of tests: 0.05/1,000,000 = 5 × 10^-8 (known as Bonferroni correction)
(significance threshold divided by the number of tests performed, estimated at 1 million)
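The arithmetic behind the genome-wide significance threshold can be checked directly. A minimal sketch using the figures from the cards above; the variable and function names are hypothetical.

```python
alpha = 0.05
independent_tests = 1_000_000  # approximate number of independent common SNPs

# Associations expected by chance alone if P = 0.05 were used for every SNP.
expected_false_positives = alpha * independent_tests

# Bonferroni correction: divide the threshold by the number of tests.
genome_wide_threshold = alpha / independent_tests


def is_genome_wide_significant(p_value, threshold=genome_wide_threshold):
    """True if the P-value passes the Bonferroni-corrected threshold."""
    return p_value < threshold


print(f"Expected chance associations at P = 0.05: {expected_false_positives:,.0f}")
print(f"Genome-wide significance threshold: {genome_wide_threshold:.0e}")
```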
GWAS results Manhattan plots show
Manhattan Plot showing association of genetic variants across the genome with type 2 diabetes
Each point represents a
SNP.
The position of each SNP relates to:
1) its position in the genome (X-axis)
2) Its strength of association with T2D (Y-axis) when tested in a given number of T2D patients and controls
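On a Manhattan plot the strength of association is conventionally drawn as −log10(P), so smaller P-values sit higher up. The sketch below converts a few P-values to plot heights and flags those that pass 5 × 10^-8. The results are invented for illustration and the names are hypothetical.

```python
import math

# Invented (chromosome, position, P-value) results, for illustration only.
results = [
    (1, 1_200_000, 0.20),
    (3, 45_000_000, 3e-4),
    (10, 112_000_000, 1e-12),
]

GENOME_WIDE_THRESHOLD = 5e-8


def neg_log10(p_value):
    """Height of a SNP on the Y-axis of a Manhattan plot."""
    return -math.log10(p_value)


for chromosome, position, p_value in results:
    height = neg_log10(p_value)
    flag = "genome-wide significant" if p_value < GENOME_WIDE_THRESHOLD else "not significant"
    print(f"chr{chromosome}:{position}  -log10(P) = {height:.2f}  ({flag})")
```

On this scale the genome-wide significance threshold corresponds to a horizontal line at about 7.3.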
GWAS results show common variation in
at least 97 gene regions predispose to obesity
There is a correlation between
the size of the effect of a genetic variant and its frequency in the population (rarer variants tend to have larger effects)
Mendelian category
Many diseases fall in the Mendelian (monogenic) category: the causal variants are very rare but have high penetrance, so most of the time a variant causes the same disease in everyone who carries it
GWAS variants
GWAS variants fall in the category on the right: these are common variants (conventionally a minor allele frequency above about 1 to 5%) that generally have a much smaller effect, so multiple variants are needed to influence the trait
Middle category
We are now starting to understand more about the genes in the middle category: the variants are relatively uncommon (low frequency) but have intermediate penetrance, so a fairly big effect but nowhere near 100% penetrance
The other categories – less known
o Low penetrance and very rare = very hard to identify
o High penetrance and very common = few examples exist, as such variants would generally be selected against.
Limitations of GWAS
larger sample sizes required,
variants have small effect sizes,
causal pathway, gene and variant not always determined,
method based on SNP variation
ethnic groups studied to date have been limited
Pleiotropy is common
Limitations of GWAS: larger sample sizes required
no plateau effect seen yet: the larger the sample size the better. As you increase sample size you find more and more variants, so many traits have huge numbers of genetic variants that each contribute a small amount to the phenotype; the more variants you find, the smaller their effect sizes
Limitations of GWAS: variants have small effect sizes
Each individual variant has a very small effect size, so it might change height by only a few millimetres
Limitations of GWAS: causal pathway, gene and variant not always determined
The causal pathway, gene and variant are not always determined: functional work is required to identify the causal relationship
Limitations of GWAS: method based on SNP variation
The method is based on SNP variation, and we know that SNPs do not cover all variation in the genome, e.g. there are larger copy number variations and multi-allelic variants which won’t be covered well by SNPs
Limitations of GWAS ethnic groups studied to date have been limited
Ethnic groups studied to date have been limited – do results apply across all groups?
• GWAS data existing is very biased towards populations of European ancestry
• Extending GWAS to other ancestries is an active area of research; for now it is difficult to apply findings to other ethnic groups
Limitations of GWAS: Pleiotropy
- Pleiotropy is common – variants associated with multiple traits
- (sometimes this can help, but sometimes it makes it more difficult to determine what the effect of a particular variant is)
GWAS benefits:
new biology
determine causal relationships between traits and disease
predicting who will get disease
drug discovery
GWAS benefits: new biology
• Huge benefits, including understanding new biology. This is the case for all GWAS: because the approach is hypothesis free and doesn’t rely on candidate genes, it can uncover new areas of biology associated with your trait
• It allows you to dissect the biology of type 2 diabetes
o Type 2 diabetes: a number of genes near common variants were identified, and by looking at what the genes are and their expression in different tissues you can start to partition them into different groups
Transcription factors, appetite control, BMI, beta cell effect
So it helps to understand the fundamental etiology of type 2 diabetes
GWAS benefits: determine causal relationships between traits and disease
Second important area is being able to identify causal relationship between traits and disease
GWAS Problem: correlation does not equal causation
- just because 2 things are associated doesn’t mean one causes the other, e.g.
- people in Florida are more likely to have Alzheimer’s. Does that mean that living in Florida causes you to develop Alzheimer’s?
- Probably not – more likely that older people move to Florida and older people develop Alzheimer’s – this is a confounding factor
• Association between coffee drinking and pancreatic cancer: people who smoke are also more likely to drink more coffee, and smoking is a risk factor for pancreatic cancer. With observational studies like these it is often very difficult to unpick exactly what the causal relationship is.
Mendelian randomisation can stand in for a randomised controlled trial
- Example of using genetics to solve this problem = a technique called Mendelian randomisation, which can be used in place of an RCT (the gold standard) when a trial is not feasible
- There is an association between high BMI and likelihood of type 2 diabetes, but you want to understand if this is a causal relationship. Could it be confounded by lack of exercise, which we know is associated with both BMI and type 2 diabetes?
- To understand if BMI directly causes an increased risk of type 2 diabetes you could run an RCT in which people are randomly assigned to high or low BMI, keeping all other conditions exactly the same, and see who develops type 2 diabetes = very difficult to do!!
- Instead you can identify genetic variants associated with BMI (around 100 are known) which are not associated with exercise (the confounder), then check whether those variants are also associated with type 2 diabetes. If they are, it suggests that BMI is on the causal pathway to developing type 2 diabetes.
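One simple way to put a number on the Mendelian randomisation idea above is the ratio (Wald) estimate: for each variant, divide its effect on type 2 diabetes by its effect on BMI, then combine the variants with inverse-variance weighting. This is a commonly used approach and an assumption here, not something stated in these notes. The summary statistics below are invented and the function names are hypothetical.

```python
# Invented per-variant summary statistics, for illustration only:
# (effect on BMI, effect on type 2 diabetes as log odds, standard error of the diabetes effect)
variants = [
    (0.10, 0.080, 0.02),
    (0.05, 0.040, 0.02),
    (0.08, 0.064, 0.02),
]


def wald_ratio(beta_exposure, beta_outcome):
    """Causal effect implied by one variant: outcome effect per unit of exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(variant_stats):
    """Inverse-variance weighted combination of the per-variant ratio estimates."""
    numerator = sum(bx * by / se ** 2 for bx, by, se in variant_stats)
    denominator = sum(bx ** 2 / se ** 2 for bx, by, se in variant_stats)
    return numerator / denominator


ratios = [wald_ratio(bx, by) for bx, by, _ in variants]
causal_estimate = ivw_estimate(variants)
print(f"Per-variant ratio estimates: {[round(r, 2) for r in ratios]}")
print(f"Combined estimate of the effect of BMI on diabetes risk: {causal_estimate:.2f}")
```

The estimate is only interpretable as causal if the variants affect diabetes solely through BMI and are unrelated to confounders such as exercise, which is the condition described above.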
GWAS benefits: predicting who will get disease
- Heart disease has been one of the leading examples of this (also shown for type 1 diabetes)
- Can combine all the genetic variants, each having a small effect, into a polygenic risk score
Combining genetic variants into polygenic risk scores (PRS) looks particularly promising for cardiovascular disease
• Middle panel = polygenic risk scores for a series of cases with coronary artery disease compared to controls; the risk score is significantly higher for cases than for controls
• Panel on right = shows individuals with different genetic risk scores, as you go along the X axis they have an increasing genetic risk for cardiovascular disease and the Y axis shows prevalence of the disease (proportion of the individuals who develop it)
o so as the genetic risk increases, the likelihood of them developing the condition also increases.
• People are born with their genetic component for heart disease so we can determine what individual risk is for heart disease
• Panel on left = shows that if you’re in the top 5% of the population for genetic risk you have a fivefold increased risk of developing coronary artery disease
o This could be an opportunity for targeted intervention in people at high genetic risk, or for adding genetic risk to conventional risk factors
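A polygenic risk score as described above is, in its simplest form, a weighted sum: for each variant, the number of risk alleles a person carries (0, 1 or 2) multiplied by that variant's effect size from a GWAS. A minimal sketch; the SNP names, weights and genotypes are all invented for illustration.

```python
# Invented per-allele weights (e.g. log odds ratios from a GWAS), for illustration only.
weights = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": 0.20}


def polygenic_risk_score(genotype, snp_weights):
    """Sum of risk-allele counts (0, 1 or 2) multiplied by each variant's weight."""
    return sum(weight * genotype.get(snp, 0) for snp, weight in snp_weights.items())


# Risk-allele counts for two hypothetical people.
person_low = {"snp_a": 0, "snp_b": 1, "snp_c": 0}
person_high = {"snp_a": 2, "snp_b": 2, "snp_c": 1}

score_low = polygenic_risk_score(person_low, weights)
score_high = polygenic_risk_score(person_high, weights)
print(f"Low-risk person: {score_low:.2f}, high-risk person: {score_high:.2f}")
```

In practice scores are computed over many more variants and each person is ranked against the population distribution, which is how a group such as the top 5% is defined.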
GWAS benefits: drug discovery
- A number of retrospective studies have shown that, for well-known drugs used in certain disorders, the drug targets can be identified among the GWAS hits for those particular traits
- Drug companies can save a lot of money in drug development by using genetics to allow them to target particular regions.