Genetics of common disease Flashcards
What is the variant frequency in cases vs controls?
Disease-associated genetic variants, and variants tightly linked to them in the same region of the chromosome, are present at a higher frequency in cases than in controls.
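A minimal sketch of the comparison: compute the frequency of one allele from genotype counts in cases and in controls. The function name and the genotype counts are hypothetical and chosen only for illustration.

```python
def allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of the alternate allele, given counts of the three genotypes."""
    n_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)  # each person carries two alleles
    if n_alleles == 0:
        raise ValueError("no genotyped individuals")
    # heterozygotes carry one copy, alternate homozygotes carry two
    return (n_het + 2 * n_alt_hom) / n_alleles


# Hypothetical counts of AA, AG and GG genotypes at one SNP (G is the allele of interest)
cases_freq = allele_frequency(360, 480, 160)
controls_freq = allele_frequency(490, 420, 90)
print(f"G allele frequency: cases {cases_freq:.2f}, controls {controls_freq:.2f}")
```

With these made-up counts the G allele is more frequent in cases than in controls, which is the pattern an associated variant would show.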
Describe what a Mendelian disease is?
monogenic
clear inheritance pattern
minimal environmental influence
this model does not apply to common diseases or most phenotypic traits (e.g. height, blood pressure, heart rate)
What is a common disease?
multifactorial disease:
- multiple genes affect the disease/trait, with the effect of each individual variant being very small
- strong influence from environment
E.g.:
- type II diabetes
- hypertension
- Alzheimer disease
What is Heritability?
a measure of how well differences in people's genes account for differences in their traits
Heritability close to 1 indicates…
almost all of the variability in a trait comes from genetic differences, with very little contribution from environmental factors
e.g. cystic fibrosis (heritability is close to 1)
How can we calculate heritability?
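Using twin studies:
>Monozygotic twins share 100% of their DNA
>Dizygotic twins share on average 50% of their DNA
>Both types of twin pair are assumed to share their environment to the same extent, therefore any extra similarity in monozygotic twins is attributed to genetics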
How are Twin Studies interpreted?
When looking at a trait, e.g. height, measure it in both monozygotic and dizygotic twin pairs and compare the concordance (how closely the two twins in a pair match) in each group. The higher the concordance, the more similar the twins are. The more the trait is determined by genetics, the greater the difference in concordance between the two groups, because monozygotic twins share 100% of their DNA whereas dizygotic twins share on average 50%.
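One common way to turn this comparison into a number is Falconer's formula, h2 = 2 x (r_MZ - r_DZ), where r_MZ and r_DZ are the trait correlations within monozygotic and dizygotic pairs. The flashcards do not name a formula, so treat this as an illustrative sketch; the function name is hypothetical and clamping the result to the range 0 to 1 is an added assumption.

```python
def falconer_heritability(r_mz, r_dz):
    """Estimate heritability from twin correlations using Falconer's formula."""
    h2 = 2.0 * (r_mz - r_dz)
    # Assumption: clamp to [0, 1], since sampling noise can push the raw estimate outside
    return min(max(h2, 0.0), 1.0)


# Hypothetical correlations for a trait such as height
print(falconer_heritability(0.8, 0.5))
```

If monozygotic and dizygotic pairs are equally similar (r_MZ equal to r_DZ), the estimate is 0, suggesting shared environment rather than genes explains the resemblance.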
Once we’ve done our heritability study, we then need to identify which genes contribute to that trait.
How can we find out which genes contribute to a trait/disease?
through genetic association
-GWAS (genome wide association study)
What is a GWAS (genome-wide association study)?
a method for identifying gene variants (SNPs) involved in complex diseases by using genetic markers scored for hundreds or thousands of individuals who have the disease (cases) and who do not have the disease (controls)
a typical GWAS genotypes common variants across the genome, using genome-wide SNP arrays, in a number of individuals both with and without a common trait/disease
Describe the process of SNP Microarray.
1) DNA Sample prepared and fragmented
2) DNA tagged/labelled with fluorescent probe
3) Mix DNA with the slide, which contains oligonucleotides which match the region of the genome around each variant being tested
4) If DNA sample contains a variant, then it binds to specific matching oligonucleotide and fluoresces
5) Signal produced which can be detected
Describe the process of encoding SNP chip data for analysis.
After we have our data from the SNP chip, software works out the genotype of each individual at each SNP and then converts the genotypes to a binary code that computer programs can deal with.
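As a rough illustration of the encoding step, the sketch below converts genotype calls into the number of copies of one allele (0, 1 or 2), a numeric form that analysis software can then store compactly. The additive 0/1/2 scheme, the function name and the use of "00" for a failed call are assumptions; the flashcards only say that genotypes are converted to a code.

```python
VALID_ALLELES = set("ACGT")


def encode_genotype(genotype, alt_allele):
    """Return the number of copies of alt_allele (0, 1 or 2), or None if the call is missing."""
    if len(genotype) != 2 or not set(genotype) <= VALID_ALLELES:
        return None  # anything that is not two valid bases is treated as a missing call
    return sum(1 for allele in genotype if allele == alt_allele)


# Hypothetical calls for five individuals at one SNP; "00" stands for a failed call
calls = ["AA", "AG", "GG", "00", "GA"]
encoded = [encode_genotype(g, "G") for g in calls]
print(encoded)  # [0, 1, 2, None, 1]
```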
What are SNP chips?
Arrays that genotype a selected subset of SNPs spread across the genome. Rather than directly measuring genotypes at all genetic polymorphisms, we rely on the association between SNPs we do assay and SNPs we don’t assay
SNP-SNP association, or linkage disequilibrium (LD) is fundamental to our ability to sample the whole genome with relatively few SNPs
What is Linkage disequilibrium (LD)?
non-random association of alleles at two or more loci in a population
linkage disequilibrium between two SNPs decreases with physical distance, because recombination between them becomes more likely
If LD is strong (the chance of variants in that region being inherited together is high), fewer SNPs are needed to capture the variation in that region, so genotyping is cheaper and the analysis easier/quicker
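LD between two SNPs is often summarised as r squared, computed from D = p_AB - p_A x p_B, where p_A and p_B are the allele frequencies at each SNP and p_AB is the frequency of haplotypes carrying both alleles. The sketch below assumes phased haplotypes coded as 0/1 pairs; the function name and the example data are hypothetical.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes given as (allele1, allele2) pairs of 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at the first SNP
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at the second SNP
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # deviation from independent association
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


perfect_ld = [(0, 0), (0, 0), (1, 1), (1, 1)]  # alleles always travel together
no_ld = [(0, 0), (0, 1), (1, 0), (1, 1)]       # alleles combine at random
print(ld_r_squared(perfect_ld), ld_r_squared(no_ld))
```

An r squared near 1 means one SNP is a good proxy for the other, which is why a chip can assay only one of them.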
Where does most of the common variation occur?
most of the common variation occurs in the non-coding regions
often the causal variant is not included on the SNP chip, so further work is required to narrow down the region of association and identify the causal variant
What analysis is carried out to indicate how likely a variant is to be associated with a trait?
statistical analysis (the p-value indicates the significance of the association)
- lower p-value = more significant
the p-values for all of the SNPs on the chip are then plotted on a Manhattan plot (genomic position against -log10 of the p-value)
high peaks = highly significant association between a genomic region and the trait
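As one possible version of this analysis, the sketch below runs a chi-square test (1 degree of freedom) on a 2x2 table of allele counts in cases and controls, then converts the p-value to -log10(p), the height plotted on a Manhattan plot. The flashcards do not say which test is used, so the allelic chi-square test, the function names and the counts are all assumptions.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic and p-value (1 df) for a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def manhattan_height(p_value):
    """Height of a SNP on a Manhattan plot: smaller p-values give taller points."""
    return -math.log10(p_value)


# Hypothetical allele counts: 800 of 2000 case alleles vs 600 of 2000 control alleles
chi2, p = allelic_chi_square(800, 1200, 600, 1400)
print(f"chi2 = {chi2:.2f}, -log10(p) = {manhattan_height(p):.1f}")
```

Repeating this for every SNP on the chip gives one point per SNP; clusters of tall points mark regions worth following up.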