Association Analysis Flashcards
What is genetic association?
Genetic association is the presence of a variant allele at a higher frequency in unrelated subjects with a particular disease (cases), compared to those that do not have the disease (controls).
More broadly, ‘disease’ can be replaced by ‘trait’, because not every trait studied is a disease (height, for example).
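As a rough illustration of this definition, the sketch below compares the frequency of one allele in cases and in controls. The genotype lists and the function name `allele_frequency` are invented for the example and are not real data.

```python
def allele_frequency(genotypes, allele):
    """Return the fraction of all allele copies that equal `allele`.

    Each genotype is a two-character string such as "AG" (one character
    per chromosome copy).
    """
    copies = "".join(genotypes)
    return copies.count(allele) / len(copies)


# Hypothetical genotypes at one SNP (illustrative only).
cases = ["AG", "GG", "AG", "GG", "AA"]
controls = ["AA", "AG", "AA", "AA", "AG"]

case_freq = allele_frequency(cases, "G")
control_freq = allele_frequency(controls, "G")

# A higher frequency in cases than in controls hints at an association,
# but a statistical test on a much larger sample is needed to support it.
print(f"G in cases: {case_freq:.2f}, G in controls: {control_freq:.2f}")
```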
What are case control studies?
Cases are subjects with the disease of interest (e.g. obesity, schizophrenia, hypertension).
The definition of the disease must be applied in a rigorous and consistent way.
The controls must be matched to the cases as closely as possible on non-disease traits such as sex, age, ethnicity and location. Ideally, they would be identical to the cases except that they do not have the disease. This is, of course, not easy, because there are many confounding factors.
How would you use case control studies for genetic association?
- You would match the affected cases and unaffected controls for all the other risk factors.
- You would measure the genetic loci of interest.
- You would perform a statistical analysis to determine which genetic loci correlate with the disease.
- You would identify the genomic region associated with the disease.
What are some features of the best case control studies?
- large numbers of well-defined cases (1000s)
- equal numbers of matched controls
- reliable genotyping technology (SNPs are used)
- standard statistical analysis (PLINK is used)
- positive associations should be replicated
Why do we need so many genetic markers?
Individuals in a population are genetically far more diverse than individuals in a single family.
To capture this genetic diversity, we need to use 100,000s or millions of genetic markers.
What would be some features of the ideal genetic marker?
- polymorphic
- randomly distributed across the genome
- fixed location in genome
- frequent in genome
- frequent in population
- stable with time
- easy to assay (genotype)
What is a single nucleotide polymorphism (SNP)?
A SNP is a single position in the genome at which different individuals carry different nucleotides. SNPs are common in the genome (~1/300 nucleotides). Around 12 million common SNPs have been identified in the human genome.
They arise when base mismatches introduced during DNA replication are not corrected by mismatch repair before the cell divides.
Where can you find SNPs?
SNPs may be in a:
GENE (CODING REGION):
- no amino acid change (synonymous)
- amino acid change (non-synonymous)
- new stop codon (nonsense)
GENE (NON-CODING REGION):
- promoter: mRNA and protein level changed
- terminator: mRNA and protein level changed
- splice site: altered mRNA and altered protein level
INTERGENIC REGION (98% of SNPs)
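The three coding-region outcomes listed above can be sketched as a small lookup against the standard genetic code. This is only an illustration: the function name `classify_coding_snp` is hypothetical, and it assumes you already know the reference and variant codons on the coding strand.

```python
from itertools import product

# Standard genetic code, written in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-codon change using the categories on this card."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"      # no amino acid change
    if alt_aa == "*":
        return "nonsense"        # new stop codon
    return "non-synonymous"      # amino acid change (also covers loss of a stop)


print(classify_coding_snp("GAA", "GAG"))
print(classify_coding_snp("GAG", "GTG"))
print(classify_coding_snp("TAC", "TAA"))
```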
What is the purpose of minor allele frequency (MAF)?
To check whether a SNP is frequent in a population, you use the minor allele frequency (MAF): the frequency of the less common (‘minor’) allele.
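A minimal sketch of the calculation, assuming a biallelic SNP and genotypes written as two-character strings. The sample genotypes and the function name `minor_allele_frequency` are made up for illustration.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the frequency of the less common allele at a biallelic SNP.

    Returns 0.0 when only one allele is observed (a monomorphic site).
    """
    counts = Counter("".join(genotypes))
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / sum(counts.values())


# Hypothetical genotypes for six individuals (12 allele copies in total).
sample = ["AA", "AG", "AA", "GG", "AA", "AG"]
maf = minor_allele_frequency(sample)
print(f"MAF = {maf:.3f}")
```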
How would we use genome-wide association studies (GWAS)?
We would use markers across the whole genome (for example, on SNP microarrays). We would then test each marker for association with the disease, using a chi-squared test for statistical significance.
This has resulted in the detection of large numbers of disease-associated loci.
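The per-marker chi-squared test can be sketched as follows for a 2x2 table of allele counts (cases vs controls, allele 1 vs allele 2). This is a plain Pearson test with one degree of freedom and no continuity correction, written with the standard library only. It is not the PLINK implementation, and the counts are invented.

```python
import math


def allelic_chi_squared(case_counts, control_counts):
    """Pearson chi-squared test on a 2x2 table of allele counts.

    case_counts and control_counts are (allele_1, allele_2) tuples.
    Assumes every row and column total is non-zero.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability of a
    # chi-squared value x is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts at one SNP.
chi2, p_value = allelic_chi_squared((60, 40), (40, 60))
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

In a genome-wide scan the same test would be repeated for every marker, which is why very small p-values are required before a marker is treated as associated.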
How are GWAS results presented?
GWAS data are presented in a single graph called a Manhattan plot.
The x-axis is the genomic position of each SNP (ordered from chromosome 1 to chromosome 22). The y-axis is the -log10(p-value) of the association, so the smaller the p-value, the higher the point. This makes the results easier to compare.
For most of the graph the colours look solid, because most SNPs are not associated with the disease and their points overlap near the bottom of the plot.
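To make the two axes concrete, here is a small sketch that turns per-SNP results into Manhattan plot coordinates. The chromosome lengths, SNP positions and p-values are toy values, and `manhattan_points` is a hypothetical helper name. Drawing the points is left to whatever plotting library you prefer.

```python
import math


def manhattan_points(results, chrom_lengths):
    """Convert (chromosome, position, p_value) records to (x, y) points.

    x is the position laid end to end across chromosomes in numeric order;
    y is -log10(p_value).
    """
    offsets = {}
    running = 0
    for chrom in sorted(chrom_lengths):
        offsets[chrom] = running
        running += chrom_lengths[chrom]
    return [
        (offsets[chrom] + pos, -math.log10(p))
        for chrom, pos, p in results
    ]


# Toy example: two short "chromosomes" and two SNPs.
toy_lengths = {1: 1000, 2: 800}
toy_results = [(1, 100, 0.01), (2, 50, 1e-8)]
points = manhattan_points(toy_results, toy_lengths)
print(points)
```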
What do the results of the GWAS tell us?
The strong peaks show lots of associated SNPs, all in one genomic locus.
The peak does not identify the gene as causing the disease. The peak only identifies the genomic region associated with the disease.
What is a major benefit of meta-analyses?
Very large studies (>10,000 cases) are difficult for a single group to carry out.
It is easier to combine the results of smaller studies:
- pre-experiment: consortium
- post-experiment: meta-analysis
A meta-analysis allows the statistical combination of results from multiple studies.
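One common way to combine results is a fixed-effect, inverse-variance weighted average of the per-study effect sizes. The card does not say which method is meant, so treat this as one possible sketch. It assumes each study reports an effect estimate (for example a log odds ratio) and its standard error, and the numbers are invented.

```python
import math


def fixed_effect_meta(effects, std_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns (pooled_effect, pooled_standard_error, two_sided_p_value).
    """
    weights = [1 / se ** 2 for se in std_errors]
    weight_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / weight_sum
    pooled_se = math.sqrt(1 / weight_sum)
    z = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, p_value


# Hypothetical log odds ratios and standard errors from three studies.
pooled, pooled_se, p_value = fixed_effect_meta(
    [0.20, 0.15, 0.25], [0.10, 0.08, 0.12]
)
print(f"pooled effect = {pooled:.3f} (SE {pooled_se:.3f}), p = {p_value:.2e}")
```

More precise studies (smaller standard errors) get larger weights, and the pooled standard error is smaller than that of any single study, which is where the gain in power comes from.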
List some secondary characteristics of obesity.
- pulmonary disease
- non-alcoholic fatty liver disease
- gall bladder disease
- gynaecological abnormalities
- osteoarthritis
- skin problems
- gout
- idiopathic intracranial hypertension
- cataracts
- coronary heart disease
- severe pancreatitis
- cancer
- phlebitis