Association analysis Flashcards

1
Q

Genetic association

A

The presence of an allele at a higher frequency in unrelated subjects with a particular trait, compared to those that do not have the trait

Gene variant is associated with disease

2
Q

Case-control study

A
  • Cases are subjects with the disease of interest, e.g. obesity, schizophrenia, hypertension
  • Definition of the disease must be applied in a rigorous and consistent way
  • Controls must be matched to cases as closely as possible on non-disease traits such as age, sex, ethnicity and location
3
Q

What is needed for a case-control study?

A
  • Large numbers of well-defined cases (10 000s)
  • Equal numbers of matched controls
  • Reliable genotyping technology (e.g. SNP microarray)
  • Standard statistical analysis (e.g. PLINK)
  • Positive associations should be replicated in an independent sample
4
Q

Genetic markers

A
  • Individuals in a population are genetically far more diverse than individuals in a single family.
  • To capture this genetic diversity we need reliable genetic markers
  • Genetic markers are variants whose alleles we can genotype and test for association with disease
  • An associated marker need not be causal itself: association usually means the marker lies <100 kb from a causal variant
5
Q

Ideal genetic marker

A
  • Polymorphic
  • Randomly distributed across the genome
  • Fixed location in genome
  • Frequent in genome
  • Frequent in population
  • Stable with time
  • Easy to assay (genotype)
6
Q

Single Nucleotide Polymorphism (SNP)

A
  • Common in the genome ~1/300 nucleotides
  • ~12 million common SNPs identified in human genome
  • Arise from DNA replication errors that escape mismatch repair during cell division
7
Q

SNP location and effect

A
  • Gene (coding region)
      - No amino acid change (synonymous)
      - Amino acid change (non-synonymous)
      - New stop codon (nonsense)
  • Gene (non-coding region)
      - Promoter: mRNA and protein levels may change
      - Terminator: mRNA and protein levels may change
      - Splice site: altered mRNA, altered protein
  • Intergenic region (together with other non-coding sequence, ~98% of the genome)
8
Q

SNP MAF

A
  • SNPs are chosen for genetic association studies on the basis of their minor allele frequency (MAF)
  • Rationale: common diseases are likely to be caused by common variants
  • SNPs with MAF > 0.05 (5%) are usually used in association studies (GWAS)
  • Exceptions are SNPs already known to cause monogenic disease
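The MAF cut-off above can be illustrated with a minimal sketch. It assumes a biallelic SNP with genotype counts available for the three genotypes; the function names and the example counts are hypothetical, not taken from any particular tool.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency of a biallelic SNP.

    n_aa, n_ab, n_bb are the numbers of individuals with each genotype.
    Each individual carries two alleles, so homozygotes count twice.
    """
    count_a = 2 * n_aa + n_ab
    count_b = 2 * n_bb + n_ab
    total = count_a + count_b
    if total == 0:
        raise ValueError("no genotyped individuals")
    # The minor allele is whichever of the two alleles is less frequent.
    return min(count_a, count_b) / total


def passes_maf_filter(n_aa, n_ab, n_bb, threshold=0.05):
    """True if the SNP is common enough to keep (MAF above the threshold)."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > threshold


# Hypothetical genotype counts in 1000 individuals.
common_snp = (640, 320, 40)   # minor allele seen 400 times out of 2000
rare_snp = (980, 20, 0)       # minor allele seen 20 times out of 2000

print(passes_maf_filter(*common_snp))
print(passes_maf_filter(*rare_snp))
```

With these made-up counts the first SNP would be kept and the second dropped, since only the first has a MAF above 5%.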
9
Q

GWAS

A

Genome-Wide Association Study

10
Q

GWAS Process

A

• Recruit large numbers of cases and controls
• Genotype markers across the whole genome
• Look for association between disease and the alleles of each marker (chi-squared test)
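The chi-squared test in the last step can be sketched for a single marker as a 2x2 allele-count table (cases vs controls, allele A vs allele B). This is an illustrative version only: it uses the plain Pearson statistic with 1 degree of freedom and no continuity correction, the function name and the counts are hypothetical, and real analyses would normally use a dedicated package such as PLINK.

```python
import math


def allelic_chi_squared(case_a, case_b, control_a, control_b):
    """Pearson chi-squared test on a 2x2 table of allele counts.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    observed = [[case_a, case_b], [control_a, control_b]]
    row_totals = [case_a + case_b, control_a + control_b]
    col_totals = [case_a + control_a, case_b + control_b]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs a non-zero total")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele and disease status were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper-tail probability of the
    # chi-squared distribution is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: 1000 cases and 1000 controls (2000 alleles each).
chi2, p = allelic_chi_squared(case_a=600, case_b=1400,
                              control_a=500, control_b=1500)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

In a GWAS this test is repeated for every genotyped marker, which is why the resulting p-values are judged against a much stricter threshold than 0.05.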

11
Q

GWAS results

A
  • GWAS results are presented as a single graph called a Manhattan plot
  • All results are plotted, typically for >1M SNPs
  • X-axis is the genomic position of the SNP, ordered by chromosome
  • Y-axis is -log10(p-value) of the association, so stronger associations plot higher
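The two axes can be sketched as a small coordinate transform. This is a rough illustration, not a plotting routine: the chromosome lengths and association results are toy values, and the function name is hypothetical.

```python
import math


def manhattan_coordinates(results, chrom_lengths):
    """Convert association results into Manhattan plot coordinates.

    results:       list of (chromosome, position, p_value) tuples
    chrom_lengths: dict of chromosome -> length, in plotting order
    Returns a list of (x, y) points, with x the cumulative genomic
    position and y equal to -log10(p_value).
    """
    # Each chromosome starts where the previous one ended on the x-axis.
    offsets = {}
    running_total = 0
    for chrom, length in chrom_lengths.items():
        offsets[chrom] = running_total
        running_total += length

    points = []
    for chrom, position, p_value in results:
        if not 0.0 < p_value <= 1.0:
            raise ValueError(f"invalid p-value: {p_value}")
        points.append((offsets[chrom] + position, -math.log10(p_value)))
    return points


# Toy example: two tiny "chromosomes" and three SNPs.
toy_lengths = {"1": 1000, "2": 800}
toy_results = [("1", 250, 0.5), ("1", 900, 0.01), ("2", 100, 1e-8)]

for x, y in manhattan_coordinates(toy_results, toy_lengths):
    print(x, round(y, 2))
```

Because the y-axis is a negative logarithm, a p-value ten times smaller sits one unit higher, which is what makes strongly associated regions stand out as peaks.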
12
Q

WTCCC (Wellcome Trust Case Control Consortium)

A

• Manhattan plots of association of SNP markers with seven diseases
• Green peaks indicate significant p-values
• The peak of association often does not identify the gene causing the disease
• The peak identifies the genomic region associated with disease, which is usually smaller than 100 kb

13
Q

Meta-analysis

A

• Large studies (>1K cases/controls) are difficult to carry out
• It is easier to combine smaller studies
    - Pre-experiment: consortium
    - Post-experiment: meta-analysis
• Meta-analysis allows the statistical combination of results from multiple studies
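One common way to combine results is a fixed-effect, inverse-variance weighted meta-analysis, sketched below. The card does not say which method is meant, so this choice is an assumption; the function name and the per-study effect sizes and standard errors are hypothetical.

```python
import math


def fixed_effect_meta_analysis(studies):
    """Combine per-study effects by inverse-variance weighting.

    studies: list of (effect, standard_error) tuples for the same SNP,
             e.g. log odds ratios from separate case-control studies.
    Returns (pooled_effect, pooled_se, two_sided_p_value).
    """
    if not studies:
        raise ValueError("at least one study is required")

    # More precise studies (smaller standard error) get more weight.
    weights = [1.0 / (se ** 2) for _, se in studies]
    weight_sum = sum(weights)

    pooled_effect = sum(w * effect
                        for w, (effect, _) in zip(weights, studies)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)

    # Two-sided p-value from the normal approximation.
    z = pooled_effect / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_effect, pooled_se, p_value


# Hypothetical results for one SNP from two small studies.
effect, se, p = fixed_effect_meta_analysis([(0.10, 0.05), (0.20, 0.10)])
print(f"pooled effect = {effect:.3f}, SE = {se:.3f}, p = {p:.3g}")
```

The pooled standard error is smaller than that of either study alone, which is the practical gain from combining several smaller studies.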

14
Q

Problems with GWAS

A
  • GWAS has identified associations that are statistically strong and reproducible
  • However, their contribution to the genetic component of disease is estimated to be low (<5%)
  • Possible explanations for the missing genetic component:
      - Many common SNPs of very small effect
      - Rare SNPs
      - Copy number variation
      - Epigenetic variation
15
Q

Obesity is strongly heritable

A
  • Twin studies: 70-80% of body shape is genetically determined
  • Adoption studies: 30-40%
  • Family studies: 40-60%
16
Q

Large scale Meta-analysis for obesity

A
  • BMI meta-analysis in ~322k subjects
      - Locke et al. (2015) Nature 518:197–206
  • 97 BMI-associated loci
  • 125 separate studies
  • >600 authors and >2000 collaborators