Association Analysis Flashcards
Define the term “Genetic Association”
The presence of an allele at a higher frequency in unrelated subjects with a particular trait, compared to those that do not have the trait
How do we determine whether variants in the genome are associated with a disease?
Apply the definition of genetic association with “disease” in place of “trait”, and compare allele frequencies between two groups:
- With disease = cases
- Without disease = Controls
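As a minimal sketch of that comparison, the snippet below computes the frequency of one allele in cases and in controls from genotype counts at a single biallelic SNP. The function name and all counts are hypothetical, made up purely for illustration.

```python
def alt_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Frequency of the alternate allele, given genotype counts at one SNP.

    Each subject carries two alleles: heterozygotes contribute one copy of
    the alternate allele and alternate homozygotes contribute two.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    return (n_het + 2 * n_hom_alt) / total_alleles

# Hypothetical genotype counts (not real data)
cases_freq = alt_allele_frequency(n_hom_ref=200, n_het=500, n_hom_alt=300)
controls_freq = alt_allele_frequency(n_hom_ref=450, n_het=450, n_hom_alt=100)

print(f"cases: {cases_freq:.3f}, controls: {controls_freq:.3f}")
```

A higher frequency in cases than in controls only suggests an association; whether the difference is significant is a separate statistical question.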
Describe how a genetic association study is conducted
- Cases are subjects with the disease of interest, e.g. obesity
- Definition of the disease must be applied consistently
- Controls must be matched to the cases as closely as possible for non-disease traits, such as age, sex, ethnicity and location
What other factors need to be catered for?
Match for all other risk factors
- Recruit affected subjects (cases) and unaffected subjects (controls)
- Measure the genetic loci of interest
- Use statistical analysis to determine which genetic loci are associated with the disease
- Identify the genes or genomic regions involved
How would you make the study fair?
- Use a large number of well defined cases
- Use an equal number of matched controls
- Reliable genotyping technology (SNP microarray)
- Standard statistical analysis (PLINK)
- Positive associations should be replicated
How does using a genetic marker fit well in a genetic association study?
Individuals in a population are genetically far more diverse than individuals in a single family
How is this genetic diversity captured?
- This genetic diversity is captured through reliable genetic markers
- Genetic markers are alleles that we can genotype and assess whether they are associated with the disease
- Association means the marker lies close to (<100 kb from) a causal variant
What is the ideal Genetic Marker?
An ideal genetic marker is:
- Polymorphic
- Randomly distributed across the genome
- Fixed location in genome
- Frequent in genome
- Frequent in population
- Stable with time
- Easy to assay
How is a Single Nucleotide Polymorphism (SNP) used in a genetic association study?
- Common in the genome: roughly 1 in every 300 nucleotides
- 12 million common SNPs identified in human genome
- Arise when DNA replication errors escape mismatch repair during cell division
Where could SNPs be found?
In the gene (Coding region)
- No amino acid change (synonymous)
- Amino acid change (non-synonymous)
- New stop codon (nonsense)
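The three coding-region outcomes above can be sketched in code by translating the reference and altered codons with the standard genetic code and comparing the results. The function name and its arguments are hypothetical, and the sketch assumes the codon is given as DNA on the coding strand.

```python
from itertools import product

# Standard genetic code; codons enumerated in T, C, A, G order ("*" = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snp(ref_codon, position, alt_base):
    """Classify a single-base change within one codon.

    position is the 0-based index (0, 1 or 2) of the changed base.
    """
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "non-synonymous"

# GAA encodes glutamate (E)
print(classify_coding_snp("GAA", 2, "G"))  # GAG still encodes E
print(classify_coding_snp("GAA", 0, "A"))  # AAA encodes lysine (K)
print(classify_coding_snp("GAA", 0, "T"))  # TAA is a stop codon
```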
Where else could SNPs be found?
In the gene (Non coding region)
- Promoter: mRNA and protein level changed
- Terminator: mRNA and protein level changed
- Splice site: Altered mRNA, altered protein
SNPs can also be found in intergenic regions (around 98% of the genome is non-coding)
Describe what dbSNP is
- An online database of SNPs hosted at NCBI
- The rs number is a unique identifier given to each SNP
- Each entry describes a single polymorphic site lying between two unique flanking sequences
Describe what Minor Allele Frequency (MAF) is
SNPs usually have two forms (alleles): the major and the minor form.
- The less common allele is called the minor allele, and the MAF is its frequency in the population
- Major allele frequency + Minor allele frequency = 1
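As a small illustration, MAF can be computed by counting allele copies across genotyped individuals. The helper names and the genotype list below are hypothetical, and the sketch assumes each genotype is written as a two-letter string such as "CT".

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Map each allele to its frequency among all allele copies."""
    counts = Counter("".join(genotypes))
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def minor_allele_frequency(genotypes):
    """Frequency of the less common allele (0.0 if only one allele is seen)."""
    freqs = allele_frequencies(genotypes)
    return min(freqs.values()) if len(freqs) > 1 else 0.0

# Ten hypothetical individuals genotyped at one C/T SNP
genotypes = ["CC", "CT", "CC", "CC", "CT", "TT", "CC", "CC", "CC", "CC"]

freqs = allele_frequencies(genotypes)
maf = minor_allele_frequency(genotypes)
print(freqs, maf)
```

Here T appears on 4 of the 20 allele copies, so it is the minor allele, and the two frequencies sum to 1.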
Why are SNPs chosen for genetic association studies?
- SNPs are chosen on the basis of their MAF
- Common diseases are likely to be caused by common variants
- SNPs with MAF > 0.05 (5%) are usually used in association studies such as GWAS
- Exceptions are known monogenic disease SNPs
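The MAF cut-off above amounts to a simple filter. In this sketch the SNP identifiers are made-up placeholders rather than real dbSNP rs numbers, and the MAF values are invented for illustration.

```python
MAF_THRESHOLD = 0.05  # the 5% cut-off from the card above

def filter_common_snps(maf_by_snp, threshold=MAF_THRESHOLD):
    """Keep only SNPs whose MAF is strictly greater than the threshold."""
    return {snp: maf for snp, maf in maf_by_snp.items() if maf > threshold}

# Hypothetical SNP identifiers and MAF values (not real data)
snps = {
    "snp_example_1": 0.32,
    "snp_example_2": 0.01,
    "snp_example_3": 0.05,
    "snp_example_4": 0.12,
}

common = filter_common_snps(snps)
print(sorted(common))
```

A real study would also keep known monogenic disease SNPs regardless of MAF, which this sketch does not model.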
How is a Genome Wide Association Study carried out?
- Recruit large numbers of cases and controls
- Genotype markers across the whole genome
- Look for association between the disease and the alleles of each marker (chi-squared test)
- A positive association is declared at p < 5 × 10⁻⁸ (the threshold after multiple testing correction)
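The per-marker test can be sketched as a chi-squared test on a 2×2 table of allele counts (cases vs controls, minor vs major allele). This is an illustrative standard-library version, not how PLINK implements it. The function name and the counts are hypothetical, and every row and column total is assumed to be non-zero. The 5 × 10⁻⁸ threshold is often explained as a Bonferroni correction of 0.05 for roughly one million independent tests, which is stated here as an assumption.

```python
import math

# Assumed rationale: 0.05 / 1,000,000 independent tests
GENOME_WIDE_THRESHOLD = 5e-8

def allelic_chi_squared(case_minor, case_major, control_minor, control_major):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 allele table."""
    observed = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical allele counts for one marker (not real data)
chi2, p = allelic_chi_squared(1100, 900, 650, 1350)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}, "
      f"genome-wide significant: {p < GENOME_WIDE_THRESHOLD}")
```

In a GWAS this test is repeated for every genotyped marker, which is why the significance threshold is so much stricter than the usual 0.05.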