Association Analysis Flashcards
What is genetic association?
The presence of a variant allele at a higher frequency in unrelated subjects with a particular disease (cases), compared to those that do not have the disease (controls).
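As a rough illustration of the definition above, the frequency of an allele can be counted separately in cases and controls and then compared. This is only a sketch: the genotypes are made-up toy data and `allele_frequency` is a hypothetical helper name, not part of any standard tool.

```python
from collections import Counter

def allele_frequency(genotypes, allele):
    """Return the fraction of all allele copies in `genotypes` that are `allele`.

    Each genotype is a pair of alleles, e.g. ("T", "C").
    """
    counts = Counter(a for genotype in genotypes for a in genotype)
    total = sum(counts.values())
    return counts[allele] / total

# Hypothetical toy data: one biallelic T/C variant typed in 4 cases and 4 controls.
cases = [("T", "T"), ("T", "C"), ("T", "C"), ("C", "C")]
controls = [("C", "C"), ("C", "C"), ("T", "C"), ("C", "C")]

case_freq = allele_frequency(cases, "T")        # 4 of 8 allele copies
control_freq = allele_frequency(controls, "T")  # 1 of 8 allele copies
print(f"T allele: cases {case_freq:.3f}, controls {control_freq:.3f}")
```

A higher frequency in cases is only suggestive. Whether the difference is larger than chance would produce needs a statistical test.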
What is an allele?
One form of a variant in the genome
What is a locus?
A position in the genome
What is a genotype?
The pair of alleles an individual carries at a locus, e.g. locus 1: 1,4 and locus 2: 1,1
What is a haplotype?
The ordered set of alleles along a single chromosome
Why are case-control studies used?
- Cases are subjects with the disease of interest e.g. obesity, schizophrenia, hypertension.
- Definition of the disease must be applied in a rigorous and consistent way
- Controls must be as well-matched as possible for non-disease traits such as age, sex, ethnicity, location etc.
What is case-control association?
A gene variant is associated with disease when it is more frequent in cases than in controls
Describe how the case control study works
There are two groups:
- Affected cases
- Unaffected controls
Then measure the genetic loci of interest
Statistical analysis to determine which genetic loci correlate with disease
Identify genomic region associated with disease
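The statistical analysis step can be sketched, for a single locus, as an allelic chi-square test on a 2x2 table of allele counts (cases vs controls, allele A vs allele B). This is one common choice, shown here under stated assumptions: the counts are invented, `allelic_chi_square` is a hypothetical helper, and the test assumes no row or column total is zero. It is not presented as what any particular package does internally.

```python
import math

def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele count table.

    Returns (chi-square statistic, p-value, odds ratio for allele A).
    Assumes every row and column total is non-zero and ctrl_a, case_b > 0.
    """
    observed = [[case_a, case_b], [ctrl_a, ctrl_b]]
    row_totals = [case_a + case_b, ctrl_a + ctrl_b]
    col_totals = [case_a + ctrl_a, case_b + ctrl_b]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi2, p_value, odds_ratio

# Hypothetical allele counts: 50 cases and 50 controls, two alleles each.
chi2, p_value, odds_ratio = allelic_chi_square(case_a=60, case_b=40, ctrl_a=40, ctrl_b=60)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, OR = {odds_ratio:.2f}")
```

In a genome-wide study the same test is repeated at every genotyped locus, so the significance threshold has to be corrected for the number of tests.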
What is needed in a case-control genetic study?
- Large number of well-defined cases
- Equal numbers of matched controls
- Reliable genotyping technology (SNP array)
- Standard statistical analysis (PLINK)
- Positive associations should be replicated
What is the ideal genetic marker?
- Polymorphic
- Randomly distributed across the genome
- Fixed location in genome
- Frequent in genome
- Frequent in population
- Stable with time
- Easy to assay (genotype)
What is a SNP?
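- A single nucleotide polymorphism: a position in the genome where a single base differs between individuals
- Generated by mismatch repair during mitosis
- Common in the genome: about 1 in every 300 nucleotides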
- About 12 million common SNPs identified in human genome
How do SNPs arise?
- The two DNA strands separate and are replicated before mitosis.
- One DNA strand replicates
- The other DNA strand replicates but there is a mismatch.
- Usually it would be repaired by the mismatch repair system.
- Occasionally, instead of replacing the mismatched base, the mismatch repair system replaces the base on the original strand.
- This becomes the SNP, e.g. a T/C SNP.
Where are SNPs located?
In the gene coding region:
- No amino acid change (synonymous)
- Amino acid change (non-synonymous)
- New stop codon (nonsense)
In the gene non-coding region:
- Promoter - mRNA and protein levels changed
- Terminator - mRNA and protein levels changed
- Splice site - altered mRNA, altered protein
In the intergenic region
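For coding-region SNPs, the three categories above can be told apart by translating the codon before and after the change. The sketch below uses the standard genetic code. `classify_coding_snp` is a hypothetical helper, and it assumes the reference codon is not itself a stop codon and that the two codons are in the same reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-codon change as synonymous, non-synonymous or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"       # no amino acid change
    if alt_aa == "*":
        return "nonsense"         # new stop codon
    return "non-synonymous"       # amino acid change

print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```

SNPs in promoters, terminators, splice sites or intergenic regions cannot be classified this way, because they do not change a codon.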
What is dbSNP?
It is an online database at NCBI of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and non-polymorphic variants.
What is the minor allele?
It is the less common allele at a locus. Each allele has a frequency in the general population; the frequency of the minor allele is the minor allele frequency (MAF).
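A minimal sketch of how the MAF could be computed from genotypes at one biallelic locus. The sample genotypes are invented and `minor_allele_frequency` is a hypothetical helper name. If the two alleles are equally frequent, this version simply reports whichever allele it counted first.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Return (minor allele, MAF) for a biallelic locus.

    Each genotype is a pair of alleles, e.g. ("T", "C").
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) != 2:
        raise ValueError("expected exactly two alleles at a biallelic locus")
    total = sum(counts.values())
    minor, minor_count = min(counts.items(), key=lambda item: item[1])
    return minor, minor_count / total

# Hypothetical sample of 5 individuals: T appears 3 times, C appears 7 times.
sample = [("T", "T"), ("T", "C"), ("C", "C"), ("C", "C"), ("C", "C")]
minor, maf = minor_allele_frequency(sample)
print(f"minor allele {minor}, MAF = {maf:.2f}")
```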