Association Analysis Flashcards
What is genetic association?
The presence of a variant allele at a higher frequency in unrelated subjects with a particular disease (cases) than in those who do not have the disease (controls)
What is a haplotype?
The set of alleles, in order, along a single chromosome
What is an allele?
One form of a variant in the genome
What is a locus?
A position in the genome
What is a genotype?
Both alleles at a locus
What is a case control study?
→ Case group who all have the disease
→ Controls matched to the cases for non-disease traits such as age, location and ethnicity
What are the requirements for a good case control study?
→ Large numbers of well defined cases
→ Equal numbers of matched controls
→ Reliable genotyping technology
→ Standard statistical analysis
What extra step do you need to do when you find a positive association?
→ Replicate
→ To show that the association did not arise by chance
What are characteristics of an ideal genetic marker?
→ Polymorphic
→ Randomly distributed across the genome
→ Fixed location in the genome
→ Frequent in genome
→ Frequent in population
→ Stable with time
→ Easy to assay (genotype)
How often are SNPs found in the genome?
About 1 in every 300 nucleotides
How many SNPs have been identified in the genome?
12 million
How are SNPs formed?
A wrong base is incorporated when DNA is copied, and the repair mechanism then inserts a nucleotide that matches the wrong base, so the resulting base pair differs from the original pair
What is the effect of SNPs found in the coding region?
→ no amino acid change (synonymous)
→ amino acid change (non-synonymous)
→ new stop codon (nonsense)
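The three outcomes above can be sketched in Python by translating the codon before and after the SNP with the standard genetic code. The function name `classify_coding_snp` and the example codons are illustrative assumptions, not part of the cards.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a coding SNP by comparing the amino acids of the two codons.

    Hypothetical helper for illustration only.
    """
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"      # no amino acid change
    if alt_aa == "*":
        return "new stop codon"  # protein is truncated
    return "non-synonymous"      # amino acid change


print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```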
Where can SNPs be found in the non coding region and what is the effect?
→ Promoter - mRNA and protein levels changed
→ Terminator - mRNA and protein levels changed
→ Splice site - altered mRNA, altered protein
What do the major and minor allele frequencies add up to?
1
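A minimal sketch of why the two frequencies sum to 1, assuming a SNP with only two alleles (A and a). The genotype counts are made up for illustration and `allele_frequencies` is a hypothetical helper.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return the frequencies of alleles A and a from genotype counts.

    Each person carries two alleles, so the denominator is twice the
    number of people.
    """
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    freq_A = (2 * n_AA + n_Aa) / total_alleles
    freq_a = (2 * n_aa + n_Aa) / total_alleles
    return freq_A, freq_a


# Made-up genotype counts for 1000 people.
freq_A, freq_a = allele_frequencies(n_AA=360, n_Aa=480, n_aa=160)
major, minor = max(freq_A, freq_a), min(freq_A, freq_a)
print("major allele frequency:", major)
print("minor allele frequency:", minor)
print("sum:", major + minor)
```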
What is a GWAS?
→ Genome wide association study
→ Association between disease and alleles of each marker - chi squared test
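The chi-squared test for a single marker can be sketched as a 2 x 2 table of allele counts (minor/major allele in cases/controls). This is a simplified illustration with made-up counts: it uses 1 degree of freedom, no continuity correction, and a hypothetical function name. A real GWAS repeats a test like this for every marker.

```python
import math


def allelic_chi_squared(case_minor, case_major, control_minor, control_major):
    """Pearson chi-squared test on a 2 x 2 table of allele counts.

    Returns (chi2, p_value). For 1 degree of freedom the p-value is
    erfc(sqrt(chi2 / 2)), so only the standard library is needed.
    """
    observed = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele and disease status were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up allele counts: the minor allele is more common in cases.
chi2, p = allelic_chi_squared(300, 700, 200, 800)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```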
How is GWAS data represented?
A single graph called the Manhattan plot
What are the axes on a Manhattan plot?
→ X axis is the position of the SNP on the chromosome
→ Y axis is the -log10(p-value) of the association, calculated with a chi-squared test
What does a peak on the Manhattan plot signify?
→ The peak does not identify the gene causing the disease
→ It identifies a genomic region associated with the disease, which is smaller than 100 kb
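Why is the scale -log10 on the Manhattan plot?
→ So many SNPs are tested that some will look associated by chance, so only very small p-values are meaningful
→ -log10 turns these very small p-values into large positive numbers on a linear scale, so the strongest associations appear as the tallest peaks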
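A short sketch of the -log10 transform used for the Y axis. The p-values are arbitrary examples and `manhattan_y` is a hypothetical name. Each tenfold drop in the p-value adds 1 to the plotted height, so p = 5 x 10^-8 sits at about 7.3.

```python
import math


def manhattan_y(p_value):
    """Height of a SNP on a Manhattan plot: -log10 of its p-value."""
    return -math.log10(p_value)


# Smaller p-values (stronger associations) give taller points.
for p in (0.05, 1e-3, 5e-8, 1e-20):
    print(f"p = {p:g} -> -log10(p) = {manhattan_y(p):.2f}")
```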
What is meta analysis?
→ Allows the statistical combination of results from multiple studies
→ Increases statistical power
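One common way to combine studies is an inverse-variance fixed-effect meta-analysis. The cards do not say which method is meant, so this is an assumption, and the effect sizes and standard errors below are made up. The sketch shows the power gain: the pooled standard error is smaller than that of any single study.

```python
import math


def fixed_effect_meta(effects, std_errors):
    """Inverse-variance fixed-effect meta-analysis (illustrative sketch).

    effects: per-study effect estimates (for example, log odds ratios)
    std_errors: their standard errors
    Returns (pooled_effect, pooled_se, two_sided_p).
    """
    # More precise studies (smaller standard error) get more weight.
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return pooled, pooled_se, p_value


# Three made-up studies with the same effect estimate and precision.
pooled, pooled_se, p = fixed_effect_meta([0.2, 0.2, 0.2], [0.1, 0.1, 0.1])
print(f"pooled effect = {pooled:.3f}, SE = {pooled_se:.3f}, p = {p:.2e}")
```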
What % of body shape is genetically determined according to twin studies?
70-80%
What is the gene associated with obesity?
FTO
What is the accepted p-value threshold for genome-wide significance?
p < 5 × 10^-8
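The threshold is usually explained as a Bonferroni correction: the conventional 0.05 significance level divided by roughly one million independent tests across the genome. That rationale is not stated in the cards, so treat the numbers below as the usual assumption rather than a derivation from them.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Assumed values: alpha = 0.05 and about one million independent tests.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(f"genome-wide significance threshold: {threshold:.0e}")
```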