Association Analysis Flashcards
describe genetic association
presence of an allele at a higher frequency in unrelated subjects with a particular trait, compared to those that do not have the trait
case-control genetic study
- large number of well-defined cases
- equal numbers of matched controls
- reliable genotyping technology
- standard statistical analysis
- positive associations should be replicated
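One common form of the "standard statistical analysis" in a case-control study is a chi-square test on allele counts. The sketch below is illustrative only: the function name and the example counts are made up, and the deck does not say which test a given study uses.

```python
import math

def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square (1 df) and odds ratio for a 2x2 allele-count table.

    Rows are cases/controls, columns are alternate/reference allele counts.
    All names here are hypothetical, chosen for illustration.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Made-up counts: the alternate allele is more frequent in cases than controls.
odds_ratio, chi2, p_value = allelic_association(300, 700, 200, 800)
print(odds_ratio, chi2, p_value)
```

An odds ratio above 1 with a small p-value is what "the allele is present at a higher frequency in cases" looks like numerically; it still needs replication in an independent sample.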
genetic markers
genetic diversity is higher in a population than in a single family
we need reliable genetic markers to capture genetic diversity
genetic markers are alleles that we can genotype and assess whether they are associated with disease
ideal genetic marker
- polymorphic
- randomly distributed across genome
- fixed location in genome
- frequent in genome
- frequent in population
- stable with time
- easy to assay
single nucleotide polymorphism
common in genome = about 1 per 300 nucleotides
12 million SNPs identified in human genome
generated by DNA replication errors that escape mismatch repair
where can SNPs be found
gene coding region
non coding region
intergenic region
define these:
synonymous = no amino acid change
non-synonymous = amino acid change
nonsense = new stop codon
promoter = mRNA and protein level changed
terminator = mRNA and protein level changed
splice site = altered mRNA, altered protein
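The three coding-region categories above (synonymous, non-synonymous, nonsense) can be worked out by translating the codon before and after the change. This is a minimal sketch using the standard genetic code; the function name is hypothetical, and anything that is neither synonymous nor a new stop codon is simply labelled non-synonymous.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-codon change in a coding region (illustrative only)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "non-synonymous"

print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAA", "GTA"))  # Glu -> Val
print(classify_coding_snp("TAC", "TAA"))  # Tyr -> stop
```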
what are SNP MAFs
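MAF = minor allele frequency, the frequency of the less common allele of a SNP in a population
SNPs are chosen for genetic association studies on the basis of their MAF
common diseases are likely to be caused by common variants
SNPs with MAF > 5% are used in association studies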
exceptions are known monogenic disease SNPs
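A minor allele frequency (the frequency of the less common allele) can be computed directly from genotype counts. The sketch below assumes diploid genotypes coded as the number of alternate alleles per subject (0, 1 or 2); the function names and the example genotypes are hypothetical, and the 5% cut-off is the one quoted on this card.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF for one SNP.

    genotypes: alternate-allele count per diploid subject (0, 1 or 2).
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1 - alt_freq)

def passes_maf_filter(genotypes, threshold=0.05):
    """True if the SNP is common enough for a typical association study."""
    return minor_allele_frequency(genotypes) > threshold

common_snp = [0, 0, 1, 2, 0, 1, 0, 0, 0, 0]  # 4 alternate alleles out of 20
rare_snp = [0] * 49 + [1]                    # 1 alternate allele out of 100
print(minor_allele_frequency(common_snp), passes_maf_filter(common_snp))
print(minor_allele_frequency(rare_snp), passes_maf_filter(rare_snp))
```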
GWAS results plot
- presented as a single graph called a Manhattan plot
- all results plotted, typically for >1 million SNPs
- x axis is position of the SNP on chromosome
- y axis is -log10 of the association p-value
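The coordinates of a Manhattan plot can be sketched as follows: chromosomes are laid end to end along the x axis, and each SNP's p-value is turned into -log10(p) for the y axis, so stronger associations sit higher. The function name, the toy chromosome lengths and the toy results are all made up for illustration; drawing the points is left to whatever plotting tool is at hand.

```python
import math

def manhattan_points(results, chrom_lengths):
    """Convert association results into (x, y) plot coordinates.

    results: list of (chromosome, position, p_value) tuples.
    chrom_lengths: list of (chromosome, length) in plotting order.
    """
    # Each chromosome starts where the previous one ended.
    offsets = {}
    running_total = 0
    for chrom, length in chrom_lengths:
        offsets[chrom] = running_total
        running_total += length

    return [
        (offsets[chrom] + position, -math.log10(p_value))
        for chrom, position, p_value in results
    ]

toy_lengths = [("1", 1000), ("2", 800)]                 # hypothetical lengths
toy_results = [("1", 100, 0.01), ("2", 50, 1e-8)]       # hypothetical p-values
for x, y in manhattan_points(toy_results, toy_lengths):
    print(x, round(y, 2))
```

A p-value of 1e-8 plots at a height of 8, which is why the tallest "skyscrapers" mark the most strongly associated regions.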
GWAS results
peak of association often does not identify the gene causing the disease
the peak identifies the genomic region associated with disease and is usually smaller than 100 kb
describe meta analysis
difficult to do large studies
easier to combine smaller studies
pre experiment = consortium
post experiment = meta analysis
meta analysis allows statistical combination of results from multiple studies
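One widely used way to combine per-study results is a fixed-effect, inverse-variance weighted meta-analysis: each study's effect estimate is weighted by 1/SE², so larger and more precise studies count for more. The deck does not say which method is meant, so treat this as one possible sketch; the function name and the example effect sizes are hypothetical.

```python
import math

def fixed_effect_meta(studies):
    """Inverse-variance weighted fixed-effect meta-analysis.

    studies: list of (effect_estimate, standard_error), e.g. log odds ratios.
    Returns (pooled_effect, pooled_se, two_sided_p_value).
    """
    weights = [1 / se ** 2 for _, se in studies]
    pooled = sum(w * beta for w, (beta, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, p_value

# Two made-up studies of the same SNP: the second is smaller (larger SE).
pooled, pooled_se, p_value = fixed_effect_meta([(0.10, 0.05), (0.20, 0.10)])
print(pooled, pooled_se, p_value)
```

The pooled standard error is smaller than that of either study alone, which is the statistical payoff of combining smaller studies.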
problems with GWAS
contribution of GWAS findings to the genetic component of disease is estimated to be low, about 5%
- SNPs have small effects
- rare SNPs are not captured
- copy number variation is not captured
- epigenetic variation is not captured
is obesity strongly genetic
- twin studies = 70-80% of body shape genetically determined
- adoption studies = 30-40%
- family studies = 40-60%
large scale meta analysis
- BMI meta analysis in 322k subjects
- 97 BMI associated loci
- 125 separate studies
- >600 authors and >2000 collaborators