Association analysis flashcards
What is genetic association?
- presence of an allele at a higher frequency in unrelated subjects with a particular trait, compared to those that do not have the trait.
- if we substitute the word disease for trait, this is how we determine whether variants in the genome are associated with a disease: subjects with the disease are the cases
What are some examples of cases and controls in case-control genetic studies?
=> cases are subjects with the disease of interest, e.g. obesity, schizophrenia, hypertension
=> definition of the disease must be applied in a rigorous and consistent way
=> controls must be as well matched as possible for non-disease traits, such as age, ethnicity, location, sex
What are the stages of a case-control genetic study to identify disease genes?
- measure the genetic loci of interest in affected cases and in controls
- statistical analysis to determine which genetic loci are associated with disease
- identify genes/genomic regions
A case-control genetic study must have:
- large numbers of well-defined cases (tens of thousands)
- equal numbers of matched controls
- reliable genotyping technology (SNP microarrays)
- standard statistical analysis (PLINK)
- replication of any positive association
What are genetic markers?
- individuals in a population are genetically far more diverse than individuals in a single family
- to capture this genetic diversity we need reliable genetic markers
- genetic markers are alleles that we can genotype to assess whether they are associated with disease
What is an ideal genetic marker?
- polymorphic
- randomly distributed across the genome
- fixed location in genome
- frequent in genome
- frequent in population
- stable with time
- easy to assay (genotype)
What are SNPs generated by?
- errors in mismatch repair during mitosis (a mismatched base is left uncorrected)
- 12 million common SNPs identified in human genome
- common in the genome ~ 1/300 nucleotides
Where are SNPs located?
- gene (coding region)
  - no amino acid change (synonymous)
  - amino acid change (non-synonymous)
  - new stop codon (nonsense)
- gene (non-coding region)
  - promoter = mRNA and protein level changed
  - terminator = mRNA and protein level changed
  - splice site = altered mRNA, altered protein
- intergenic region (98% of genome)
What do the minor and major allele frequencies add up to?
- minor allele frequency (MAF) + major allele frequency = 1
- common diseases are likely to be caused by common variants
- SNPs with MAF > 0.05 (5%) are usually used in association studies (GWAS)
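A minimal sketch of how the two allele frequencies could be worked out from genotype counts at one biallelic SNP, and how the MAF > 0.05 cut-off might be applied. The function names and the example counts are made up for illustration; they are not from any particular tool.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (freq_A, freq_a) from genotype counts at a biallelic SNP.

    Each individual carries two alleles, so homozygotes contribute two
    copies of one allele and heterozygotes one copy of each.
    """
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    freq_A = (2 * n_AA + n_Aa) / total_alleles
    freq_a = (2 * n_aa + n_Aa) / total_alleles
    return freq_A, freq_a


def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """The MAF is the smaller of the two allele frequencies."""
    return min(allele_frequencies(n_AA, n_Aa, n_aa))


def passes_maf_filter(n_AA, n_Aa, n_aa, threshold=0.05):
    """True if the SNP is common enough (MAF above the threshold)."""
    return minor_allele_frequency(n_AA, n_Aa, n_aa) > threshold


# Hypothetical counts: 640 AA, 320 Aa and 40 aa individuals.
freq_A, freq_a = allele_frequencies(640, 320, 40)
print(freq_A, freq_a, freq_A + freq_a)
print(passes_maf_filter(640, 320, 40))
```

Because both frequencies are computed over the same total number of alleles, they always sum to 1 for a SNP with two alleles.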
What does a genome-wide association study (GWAS) do?
- looks for association between disease and the alleles of each marker (chi-squared test)
- by genotyping markers across the whole genome (using SNP microarrays)
- positive association is at p < 5 × 10^-8 (multiple testing correction)
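A minimal sketch of the per-marker test, assuming a simple allelic chi-squared test on a 2x2 table of allele counts (cases vs controls, minor vs major allele) with 1 degree of freedom and no continuity correction. The function name and the counts are hypothetical. The threshold is written here as 0.05 divided by 1 million tests, which is one common way of explaining where 5 × 10^-8 comes from; real analyses would normally use a dedicated package such as PLINK.

```python
import math


def allelic_chi_squared(case_minor, case_major, control_minor, control_major):
    """Pearson chi-squared test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumption: roughly 1 million independent tests, Bonferroni-style correction.
GENOME_WIDE_THRESHOLD = 0.05 / 1_000_000

# Hypothetical SNP: 1000 cases and 1000 controls, so 2000 alleles per group.
chi2, p = allelic_chi_squared(600, 1400, 400, 1600)
print(chi2, p, p < GENOME_WIDE_THRESHOLD)
```

In this made-up example the minor allele is at 30% in cases and 20% in controls, which is a large enough difference at this sample size to pass the genome-wide threshold.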
How do you plot GWAS results?
- GWAS results are presented in a Manhattan plot
- the peak association often does not identify the gene causing the disease; it identifies the genomic region associated with the disease, which is usually smaller than 100 kb
- all data are plotted, typically for >1M SNPs
- x axis is position of the SNP on the chromosome
- y axis is -log10(p value) of association
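A minimal sketch of how the plotting coordinates could be computed, assuming chromosomes are laid end to end along the x axis (a common convention, not the only one). The function name, chromosome lengths and results are toy values for illustration; the actual drawing would be done with a plotting library.

```python
import math


def manhattan_coordinates(results, chrom_lengths):
    """Turn GWAS results into (x, y) points for a Manhattan plot.

    results: list of (chromosome, position, p_value) tuples
    chrom_lengths: dict of chromosome -> length, in plotting order
    x is the position offset by the lengths of all earlier chromosomes;
    y is -log10(p value), so smaller p values give taller points.
    """
    offsets = {}
    running_total = 0
    for chrom, length in chrom_lengths.items():
        offsets[chrom] = running_total
        running_total += length

    points = []
    for chrom, position, p_value in results:
        if not 0 < p_value <= 1:
            raise ValueError("p value must be in (0, 1]")
        x = offsets[chrom] + position
        y = -math.log10(p_value)
        points.append((x, y))
    return points


# Toy data: two short "chromosomes" and two SNPs (hypothetical values).
toy_lengths = {"1": 1000, "2": 800}
toy_results = [("1", 100, 0.01), ("2", 50, 5e-8)]
for x, y in manhattan_coordinates(toy_results, toy_lengths):
    print(x, round(y, 2))
```

On this scale the genome-wide significance threshold of 5 × 10^-8 corresponds to a height of about 7.3, which is why it is often drawn as a horizontal line on the plot.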
What was the first GWAS published?
- the Wellcome Trust Case Control Consortium (WTCCC) study
- Manhattan plots of association of SNP markers with 7 diseases
- green peaks indicate significant p values
How did GWAS develop over the years?
- started around 2006
- each colour represents a different disease
- got bigger over the years
- as meta-analyses pool results, the number of results for different traits increases
- the empty areas are centromeres
What are problems with GWAS?
- GWAS has identified associations that are statistically strong and reproducible
- however, their contribution to the genetic component of disease is estimated to be low (<5%)
=> why?
- many common SNPs of very small effect
- rare SNPs
- copy number variation
- epigenetic variation
Why is increasing obesity a problem?
leads to many medical complications such as:
- pulmonary disease
- non-alcoholic fatty liver disease
- cancer
- gall bladder disease
- gynecologic abnormalities …