association analysis Flashcards

1
Q

What is genetic association?

A
  • presence of an allele at a higher frequency in unrelated subjects with a particular trait than in those without the trait
  • if we substitute the word disease for trait, this is how we determine whether variants in the genome are associated with a disease: the allele is more frequent in cases than in controls
2
Q

What are some examples of case-control in genetic studies?

A

=> cases are subjects with the disease of interest, e.g. obesity, schizophrenia, hypertension
=> definition of the disease must be applied in a rigorous and consistent way
=> controls must be as well matched as possible for non-disease traits
=> such as age, ethnicity, location, sex

3
Q

What are the stages of a case-control genetic study to identify disease genes?

A
  1. genotype the genetic loci of interest in affected cases and matched controls
  2. statistical analysis to determine which genetic loci are associated with disease
  3. identify genes/genomic regions
4
Q

A case-control genetic study must have:

A
  • large numbers of well-defined cases (tens of thousands)
  • equal numbers of matched controls
  • reliable genotyping technology (SNP microarrays)
  • standard statistical analysis (PLINK)
  • positive associations should be replicated
5
Q

What are genetic markers?

A
  • individuals in a population are genetically far more diverse than individuals in a single family
  • to capture this genetic diversity we need reliable genetic markers
  • genetic markers are alleles that we can genotype and assess for association with disease
6
Q

What is an ideal genetic marker?

A
  • polymorphic
  • randomly distributed across the genome
  • fixed location in genome
  • frequent in genome
  • frequent in population
  • stable with time
  • easy to assay (genotype)
7
Q

What are SNPs generated by?

A
  • base mismatches that escape mismatch repair during cell division (mitosis)
  • 12 million common SNPs identified in the human genome
  • common in the genome: ~1 per 300 nucleotides
8
Q

Where are SNPs located?

A
  1. gene (coding region)
    - no amino acid change (synonymous)
    - amino acid change (non-synonymous)
    - new stop codon (nonsense)
  2. gene (non-coding region)
    - promoter = mRNA and protein levels changed
    - terminator = mRNA and protein levels changed
    - splice site = altered mRNA, altered protein
  3. intergenic region (98% of genome)
9
Q

What do the minor and major allele frequencies add up to?

A

Minor allele frequency + major allele frequency = 1.

  • common diseases are likely to be caused by common variants
  • SNPs with MAF > 0.05 (5%) are usually used in association studies (GWAS)
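
The relationship between the two allele frequencies and the 5% cut-off can be checked with a short calculation. The sketch below is illustrative only: the function names and the genotype counts are made up for the example, and it assumes a biallelic SNP genotyped in diploid subjects.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (freq_A, freq_a) from genotype counts at a biallelic SNP."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # each diploid subject carries two alleles
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_A = (2 * n_AA + n_Aa) / total_alleles
    return freq_A, 1.0 - freq_A  # the two frequencies sum to 1 by construction


def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """The minor allele frequency (MAF) is the smaller of the two frequencies."""
    return min(allele_frequencies(n_AA, n_Aa, n_aa))


def passes_maf_filter(n_AA, n_Aa, n_aa, threshold=0.05):
    """True if the SNP is common enough (MAF > threshold) for an association study."""
    return minor_allele_frequency(n_AA, n_Aa, n_aa) > threshold


# Hypothetical counts: 640 AA, 320 Aa and 40 aa subjects
maf = minor_allele_frequency(640, 320, 40)
keep = passes_maf_filter(640, 320, 40)
```

With these made-up counts the major allele frequency is 0.8 and the MAF is 0.2, so the SNP would be kept.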
10
Q

What does a genome-wide association study (GWAS) do?

A
  • genotype markers across the whole genome (using SNP microarrays)
  • look for association between disease and the alleles of each marker (chi-squared test)
  • a positive association is declared at p < 5 × 10⁻⁸ (multiple testing correction)
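
A minimal sketch of the per-marker test, assuming a simple allelic chi-squared test on a 2x2 table of allele counts (cases vs controls, allele A vs allele a). The function name and counts are invented for illustration; real studies use dedicated software such as PLINK. The threshold is written as 0.05 divided by one million tests, which is the usual way the 5 × 10⁻⁸ figure is explained and is treated as an assumption here.

```python
import math


def allelic_chi_squared(case_A, case_a, ctrl_A, ctrl_a):
    """Pearson chi-squared statistic and p-value (1 degree of freedom)
    for a 2x2 table of allele counts. Assumes all row and column totals are non-zero."""
    observed = [[case_A, case_a], [ctrl_A, ctrl_a]]
    row_totals = [case_A + case_a, ctrl_A + ctrl_a]
    col_totals = [case_A + ctrl_A, case_a + ctrl_a]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed Bonferroni-style correction: 0.05 spread over ~1 million tests
GENOME_WIDE_THRESHOLD = 0.05 / 1_000_000

# Hypothetical allele counts from 1000 cases and 1000 controls
chi2, p = allelic_chi_squared(1200, 800, 1000, 1000)
significant = p < GENOME_WIDE_THRESHOLD
```

In this toy table allele A makes up 60% of case alleles and 50% of control alleles, which is enough to pass the genome-wide threshold at this sample size.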
11
Q

How do you plot GWAS results?

A
  • GWAS results are presented in a Manhattan plot
  • the peak association often does not identify the gene causing the disease; it identifies the genomic region associated with the disease, which is usually smaller than 100 kb
  • all data are plotted, typically for >1M SNPs
  • x axis is position of the SNP on the chromosome
  • y axis is -log10 (p value) of association
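
The two axes can be computed as follows. This is a sketch only: the function name, the chromosome lengths and the p-values are toy values, and it assumes chromosomes are laid end to end along the x axis, as is usual for a genome-wide plot. The resulting points could be passed to any plotting library.

```python
import math


def manhattan_points(results, chrom_lengths):
    """Convert (chromosome, position, p_value) records into (x, y) plot coordinates.

    x = position on the chromosome plus the combined length of all preceding chromosomes
    y = -log10(p_value), so smaller p-values give taller points
    """
    offsets = {}
    running_total = 0
    for chrom, length in chrom_lengths:
        offsets[chrom] = running_total
        running_total += length
    return [(offsets[chrom] + pos, -math.log10(p)) for chrom, pos, p in results]


# Toy genome with two short chromosomes (lengths are made up)
toy_lengths = [("1", 1000), ("2", 800)]
toy_results = [("1", 100, 1e-3), ("2", 50, 5e-8)]
points = manhattan_points(toy_results, toy_lengths)
```

A SNP at the genome-wide threshold of 5 × 10⁻⁸ sits at roughly y = 7.3, which is where the significance line is usually drawn.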
12
Q

What was the first GWAS study published?

A
  • WTCCC (Wellcome Trust Case Control Consortium), 2007: the first large-scale GWAS
  • Manhattan plots of association of SNP markers with 7 diseases
  • green peaks indicate significant p values
13
Q

How did GWAS develop over the years?

A
  • started around 2006
  • in the diagram, each colour represents a different disease
  • got bigger over the years
  • as meta-analyses are carried out and results are pooled, the number of results for different traits increases
  • the empty areas in the diagram are centromeres
14
Q

What are problems with GWAS?

A
  • GWAS has identified associations that are statistically strong and reproducible
  • however, their contribution to the genetic component of disease is estimated to be low (<5%)
    => why? the rest of the genetic component may lie in:
    - many common SNPs of very small effect
    - rare SNPs
    - copy number variation
    - epigenetic variation
15
Q

Why is increasing obesity a problem?

A

leads to many medical complications such as:

  • pulmonary disease
  • non-alcoholic fatty liver disease
  • cancer
  • gall bladder disease
  • gynecologic abnormalities …
16
Q

How do twin studies show obesity to have a strong genetic link?

A
  • twin studies
    - monozygotic twins share all their genes and their environment
    - dizygotic twins share their environment but, on average, only half of their segregating genes
    - higher concordance between monozygotic twins than between dizygotic twins demonstrates a genetic effect (70-80% of body shape is genetically determined)
  • adoption studies (share genes but not environment, 30-40% similarity)
  • family studies (40-60%)
  • evidence that genetics > environment
17
Q

What are the features of the large-scale meta-analysis on obesity?

A
  • BMI meta-analysis of ~322K subjects
    - 97 BMI-associated loci
  • 125 separate studies
    - >600 authors and >2000 collaborators
    => novel loci = red
    => known loci = blue
18
Q

What does the Venn diagram of obesity GWAS results show?

A
  • some traits overlap
  • gives an idea of the interrelationship of different traits

19
Q

What is linkage disequilibrium (LD)?

A

When two alleles are inherited together more often than expected by chance.
This is usually because they are close together in the genome.
Alleles that are physically close together (within ~100 kb) are rarely separated by recombination, so a genetic marker allele tends to be inherited together with the nearby disease allele.
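
"More often than expected by chance" can be made concrete with the standard LD coefficients D and r². The sketch below assumes two biallelic loci with known allele and haplotype frequencies; the function name and the example frequencies are made up for illustration.

```python
def linkage_disequilibrium(p_AB, p_A, p_B):
    """Return (D, r_squared) for alleles A and B at two loci.

    p_AB : observed frequency of the haplotype carrying both A and B
    p_A  : frequency of allele A at the first locus
    p_B  : frequency of allele B at the second locus
    """
    # Under independence the haplotype frequency would be p_A * p_B
    D = p_AB - p_A * p_B
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    r_squared = D * D / denominator
    return D, r_squared


# Hypothetical marker and disease alleles that always travel together
D_linked, r2_linked = linkage_disequilibrium(p_AB=0.5, p_A=0.5, p_B=0.5)

# Hypothetical alleles inherited independently of each other
D_free, r2_free = linkage_disequilibrium(p_AB=0.25, p_A=0.5, p_B=0.5)
```

In the first case r² is 1 (complete LD, so the marker fully tags the disease allele); in the second both D and r² are 0.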