Association Analysis Flashcards

1
Q

Define genetic association

A

Genetic association is the presence of an allele at a higher frequency in unrelated subjects with a particular trait than in those who do not have the trait.

2
Q

How do we determine whether variants in the genome are associated with a disease?

A

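If we substitute the word "disease" for "trait" in the definition of genetic association, we have the test: a variant is associated with a disease when its allele is more frequent in subjects with the disease than in those without it.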

With disease = cases
Without disease = controls

3
Q

In a case-control study, when is a gene associated with a disease?

A

A gene is associated with the disease when one of its alleles is found at a higher frequency in cases than in controls
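
The comparison of allele frequencies between cases and controls can be sketched in a few lines of Python. This is only an illustration: the genotype counts are made up, and the function name `allele_frequency` is a hypothetical helper, not part of any genetics library.

```python
def allele_frequency(genotype_counts):
    """Frequency of allele 'a' given counts of the AA, Aa and aa genotypes."""
    n_AA, n_Aa, n_aa = genotype_counts
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # each subject carries two alleles
    return (n_Aa + 2 * n_aa) / total_alleles

# Hypothetical genotype counts (AA, Aa, aa), invented for illustration only.
cases = (400, 450, 150)     # subjects with the disease
controls = (500, 400, 100)  # subjects without the disease

freq_cases = allele_frequency(cases)
freq_controls = allele_frequency(controls)

print(f"Allele 'a' frequency in cases:    {freq_cases:.3f}")
print(f"Allele 'a' frequency in controls: {freq_controls:.3f}")
```

A higher frequency in cases only suggests an association; a statistical test is still needed to decide whether the difference could have arisen by chance.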

4
Q

What are the major rules of case-control studies?

A
  • Cases are subjects with the disease of interest, e.g. obesity, schizophrenia, hypertension
  • The definition of the disease must be applied in a rigorous and consistent way
  • Controls must be as well matched as possible for non-disease traits, such as age, sex, ethnicity and location
5
Q

Using a simple flow map, how do we identify regions that are responsible for causing disease?

A

On image

6
Q

How do we carry out a case-control association study in practice?

A
  • Large numbers of well-defined cases (tens of thousands)
  • Equal numbers of matched controls
  • Reliable genotyping technology (SNP microarray)
  • Standard statistical analysis (PLINK)
  • Positive associations should be replicated
7
Q

Why do we need reliable genetic markers?

A

• Individuals in a population are genetically far more diverse than individuals in a single family.

8
Q

What are genetic markers?

A

• Genetic markers are alleles that we can genotype and assess whether they are associated with disease

9
Q

Define association

A

• In this context, association means that the marker lies less than 100 kb from a causal variant

10
Q

What is the ideal genetic marker?

A
  • Polymorphic
  • Randomly distributed across the genome
  • Fixed location in genome
  • Frequent in genome
  • Frequent in population
  • Stable with time
  • Easy to assay (genotype)
11
Q

What is an SNP?

A
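  • A single nucleotide polymorphism (SNP) is a single-base position in the genome at which individuals in a population differ
  • Common in the genome: roughly 1 per 300 nucleotides
  • ~12 million common SNPs identified in the human genome
  • Generated when a base mismatch arising during DNA replication is not correctly resolved by mismatch repair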
12
Q

How might an SNP arise?

A

On image

13
Q

Where are SNPs found?

A

• Gene (coding region)
    • No amino acid change (synonymous)
    • Amino acid change (non-synonymous)
    • New stop codon (nonsense)
• Gene (non-coding region)
    • Promoter: mRNA and protein levels changed
    • Terminator: mRNA and protein levels changed
    • Splice site: altered mRNA, altered protein
• Intergenic region (non-coding DNA as a whole makes up about 98% of the genome)
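
The three coding-region outcomes can be illustrated with a short Python sketch that compares the reference and variant codons using the standard genetic code. The function name `classify_coding_snp` and the example codons are illustrative choices, not taken from the lecture material.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-base change within a codon ('*' marks a stop codon)."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"      # no amino acid change
    if alt_aa == "*":
        return "nonsense"        # new stop codon
    return "non-synonymous"      # amino acid change

print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```

This sketch only covers changes inside a codon; SNPs in promoters, terminators, splice sites or intergenic regions need other kinds of annotation.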

14
Q

What is dbSNP?

A

dbSNP is the Single Nucleotide Polymorphism Database, hosted by NCBI.

It allows us to look up information about SNPs.

15
Q

What is the minor allele in dbSNP?

A

The minor allele is the less common allele of a SNP. dbSNP reports it together with its frequency, the minor allele frequency (MAF).

16
Q

On what basis are SNPs chosen for association studies?

A
  • SNPs are chosen for genetic association studies on the basis of their minor allele frequency (MAF)
  • Common diseases are likely to be caused by common variants
  • SNPs with MAF > 0.05 (5%) are usually used in association studies (GWAS)
  • Exceptions are known monogenic diseases
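
A minimal sketch of the MAF > 0.05 filter is shown below. The SNP names and genotype counts are invented for illustration, and `minor_allele_frequency` is a hypothetical helper rather than a function from PLINK or any other tool.

```python
MAF_THRESHOLD = 0.05  # the 5% cut-off quoted above

def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """MAF from genotype counts: the frequency of whichever allele is rarer."""
    freq_a = (n_Aa + 2 * n_aa) / (2 * (n_AA + n_Aa + n_aa))
    return min(freq_a, 1 - freq_a)

# Hypothetical genotype counts (AA, Aa, aa) for three made-up SNPs.
snps = {
    "snp_common": (640, 320, 40),
    "snp_rare": (960, 39, 1),
    "snp_flipped": (10, 180, 810),  # here 'A' is the minor allele
}

kept = [
    name
    for name, counts in snps.items()
    if minor_allele_frequency(*counts) > MAF_THRESHOLD
]
print(kept)
```

Taking the minimum of the two allele frequencies means the filter does not depend on which allele happens to be labelled "a".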
17
Q

What is GWAS?

A

Genome-Wide Association Study (GWAS)
• Recruit large numbers of cases and controls
• Genotype markers across the whole genome
    • SNP microarrays (see separate session)
• Test for association between the disease and the alleles of each marker using a chi-squared test
• An association is considered significant at p < 5 × 10⁻⁸ (a threshold that corrects for multiple testing)
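
As a rough sketch of the per-marker test, the Python below runs a chi-squared test on a 2×2 table of allele counts (cases and controls by allele) and compares the p-value with the genome-wide threshold. The counts are invented, the function name is hypothetical, and reading 5 × 10⁻⁸ as a Bonferroni correction for about one million tests is a common interpretation assumed here, not something stated in the cards. In practice this analysis is run with dedicated software such as PLINK.

```python
import math

def allelic_chi_squared(case_a, case_A, control_a, control_A):
    """Chi-squared statistic (1 degree of freedom) and p-value for a 2x2 allele table."""
    n = case_a + case_A + control_a + control_A
    numerator = n * (case_a * control_A - case_A * control_a) ** 2
    denominator = (
        (case_a + case_A) * (control_a + control_A)
        * (case_a + control_a) * (case_A + control_A)
    )
    chi2 = numerator / denominator
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Assumption: 0.05 divided by roughly one million tested SNPs.
GENOME_WIDE_THRESHOLD = 0.05 / 1_000_000

# Hypothetical allele counts for one marker (made up for illustration).
chi2, p_value = allelic_chi_squared(case_a=750, case_A=1250,
                                    control_a=600, control_A=1400)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.2e}")
print("genome-wide significant:", p_value < GENOME_WIDE_THRESHOLD)
```

With these made-up counts the marker would pass a conventional p < 0.05 test but not the genome-wide threshold, which shows why large sample sizes are needed.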

18
Q

What does a GWAS give us in terms of results?

A

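A p-value for each marker: the probability of seeing an association at least this strong by chance if the marker were not truly associated with the disease.
The smaller the p-value (equivalently, the larger the value of -log10(p)), the more significant the association.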

Refer to table

19
Q

How do we plot the results of a GWAS?

What is a Manhattan plot?

A
  • GWAS results are presented as a single graph called a Manhattan plot
  • All results are plotted, typically for >1M SNPs
  • X-axis is position of the SNP on the chromosome
  • Y-axis is –log10(p-value) of the association

The Manhattan plot is a simple way to visualise the markers across the genome that are associated with the disease. The y-axis is the -log10 of the p-value, so a marker associated with disease at a p-value of 1 × 10⁻⁹ has a y-axis value of 9. The x-axis is the location on the chromosome. Each chromosome is shown in a different colour in the plot above, and chromosome locations are given as the number of bases from the start of the chromosome sequence.
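
The coordinates of a Manhattan plot can be computed as in the sketch below, which converts each p-value to -log10(p) and lays the chromosomes end to end along the x-axis. The chromosome lengths and results are toy values chosen for illustration, and the actual drawing step (for example with a plotting library) is left out.

```python
import math

# Toy chromosome lengths in bases (assumed values, not real human chromosomes).
chrom_lengths = {"1": 1000, "2": 800, "3": 600}

# Hypothetical GWAS results: (chromosome, position in bases, p-value).
results = [("1", 250, 1e-3), ("2", 100, 1e-9), ("3", 500, 0.5)]

# Offset of each chromosome so that chromosomes sit side by side on the x-axis.
offsets = {}
running_total = 0
for chrom, length in chrom_lengths.items():
    offsets[chrom] = running_total
    running_total += length

# One (x, y) point per SNP: x = genome-wide position, y = -log10(p-value).
points = [
    (offsets[chrom] + position, -math.log10(p_value))
    for chrom, position, p_value in results
]

for x, y in points:
    print(f"x = {x:5d}, y = {y:.2f}")
```

The second SNP, with a p-value of 1 × 10⁻⁹, lands at a height of 9 on the y-axis, matching the worked example in the text.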

20
Q

What did the Wellcome Trust Case Control Consortium (WTCCC) study, one of the first large genome-wide association studies, published in 2007, look at?

What were the results?

Have a look at the regional association plot, what does the red identify?

A

• It examined several common diseases
On image

  • The peak of association often does not identify the gene causing the disease.
  • The peak identifies the genomic region associated with disease, which is usually smaller than 100 kb.

The red point marks the SNP with the highest significance, which is responsible for the peak. The region around it covers a few genes.

21
Q

What is a meta-analysis?

A

A meta-analysis is the statistical combination of results from multiple studies.

• Large studies (>1K cases/controls) are difficult to carry out
• Combining smaller studies is easier:
    • Before the experiment: form a consortium
    • After the experiment: carry out a meta-analysis
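
One common way to combine results is an inverse-variance weighted, fixed-effect meta-analysis, sketched below. The cards do not say which method is used, so this technique is an assumption, and the effect sizes and standard errors are invented for illustration.

```python
import math

def fixed_effect_meta(estimates):
    """Pool (effect size, standard error) pairs using inverse-variance weights."""
    weights = [1 / se ** 2 for _, se in estimates]  # precise studies count more
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from a normal distribution
    return pooled_beta, pooled_se, p_value

# Hypothetical per-study results for one SNP: (effect size, standard error).
studies = [(0.10, 0.05), (0.12, 0.04), (0.08, 0.06)]

pooled_beta, pooled_se, p_value = fixed_effect_meta(studies)
print(f"pooled effect = {pooled_beta:.3f} +/- {pooled_se:.3f}, p = {p_value:.1e}")
```

The pooled standard error is smaller than that of any single study, which is why combining smaller studies can reveal associations that none of them detects alone.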

22
Q

What are the problems with GWAS?

A

• GWAS has identified associations that are statistically strong and reproducible
• However, their contribution to the genetic component of disease is estimated to be low (<5%)
• Possible explanations for the remaining genetic component:
    • Many common SNPs of very small effect
    • Rare SNPs
    • Copy number variation
    • Epigenetic variation

23
Q

What are the medical implications of obesity?

A

On image

24
Q

What is the evidence that obesity is strongly genetic?

A
  • Twin studies: 70-80% of body shape is genetically determined
  • Adoption studies: 30-40%
  • Family studies: 40-60%
25
Q

What did large scale meta-analysis reveal for obesity?

A
  • BMI meta-analysis in ~322k subjects
  • Locke et al (2015) Nature 518:197–206
  • 97 loci showing a statistically significant association with BMI
  • 125 separate studies
  • > 600 authors and >2000 collaborators
  • New loci are shown in red