Association analysis Flashcards

1
Q

What is genetic association?

A

presence of a variant allele at a higher frequency in unrelated subjects with a particular disease of interest (cases) compared to those that do not have the disease (controls)

2
Q

What are cases?

A

subjects with the disease of interest

3
Q

What are the controls in association analysis?

A

subjects matched as closely as possible to our cases APART from not having the disease

e.g. same age, sex, ethnicity, location etc.

4
Q

What is the difference in variant frequency between cases and controls?

A

when a gene variant is associated with the disease, it is found at a higher frequency in cases than in controls

5
Q

What confirms the strength of association between a gene variant and the disease?

A

Statistics, e.g. the p-value (the smaller the p-value, the stronger the evidence for association)

6
Q

What do quality case control genetic studies require?

A
  • large numbers of well defined cases (1000s)
  • equal numbers of matched controls
  • reliable genotyping technology (SNP array)
  • standard statistical analysis
  • positive associations should be replicated
7
Q

What allows us to capture genetic diversity?

A

The use of genetic markers

8
Q

Give some features of an ideal genetic marker (e.g. SNP).

A
  • polymorphic
  • randomly distributed across the genome
  • fixed location in genome
  • frequent in the genome
  • frequent in the population
  • stable with time
  • easy to assay (genotype)
9
Q

How are SNPs generated?

A

through errors during DNA replication that escape mismatch repair

10
Q

What are the possible locations for SNPs?

A

Gene (coding region)

  • no amino acid change (synonymous)
  • amino acid change (non-synonymous)
  • new stop codon (nonsense)

Gene (non-coding region)

  • promoter: mRNA and protein changed
  • terminator: mRNA and protein changed
  • splice site: altered mRNA, altered protein

Intergenic Region
-98% of SNPs in this region

11
Q

What is dbSNP?

A

online database of SNPs and multiple small-scale variations that include insertions/deletions, microsatellites and non-polymorphic variants

12
Q

What is the Minor allele frequency (MAF)?

A

the frequency of the less common variant in a population

A biallelic SNP has two alleles:

  • major allele
  • minor allele
13
Q

What is the sum of the major and minor allele frequencies?

A

Major Allele Frequency + Minor Allele Frequency = 1
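
As an illustration, the two allele frequencies can be worked out from genotype counts. This is a minimal sketch assuming a biallelic SNP with alleles labelled A and a; the function name and the counts are made up for the example.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (freq_A, freq_a) from genotype counts at a biallelic SNP."""
    # Each person carries two alleles, so the allele total is twice the headcount.
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    count_A = 2 * n_AA + n_Aa  # homozygotes give two copies, heterozygotes one
    count_a = 2 * n_aa + n_Aa
    return count_A / total_alleles, count_a / total_alleles


# Hypothetical genotype counts in a sample of 1,000 people.
freq_A, freq_a = allele_frequencies(n_AA=640, n_Aa=320, n_aa=40)

minor_allele_frequency = min(freq_A, freq_a)  # the less common allele
major_allele_frequency = max(freq_A, freq_a)  # the more common allele
```

Here the minor allele is a, and the two frequencies add up to 1 because every allele counted is either A or a.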

14
Q

What are Genome-Wide Association Studies (GWAS)?

A

Studies of variations in the entire human genome to identify associations between variations in genes and particular behaviours, traits, or disorders.

SNP markers are used across the whole genome, and these are genotyped using SNP microarrays.

We look for association between disease and each marker by doing a chi-square test
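
The chi-square test for a single marker could look roughly like the sketch below. It assumes the test is run on a 2x2 table of allele counts (alternative vs reference allele, in cases vs controls); the function name and the counts are hypothetical, and real studies would normally use a dedicated statistics or genetics package.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table."""
    observed = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the allele were unrelated to disease status.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: 1,000 cases and 1,000 controls (2,000 alleles each).
chi2, p_value = allelic_chi_square(case_alt=600, case_ref=1400,
                                   ctrl_alt=500, ctrl_ref=1500)
```

In this made-up example the alternative allele is at 30% in cases and 25% in controls, which gives a chi-square statistic of about 12.5 and a small p-value.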

15
Q

Explain the steps of GWAS.

A

Obtain DNA from people with disease of interest (cases) and unaffected controls

Run each DNA sample on a SNP chip to measure genotypes at 300,000-1,000,000 SNPs in cases and controls

Identify SNPs where one allele is significantly more common in cases than controls
-the SNP is associated with disease
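
The last step can be sketched as a loop over every SNP. This is an illustrative sketch only: the SNP names and counts are invented, and the Bonferroni correction used for the significance cut-off is an assumption of this example, not something stated in the cards.

```python
import math


def allelic_p_value(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """p-value of a 1-df chi-square test on a 2x2 allele-count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    # Shortcut form of the Pearson chi-square statistic for a 2x2 table.
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = numerator / denominator
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical allele counts per SNP:
# (case_alt, case_ref, ctrl_alt, ctrl_ref)
allele_counts = {
    "snp_1": (600, 1400, 500, 1500),
    "snp_2": (410, 1590, 400, 1600),
    "snp_3": (900, 1100, 880, 1120),
}

# Assumed cut-off: Bonferroni correction for the number of SNPs tested.
alpha = 0.05 / len(allele_counts)

associated_snps = [snp for snp, counts in allele_counts.items()
                   if allelic_p_value(*counts) < alpha]
```

Because hundreds of thousands of SNPs are tested in a real study, the cut-off has to be far stricter than the usual 0.05, which is why some form of multiple-testing correction is applied.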

16
Q

How is GWAS data presented?

A

on a Manhattan plot:
  • the end result of a GWAS, showing the score for each SNP marker tested in the study, across each chromosome in the genome
  • a peak indicates that the location of that particular SNP may be very close to a disease-related locus
  • low p-value = high peak, so we reject the null hypothesis for that SNP
  • shows the evidence of correlation of each marker tested with a disease phenotype across the genome

*highest point is the SNP MOST significantly associated with the disease state

X-axis is position of SNP on chromosome

Y-axis is -log10(p-value) of the association between each marker and the disease, calculated using a chi-squared test
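
The y-axis transform can be illustrated with a few made-up results. This sketch only computes the plotted values and does not draw anything; the chromosome numbers, positions and p-values are hypothetical.

```python
import math

# Hypothetical results: (chromosome, base-pair position, p-value) for each SNP.
results = [
    (1, 1_200_000, 0.40),
    (1, 5_600_000, 1e-9),
    (2, 3_300_000, 0.03),
    (2, 7_800_000, 0.75),
]

# x is given by chromosome and position; y is -log10(p-value).
points = [(chrom, pos, -math.log10(p)) for chrom, pos, p in results]

# The highest point is the SNP with the smallest p-value.
peak = max(points, key=lambda point: point[2])
```

Taking -log10 turns very small p-values into tall points, so the SNP with p = 1e-9 sits at a height of about 9 while the unassociated SNPs stay near the bottom of the plot.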

17
Q

Why are chromosomes more solid at the bottom of the Manhattan plot?

A

because the vast majority of SNPs are not associated with disease

*associations are further up the graph

18
Q

What is the Wellcome Trust Case Control Consortium?

A

genome-wide association study of 14,000 cases of 7 common diseases and 3,000 shared controls

  • looks for specific genes
  • look at the p-value (the smaller the p-value, the stronger the evidence of association)
    e.g. LDLR, SORT1 for lipids and CAD
19
Q

What does the peak on a Manhattan plot show?

A

The peak does not identify the gene causing the disease, instead it only identifies the genomic REGION associated with the disease and this is usually very small (<100kb)

-if you zoom in on a SNP with strong disease association (high up in the plot), you will find lots of other associated SNPs in the adjacent gene, so more work has to be done to work out which SNP is responsible and where the actual disease-causing variant is

20
Q

Give some disadvantages of GWAS.

A

Difficult to do very large association studies (>1K cases), and therefore meta-analysis of GWAS is done to combine statistical results from multiple smaller studies

Pre-experiment: consortium
Post-experiment: meta-analysis
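
One common way to combine results from several studies is a fixed-effect, inverse-variance weighted meta-analysis. The sketch below assumes each study reports an effect size (e.g. a log odds ratio) and its standard error for the same SNP; the function name and the numbers are hypothetical, and this is only one of several possible methods.

```python
import math


def fixed_effect_meta(estimates):
    """Combine (effect, standard_error) pairs by inverse-variance weighting."""
    # More precise studies (smaller standard error) get more weight.
    weights = [1 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)

    pooled_effect = sum(w * effect
                        for w, (effect, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)

    # Two-sided p-value from a normal approximation.
    z = pooled_effect / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled_effect, pooled_se, p_value


# Hypothetical per-study results for one SNP: (log odds ratio, standard error).
studies = [(0.20, 0.10), (0.15, 0.08), (0.25, 0.12)]

pooled_effect, pooled_se, p_value = fixed_effect_meta(studies)
```

The pooled standard error is smaller than that of any single study, which is why combining smaller studies can reveal an association that none of them could show alone.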

21
Q

Give the medical complications of obesity.

A
Pulmonary Disease
Non-alcoholic fatty liver disease
Gall bladder disease
Osteoarthritis
Skin
Gout
Phlebitis
Cataracts
Coronary Heart Disease
Cancer
Severe Pancreatitis
Stroke
Idiopathic intracranial hypertension
Gynaecological Abnormalities
22
Q

Give some of the problems of GWAS.

A

Expensive

requires a large sample size

the SNPs identified only explain a small part of the disease phenotype, meaning their contribution to the genetic component of the disease is estimated to be low (<5%). This could be because:

  • many common SNPs of small effect
  • rare SNPs
  • GWAS does not look at copy number variation or epigenetic variation
23
Q

How are genes associated with obesity in GWAS?

A

Obesity is strongly genetic

Through GWAS we can clearly see genes associated with obesity