Lecture 14: GWAS 2 Flashcards

1
Q

What are the potential sources of bias for GWAS?

A

Multiple testing, ill-defined sample size, population stratification, choice of cases and controls, merging datasets and the need for genotype imputation.

2
Q

Explain the issues surrounding multiple testing and the different corrections used. What is the point of doing this?

A

A p value threshold of 0.05 is usually used, but because thousands of SNPs are tested, the threshold needs to be lowered or there would be lots of false positives. The Bonferroni correction divides 0.05 by the number of tests to get a new threshold. The Benjamini-Hochberg correction arranges all the SNPs from smallest to largest p value and compares the p value at rank i with (i/n) × 0.05, so with n = 10 the first is tested against 0.05/10, the second against 2 × 0.05/10, etc.; it controls the false discovery rate. (Dividing 0.05 by 10 for the first, by 9 for the second, etc. is the related Holm-Bonferroni step-down procedure.) The point is to maximise the chances of finding something whilst minimising the false positive rate.
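The two corrections can be compared on a handful of p values. This is a minimal sketch in plain Python; the function names and the five example p values are made up for illustration, and a real GWAS would involve far more tests.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag p values that pass the Bonferroni threshold alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag p values that pass the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the p values, ordered from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p value is <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Everything ranked at or below that k is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p values for five SNPs (illustrative only).
p_values = [0.001, 0.008, 0.012, 0.045, 0.3]
bonferroni_hits = bonferroni(p_values)
bh_hits = benjamini_hochberg(p_values)
print(bonferroni_hits)
print(bh_hits)
```

On these values Bonferroni keeps two SNPs and Benjamini-Hochberg keeps three, which illustrates why the latter is described as less conservative.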

3
Q

What is two stage GWAS used for and what is it?

A

It is a way to work around small sample sizes. A first scan with a moderate sample size identifies areas of interest, which may not reach statistical significance. These areas of interest are then genotyped or sequenced in an independent data set to confirm the associations.

4
Q

What is the problem of population stratification?

A

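Allele frequencies differ between people from different regions, so if a phenotype is also more common in one region, SNPs that only mark ancestry come back as being highly associated with the phenotype. There is a possible correspondence between genetic and geographic distances.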
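A small worked example can show how pooling two populations creates an association that neither population shows on its own. The counts below are invented for illustration, and the helper name `odds_ratio` is an assumption, not something from the lecture.

```python
def odds_ratio(case_carriers, case_non, control_carriers, control_non):
    """Odds ratio for carrying the allele in cases versus controls."""
    return (case_carriers * control_non) / (case_non * control_carriers)


# Hypothetical counts: (case carriers, case non-carriers,
#                       control carriers, control non-carriers).
# Population A: allele is common and the disease is common.
pop_a = (80, 20, 40, 10)
# Population B: allele is rare and the disease is rare.
pop_b = (10, 40, 40, 160)

# Pool the two populations by adding the counts cell by cell.
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

or_a = odds_ratio(*pop_a)
or_b = odds_ratio(*pop_b)
or_pooled = odds_ratio(*pooled)
print(or_a, or_b, or_pooled)
```

Within each population the odds ratio is 1 (no association), yet the pooled odds ratio is above 3, because the allele simply marks membership of the population with more cases.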

5
Q

What geographical considerations have to be taken into account?

A

Some SNPs may have nothing to do with a particular disease but rather act as markers of an individual’s origin

6
Q

What has to be matched properly? What is the consequence if they are not?

A

Cases and controls. If they are not matched, there can be spurious associations.

7
Q

What can it be difficult to measure?

A

Some phenotypes, e.g. mental health disorders.

8
Q

What is genotype imputation?

A

The process of predicting or imputing genotypes that are not directly assayed in a sample of individuals.

9
Q

What is missing heritability? Give an example of this. What does this mean?

A

Most SNPs linked to a condition have low predictive power, with individual SNPs explaining less than 1% of the variance. All the SNPs identified, put together, explain only a fraction of the heritable component. Breast cancer has a 30% genetic component but only 80% of this is explained by GWAS. There are additional things that we aren’t testing for.

10
Q

What can happen with associated SNPs?

A

They might be in linkage disequilibrium with the SNP actually causing the disease but which hasn’t been tested.
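Linkage disequilibrium between two SNPs is commonly summarised as r², computed from haplotype frequencies. The sketch below assumes two biallelic SNPs (alleles A/a and B/b) and uses invented haplotype counts; the function name is hypothetical.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic SNPs, from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    f_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second SNP
    d = f_AB - p_A * p_B          # D: deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical counts: A always travels with B, so the SNPs are in complete LD.
complete_ld = ld_r_squared(50, 0, 0, 50)
# Hypothetical counts: all four haplotypes equally common, so no LD.
no_ld = ld_r_squared(25, 25, 25, 25)
print(complete_ld, no_ld)
```

When r² is close to 1, a genotyped SNP acts as a near-perfect proxy for an untested neighbour, which is why the associated SNP need not be the causal one.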

11
Q

What are the alternatives to GWAS?

A

Copy number variations can be associated with disease, though they don’t explain the missing heritability.
Sequencing of extra genomes, e.g. one project sequenced 5000 genomes.

12
Q

What is the 10,000 genomes project?

A

The UK10K project: full genome sequencing of 4000 people from the TwinsUK and ALSPAC cohorts, plus exome sequencing of around 6000 people with extreme obesity, neurodevelopmental disease and other conditions.

13
Q

Explain what exome sequencing is?

A

It involves sequencing only the exons from a whole genome. The DNA is shredded and segments containing exons are captured with probes. The exons are sequenced using next-generation sequencing and aligned against the reference genome.

14
Q

What is good about exome sequencing?

A

It allows identification of very rare variants and can be used to try to identify copy number changes. Indels present in most of the population can be discarded, as they are unlikely to be the cause of disease.

15
Q

How was exome sequencing involved with Miller syndrome?

A

It was the first time exome sequencing was used to identify phenotype-associated mutations. Four affected individuals from 3 different families had their coding regions sequenced, and Miller syndrome was linked to mutations in the DHODH gene. The gene was then sequenced in 3 unrelated patients, who all carried mutations in it.
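The filtering logic behind this kind of study can be sketched as: drop variants that are common in the population, then keep genes that carry a remaining variant in every affected individual. This is only an illustration of the idea, not the study's actual pipeline; the patient labels, variant IDs and the genes other than DHODH are invented.

```python
def candidate_genes(variants_by_individual, common_variants):
    """Genes with at least one non-common variant in every individual."""
    gene_sets = []
    for variants in variants_by_individual.values():
        # Keep only genes whose variant is not common in the population.
        genes = {gene for gene, variant in variants
                 if variant not in common_variants}
        gene_sets.append(genes)
    # A candidate gene must be hit in all affected individuals.
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical (gene, variant ID) calls for three affected individuals.
patients = {
    "patient_1": [("DHODH", "var_a"), ("GENE_X", "var_common"), ("GENE_Y", "var_b")],
    "patient_2": [("DHODH", "var_c"), ("GENE_X", "var_common")],
    "patient_3": [("DHODH", "var_d"), ("GENE_Y", "var_e"), ("GENE_X", "var_common")],
}
# Hypothetical set of variants seen in most of the population.
common = {"var_common"}

candidates = candidate_genes(patients, common)
print(candidates)
```

Note that each patient carries a different variant in the shared gene, so the match is made at the gene level rather than at any single variant.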

16
Q

Why wouldn’t this gene have been identified from GWAS?

A

The affected individuals carried different mutations within the gene, so no single SNP would have shown an association.

17
Q

What are you limited to by looking only at mature RNA?

A

You only get the coding regions.