Week 5 Flashcards

0
Q

In the depression study, comparing QQ plots revealed that associated SNPs are there, each having a tiny effect but none reaching genome-wide significance. How can we get at these SNPs and see if they make a real contribution?

A

We can use polygenic risk scoring (although this tends to be used when genome-wide significance is found, it can also be used when only non-significant associations are found)

1
Q

Who carried out the large GWAS of schizophrenia in 2014? How big was the sample, and how many hits were reported?

A

Ripke et al. (2014)

About 35,000 cases (the published paper reports 36,989 cases and 113,075 controls)

108 genome-wide significant hits (loci)

2
Q

What is polygenic risk scoring?

A

Take SNPs that are thought to be linked with a phenotype

Genotype some individuals for those SNPs

Then add up each person's number of risk alleles to see how much at risk they are.

E.g.

SNP 1 (A is the risk allele)

Person 1 has AA (2)
Person 2 has TT (0)
Person 3 has TA (1)
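
The counting step in the example can be sketched in a few lines of Python. This is only an illustration: the function name `count_risk_alleles` and the dictionary layout are assumptions made for the sketch, not part of any genetics package.

```python
def count_risk_alleles(genotype: str, risk_allele: str) -> int:
    """Return how many copies of the risk allele a two-letter genotype carries."""
    return genotype.upper().count(risk_allele.upper())


# Genotypes at SNP 1 from the card, where A is the risk allele.
snp1_genotypes = {"Person 1": "AA", "Person 2": "TT", "Person 3": "TA"}

# Unweighted score for one SNP: simply the number of risk alleles (0, 1 or 2).
counts = {
    person: count_risk_alleles(genotype, "A")
    for person, genotype in snp1_genotypes.items()
}
print(counts)
```

With several SNPs, the unweighted score would be the sum of these counts across SNPs for each person.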

3
Q

What can you do if the SNPs investigated using polygenic risk scoring have different effect sizes?

A

You can weight each SNP by its effect size (the log of its odds ratio) to calculate a more accurate polygenic risk score

4
Q

Explain the steps for polygenic risk scoring (5 steps) when using genome-wide significant hits

A
  1. Do a GWAS or get the results from one
  2. Select a set of SNPs based on the p value
  3. Do a GWAS in an independent sample
  4. Calculate the polygenic risk score
  5. Test the polygenic risk score as a predictor of an outcome
5
Q

When calculating a polygenic risk score using hits that did not reach genome-wide significance, what p-value threshold should be used?

A

There is no agreed answer yet, and it is likely to differ between disorders

Try using multiple thresholds

E.g. p < 0.001, p < 0.01, etc.
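
Trying multiple thresholds can be sketched as building one SNP set per threshold and later scoring with each set. The SNP names, p-values and threshold list below are made up for illustration only.

```python
# Hypothetical discovery GWAS results: SNP id -> p-value (made-up numbers).
discovery_p = {"rs1": 0.0004, "rs2": 0.008, "rs3": 0.03, "rs4": 0.2, "rs5": 0.6}

# Thresholds to try; the card suggests p < 0.001, p < 0.01 and so on.
thresholds = [0.001, 0.01, 0.05, 0.5]


def select_snps(p_values, threshold):
    """Return the SNPs whose discovery p-value is below the threshold."""
    return sorted(snp for snp, p in p_values.items() if p < threshold)


# One SNP set per threshold; a score would then be built from each set in turn.
snp_sets = {t: select_snps(discovery_p, t) for t in thresholds}
for t, snps in snp_sets.items():
    print(f"p < {t}: {snps}")
```

Each set would give its own polygenic risk score, and the scores can then be compared on how well they predict the outcome in the target sample.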

6
Q

What does linkage disequilibrium have to do with choosing the number of SNPs to use when calculating a polygenic risk score?

A

Nearby SNPs are often correlated with one another (they tend to be inherited together, which is linkage disequilibrium)

There may be whole clumps of SNPs that are associated with depression (for example)
But this could be driven by one SNP.

We could inflate our scores by counting each one

Therefore pruning based on linkage disequilibrium is crucial.
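
A minimal sketch of the idea, assuming a simple greedy rule: go through SNPs from most to least significant and keep a SNP only if its squared correlation (r²) with every SNP already kept is below a cutoff. This p-value-informed form of pruning is often called clumping. The genotypes, p-values and the 0.2 cutoff are illustrative assumptions, not values from the lecture.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two lists of allele counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


def prune_by_ld(genotypes, p_values, r2_max=0.2):
    """Keep the most significant SNP from each group of correlated SNPs."""
    kept = []
    for snp in sorted(p_values, key=p_values.get):  # most significant first
        if all(r_squared(genotypes[snp], genotypes[k]) <= r2_max for k in kept):
            kept.append(snp)
    return kept


# Made-up allele counts (0/1/2) for six people at three SNPs.
genotypes = {
    "rs1": [0, 1, 2, 1, 0, 2],
    "rs2": [0, 1, 2, 1, 0, 2],  # identical to rs1, so in perfect LD with it
    "rs3": [1, 0, 1, 2, 1, 1],  # uncorrelated with rs1 in this toy sample
}
p_values = {"rs1": 0.001, "rs2": 0.002, "rs3": 0.01}

kept_snps = prune_by_ld(genotypes, p_values)
print(kept_snps)
```

Here rs2 is dropped because it carries the same signal as rs1, so the score does not count that signal twice.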

7
Q

When running a GWAS in an independent sample what is the population called?

A

Target population

8
Q

What should the target population be in an independent GWAS sample for calculating a polygenic risk score?

A

It should be as similar as possible to the discovery data

9
Q

Oh no! Your precious GWAS study used an old microarray chip to genotype the discovery data in your polygenic risk score analysis. What can you do?

A

You can impute any SNPs not genotyped in the target population
Imputation takes advantage of linkage disequilibrium to fill in the gaps in our sample.

10
Q

What program can solve an array mismatch between the discovery and target populations in a polygenic risk score analysis?

A

Programs such as IMPUTE

It will also give you a measure of how certain each imputed genotype is.

11
Q

How do we calculate the individual polygenic risk score?

A

Number of risk alleles weighted by effect size

Score = Σ (number of risk alleles × log odds ratio), summed across SNPs
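
The weighted sum can be sketched as below. The odds ratios and allele counts are invented for illustration, and `polygenic_score` is a hypothetical helper name.

```python
from math import log

# Hypothetical odds ratios from a discovery GWAS (made-up numbers).
odds_ratios = {"rs1": 1.20, "rs2": 1.10, "rs3": 1.05}

# Hypothetical risk allele counts (0, 1 or 2) in the target sample.
risk_allele_counts = {
    "Person 1": {"rs1": 2, "rs2": 0, "rs3": 1},
    "Person 2": {"rs1": 0, "rs2": 0, "rs3": 0},
    "Person 3": {"rs1": 1, "rs2": 2, "rs3": 2},
}


def polygenic_score(counts, odds_ratios):
    """Sum over SNPs of (number of risk alleles x log odds ratio)."""
    return sum(n * log(odds_ratios[snp]) for snp, n in counts.items())


scores = {
    person: polygenic_score(counts, odds_ratios)
    for person, counts in risk_allele_counts.items()
}
for person, score in scores.items():
    print(person, round(score, 4))
```

Taking the log means a SNP with an odds ratio of 1 (no effect) contributes nothing, and effects add up across SNPs.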

12
Q

How can we test the polygenic risk score as a predictor of outcome?

A

We can do a logistic regression with cases and controls

  • remembering to covary for population stratification

This means we can get measures of the variance explained in terms of R² (for a logistic regression this is a pseudo-R²)
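
A rough sketch of this test, assuming toy data and a hand-written logistic regression fitted by gradient ascent (a real analysis would use a statistics package). The data, the single ancestry covariate `pc1`, and the choice of a McFadden-style pseudo-R² measured against the covariate-only model are all assumptions made for illustration.

```python
from math import exp, log


def predict(weights, row):
    """Predicted probability of being a case; weights[0] is the intercept."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(rows, labels, steps=5000, rate=0.5):
    """Fit a logistic regression by plain gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            err = y - predict(weights, row)
            grad[0] += err
            for j, v in enumerate(row, start=1):
                grad[j] += err * v
        weights = [w + rate * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def log_likelihood(weights, rows, labels):
    return sum(
        log(predict(weights, row)) if y else log(1.0 - predict(weights, row))
        for row, y in zip(rows, labels)
    )


# Made-up target sample: (standardised polygenic score, ancestry covariate pc1).
data = [(-1.5, 0.2), (-1.0, -0.3), (-0.5, 0.1), (-0.2, -0.4), (0.0, 0.3),
        (0.3, -0.1), (0.6, 0.4), (1.0, -0.2), (1.4, 0.0), (0.2, 0.5)]
status = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]  # 1 = case, 0 = control

covariate_only = [(pc1,) for _, pc1 in data]

full_weights = fit_logistic(data, status)
reduced_weights = fit_logistic(covariate_only, status)

ll_full = log_likelihood(full_weights, data, status)
ll_reduced = log_likelihood(reduced_weights, covariate_only, status)

# Share of the covariate-only model's misfit removed by adding the score.
pseudo_r2 = 1.0 - ll_full / ll_reduced
print("score coefficient:", full_weights[1])
print("pseudo-R2 over covariates:", pseudo_r2)
```

Comparing the full model against the covariate-only model is one way to make sure the variance credited to the score is not simply population stratification.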

13
Q

List four other things that can be done with polygenic risk scoring.

A

Testing the generalist gene hypothesis

Cross trait/disorder analyses

Gene-environment correlations

Gene-environment interactions

14
Q

Using polygenic risk scoring, how much of the variance in depression was explained?

A

About 1.4%!

15
Q

What is the generalist genes hypothesis?

A

That the same genes influence learning abilities and disabilities

16
Q

How can polygenic risk scoring investigate the generalist genes hypothesis?

A

The discovery data compared those with very high and those with very low mathematical abilities

A polygenic risk score explained variance across the entire spectrum of mathematical abilities

17
Q

What has polygenic risk scoring found out about the link between schizophrenia and psychotic-like experiences?

A

That there is little support for the hypothesis that psychotic experiences in population based samples of adolescents share a comparable genetic architecture to schizophrenia

18
Q

How can polygenic risk scoring investigate cross-trait/disorder relationships?

A

By taking the polygenic risk score for one trait and seeing whether it predicts another trait (this would suggest shared aetiology)

E.g. the polygenic risk score for schizophrenia predicts bipolar disorder

But not ASD or other non-psychiatric phenotypes

(These findings had previously been reported in multivariate twin analyses)

However, quite surprisingly, the same polygenic risk score also predicts ADHD

19
Q

Polygenic risk scores for schizophrenia also predict cannabis use. What do these findings suggest?

A

A gene-environment correlation

20
Q

Explain a polygenic risk scoring analysis that shows a gene-environment interaction for obesity

A

The polygenic risk score for obesity predicts obesity in separate samples, but these effects are moderated by the environment (active or inactive environment)

This can also be done with depression, where the effects are largest in those with childhood maltreatment

21
Q

Is polygenic risk scoring helpful for predicting a disorder?

A

It can, but the variance explained is still very small, and the missing heritability is still not found

The actual predictive power is very limited, so it is not clinically useful.

22
Q

Can polygenic risk scoring be helpful for looking at the shared genetic architecture of disorders?

A

Yes, and some surprising sharing has been found.

23
Q

What was genome-wide complex trait analysis (GCTA) developed for?

A

It was developed to estimate the heritability of height explained by SNPs

The report (Yang et al.) suggested that the missing heritability wasn’t “missing” but hiding in variants with small effect sizes

24
Q

How does GCTA work?

A

If a trait is genetically influenced, then individuals who are more genetically similar should be more phenotypically similar.

Two main steps:
  1. Work out genetic similarity (similarity across SNPs)
  2. See how much this similarity explains the similarities in their phenotypes using a mixed model.
25
Q

How can you work out genetic similarity in a GCTA?

A

Calculate the pairwise relatedness between each individual in the dataset.

Together these values form the GRM (genetic relationship matrix), which describes each individual's relationship to everyone else (e.g. a sample of 100 gives one 100 × 100 matrix, with a 1 × 100 row for each individual)
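
A minimal sketch of the pairwise relatedness calculation, using the standardised formula from Yang et al.: for individuals j and k, average over SNPs of (x_j − 2p)(x_k − 2p) / (2p(1 − p)), where x is the allele count and p is the allele frequency. For simplicity the sketch applies the same formula to the diagonal and assumes no SNP is monomorphic in the sample; the genotype data are made up.

```python
def relationship_matrix(genotypes):
    """Build an n x n genetic relationship matrix.

    genotypes[person][snp] is the count (0, 1 or 2) of the reference allele.
    Assumes every SNP varies in the sample (allele frequency not 0 or 1).
    """
    n_people = len(genotypes)
    n_snps = len(genotypes[0])
    # Allele frequency of each SNP, estimated from this sample.
    freqs = [
        sum(person[i] for person in genotypes) / (2 * n_people)
        for i in range(n_snps)
    ]
    matrix = [[0.0] * n_people for _ in range(n_people)]
    for j in range(n_people):
        for k in range(n_people):
            total = 0.0
            for i, p in enumerate(freqs):
                centred_j = genotypes[j][i] - 2 * p
                centred_k = genotypes[k][i] - 2 * p
                total += centred_j * centred_k / (2 * p * (1 - p))
            matrix[j][k] = total / n_snps
    return matrix


# Made-up allele counts for four people at three SNPs.
toy_genotypes = [
    [0, 1, 2],
    [0, 1, 2],  # identical to the first person
    [2, 1, 0],
    [1, 2, 1],
]

grm = relationship_matrix(toy_genotypes)
for row in grm:
    print([round(value, 3) for value in row])
```

Each row of the result is one individual's relatedness to everyone in the sample, so four people give a 4 × 4 matrix.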

26
Q

When should individuals be excluded from the analysis?

A

Exclude one from each pair of individuals who share >2.5% of their segregating genes (estimated relatedness above 0.025).
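
The exclusion rule can be sketched as a simple greedy filter over a relatedness matrix. This is an illustrative assumption about how to apply the cutoff, not a description of how any particular program chooses which member of a pair to drop; the matrix values are made up.

```python
def drop_related(matrix, cutoff=0.025):
    """Return indices of individuals to keep, dropping one from each related pair."""
    removed = set()
    n = len(matrix)
    for j in range(n):
        if j in removed:
            continue
        for k in range(j + 1, n):
            # If j and k are more related than the cutoff, drop the later one.
            if k not in removed and matrix[j][k] > cutoff:
                removed.add(k)
    return [i for i in range(n) if i not in removed]


# Made-up relatedness values: persons 0 and 1 look like close relatives.
toy_matrix = [
    [1.00, 0.50, 0.01],
    [0.50, 1.00, 0.00],
    [0.01, 0.00, 1.00],
]

kept_people = drop_related(toy_matrix)
print(kept_people)
```

Only the off-diagonal values matter here, since the diagonal is each person's relatedness to themselves.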

27
Q

What sort of model is used to analyse the data from a GCTA study?

A

A random effects model fitted with restricted maximum likelihood (REML) to estimate the genetic effects on the phenotype

28
Q

What results do GCTA studies find for heritability of physical and cognitive traits compared to twin analysis?

A

Slightly lower than twin estimates, but much more consistent with them than GWAS results are (e.g. height = .5 from GCTA compared with .8 from twin studies)

29
Q

What about GCTA and psychopathologies?

A

Promising, but estimates are still low (e.g. schizophrenia = .2 when twin studies suggest it should be about .8)

30
Q

How about GCTA and behavioural traits?

A

Very poor. Nothing for depression, and less than .05 for anxiety (compared with .4 from twin studies)

31
Q

Why may behavioural traits still not be finding results?

A

More non-additive genetic variants (masked by inflation of twin estimates), rare variants, and gene-environment interactions

Plus, for cognitive traits there may be more additive genetic variants.

32
Q

What else can GCTA do? (4 points)

A

Estimate SNP heritability where there are no twin studies

Cross trait/disorder analyses

Gene-environment correlation

Gene-environment interaction

33
Q

Give an example of a GCTA study looking into SNP heritability where there were no twin studies.

A

Response to antidepressants is considered to be affected by genetic variation

There are family studies but there are no twin studies

Research found that common genetic variants explain 42% of individual differences in antidepressant response.

34
Q

What are some of the limitations of GCTA?

A
Very sensitive to population stratification and relatedness

Still doesn’t capture rare variants and variants not captured by SNPs

Assumes that all variants have an equal effect and are additive

Are GCTA results biased? Only time will tell.

35
Q

Discuss GCTA and gene-environment interaction

A

We can model G×E (a relatedness-by-environment interaction) in the random effects portion of the model

Some promising results but still early stages

36
Q

How does including related individuals in a GCTA bias the results?

A

GCTA aims to estimate the genetic variance captured by SNPs. If individuals are related, phenotypic correlations driven by shared environmental factors may inflate the genetic variance estimates.

37
Q

What does GRM stand for?

A

Genetic relationship matrix