How to handle the data from studies of complex disease Flashcards

1
Q

What does parametric linkage analysis determine?

A

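The chromosomal location of the genetic determinants of a disease, found by testing whether the disease co-segregates with genetic markers within families.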

2
Q

How is a parametric linkage analysis set up?

A

Ascertain (a small set of) large families (pedigrees) each containing a number of affected individuals

Use a genotyping technique to measure the alleles (genotype) at one or more loci, in as many individuals as are available

Examine the co-segregation (co-transmission) of disease phenotype and alleles at the genetic marker loci

3
Q

What is genetic distance measured in?

A

Morgans (M) or centimorgans (cM), where 1 M = 100 cM

4
Q

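What is the connection between Morgans and recombination?

A

Genetic distance is defined in terms of recombination: for small distances, 1 cM corresponds to roughly a 1% chance of recombination between two loci per meiosis. The recombination rate between alleles at two loci is closely related to, but not the same as, the physical distance between them.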
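
As a rough illustration of how map distance relates to the recombination fraction θ, the sketch below uses Haldane's map function, which assumes crossovers occur independently (no interference). The choice of map function and the name `haldane_theta` are assumptions made for this example; the cards do not name a map function.

```python
import math

def haldane_theta(distance_morgans: float) -> float:
    """Recombination fraction under Haldane's map function (no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))

# Small distances give theta close to the distance itself;
# large distances give theta approaching 0.5 (unlinked).
for cm in (1, 10, 50, 200):
    theta = haldane_theta(cm / 100.0)
    print(f"{cm:>3} cM -> theta = {theta:.4f}")
```

Under this assumption θ never exceeds 0.5, however far apart the loci are.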

5
Q

What symbol represents the probability of recombination between loci?

A

θ (Theta)

6
Q

What is the range of θ?

A

0 to 0.5

7
Q

What is the value of θ when the loci lie close together?

A

θ is small (close to 0) and the loci are said to be tightly linked; at θ = 0 they are completely linked.

8
Q

What is the value of θ when the loci are further apart?

A

θ approaches 0.5

Loci are said to be unlinked (alleles at the two loci are transmitted independently)

9
Q

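What is the likelihood ratio test?

A

A test that compares the likelihood of the observed genotype and phenotype data in a set of families under two competing hypotheses (here, linkage versus no linkage). The likelihoods are calculated using a computer program.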

10
Q

What does the likelihood ratio depend on?

A

How well the observations match the assumed model

11
Q

What is a LOD score?

A

The log (base 10) of the odds, i.e. of the likelihood ratio. It is the test statistic used when testing for linkage with a likelihood ratio test.

12
Q

What does the LOD score test for?

A

Tests the null hypothesis that the disease locus lies far away from the genotyped marker locus.

13
Q

What is the null hypothesis in a LOD score test?

A

θ = 0.5 (unlinked)

14
Q

How is the likelihood ratio calculated in parametric linkage analysis?

A

LRmax = L(θ̂) / L(0.5)

15
Q

What is L(θ̂)?

A

θ̂ is the value of θ that maximises the likelihood (makes the data ‘most likely’ to have occurred). L(θ̂) is the likelihood evaluated at that value, i.e. the maximised likelihood.

16
Q

How is the LOD score calculated from the likelihood ratio?

A

It is the log base 10 of the likelihood ratio: LOD = log10(LRmax).
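
A minimal worked sketch of this calculation, assuming the simplest textbook case of phase-known, fully informative meioses, where the likelihood is L(θ) = θ^R (1 − θ)^(N − R) for R recombinants out of N meioses. The function names and the counts are hypothetical; real pedigree likelihoods are computed by dedicated software.

```python
import math

def likelihood(theta: float, recombinants: int, meioses: int) -> float:
    """Likelihood of the data at a given theta (phase-known meioses assumed)."""
    non_recombinants = meioses - recombinants
    return (theta ** recombinants) * ((1.0 - theta) ** non_recombinants)

def lod_score(recombinants: int, meioses: int) -> float:
    """LOD = log10( L(theta_hat) / L(0.5) ), with theta_hat capped at 0.5."""
    theta_hat = min(recombinants / meioses, 0.5)
    lr_max = (likelihood(theta_hat, recombinants, meioses)
              / likelihood(0.5, recombinants, meioses))
    return math.log10(lr_max)

# Hypothetical data: 0 recombinants observed in 10 informative meioses.
print(round(lod_score(0, 10), 2))
```

With ten non-recombinant meioses the likelihood ratio is 2^10 = 1024, so the LOD score is just over 3.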

17
Q

What is considered a “Convincing” LOD score as evidence for linkage?

A

A LOD score of 3 or more.

18
Q

Why is 3 a “Convincing” LOD score?

A

It corresponds to a likelihood ratio of 1000 (10^3).

The data are 1000 times more likely under the alternative hypothesis than under the null hypothesis.

19
Q

How do we find the max LOD score?

A

Multipoint analysis.

We calculate the likelihood (or likelihood ratio) at different values of θ and take the value that gives the highest LOD score.
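
The idea of evaluating the likelihood ratio over a range of θ values can be sketched as a simple grid search. This assumes a simplified phase-known likelihood, L(θ) = θ^R (1 − θ)^(N − R); the counts, the grid spacing and the function names are hypothetical.

```python
import math

def lod_at(theta: float, recombinants: int, meioses: int) -> float:
    """LOD score at a fixed theta, relative to theta = 0.5 (unlinked)."""
    log_l = 0.0
    if recombinants:
        log_l += recombinants * math.log10(theta)
    if meioses - recombinants:
        log_l += (meioses - recombinants) * math.log10(1.0 - theta)
    log_l_null = meioses * math.log10(0.5)
    return log_l - log_l_null

def max_lod(recombinants: int, meioses: int, steps: int = 50):
    """Scan theta over [0, 0.5] on an even grid; return (best_theta, best_lod)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    # Skip theta = 0 when recombinants were observed (the likelihood is zero there).
    grid = [t for t in grid if t > 0 or recombinants == 0]
    scored = ((t, lod_at(t, recombinants, meioses)) for t in grid)
    return max(scored, key=lambda pair: pair[1])

# Hypothetical data: 2 recombinants in 20 informative meioses.
theta_hat, lod = max_lod(recombinants=2, meioses=20)
print(f"theta_hat = {theta_hat:.2f}, max LOD = {lod:.2f}")
```

Plotting the LOD score against the grid values would give the familiar LOD curve, with its peak at the estimate of θ.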

20
Q

How is multipoint analysis carried out in theory?

A

Use a set of marker loci whose genetic map positions are known, and assess the evidence for the disease locus lying at different positions along the genetic map.

21
Q

What does the LOD score at each position in a multipoint analysis correspond to?

A

The log base 10 of a likelihood ratio: the likelihood of the data assuming the disease locus lies at that position, divided by the likelihood of the data assuming the disease locus lies far away.

22
Q

How is multipoint analysis carried out in practice?

A

Using a computer program.

23
Q

What kinds of programs carry out multipoint analysis?

A

Merlin (smallish pedigrees, exact calculation)

SIMWALK or MORGAN (larger pedigrees, approximate calculation)

24
Q

What happens once you have your LOD score graph?

A

You narrow down the candidate region around the LOD score peak, step by step, until you can pinpoint the disease locus.

25
Q

What happens when a disease is genetically heterogeneous?

A

Only a proportion (α) of families is assumed to show linkage to the locus being tested.

26
Q

What is the HLOD score?

A

The heterogeneity LOD score. When a disease is genetically heterogeneous, α is estimated along with θ by maximum likelihood, and the LOD score is maximised over both.
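
A minimal sketch of the idea, assuming the standard admixture formulation in which each family is linked with probability α, so that HLOD(α, θ) = Σ log10(α · LR_i(θ) + 1 − α) over families. The per-family likelihood ratio uses a simplified phase-known model, and the family counts, function names and grid search are all hypothetical.

```python
import math

def family_lr(theta, recombinants, meioses):
    """Likelihood ratio L(theta) / L(0.5) for one family (phase-known meioses)."""
    linked = (theta ** recombinants) * ((1 - theta) ** (meioses - recombinants))
    return linked / (0.5 ** meioses)

def hlod(alpha, theta, families):
    """Heterogeneity LOD: each family is linked with probability alpha."""
    total = 0.0
    for recombinants, meioses in families:
        mix = alpha * family_lr(theta, recombinants, meioses) + (1 - alpha)
        if mix <= 0:
            return float("-inf")  # data impossible under these parameter values
        total += math.log10(mix)
    return total

def max_hlod(families, steps=50):
    """Grid search over alpha in [0, 1] and theta in [0, 0.5]."""
    best = (0.0, 0.5, 0.0)  # (alpha, theta, HLOD); HLOD is 0 when alpha = 0
    for i in range(steps + 1):
        alpha = i / steps
        for j in range(steps + 1):
            theta = 0.5 * j / steps
            value = hlod(alpha, theta, families)
            if value > best[2]:
                best = (alpha, theta, value)
    return best

# Hypothetical (recombinants, meioses) per family: two look linked, two do not.
families = [(0, 8), (1, 10), (4, 8), (5, 10)]
alpha_hat, theta_hat, score = max_hlod(families)
print(f"alpha = {alpha_hat:.2f}, theta = {theta_hat:.2f}, HLOD = {score:.2f}")
```

Because α = 1 is one of the values searched, the maximised HLOD in this sketch can never be lower than the ordinary LOD score maximised over θ alone.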

27
Q

How successful have parametric linkage analysis studies been for monogenic disease?

A

Highly successful

28
Q

How successful have parametric linkage analysis studies been for complex disease?

A

Less successful

29
Q

What is the purpose of non-parametric linkage analysis?

A

Tries to determine whether members of a family with “similar” trait values tend to share genetic material in common from their common ancestors.

29
Q

What are the aims of association studies?

A

Directly examine the association (correlation) between alleles present at a genetic locus and a phenotype of interest.

29
Q

What is the most popular type of association study?

A

Case/control study (unrelated individuals)

30
Q

How are association studies set up?

A

Collect sample of affected individuals (cases) and unaffected individuals (controls)

Examine the correlation between alleles present at a genetic locus and presence/absence of disease by comparing the distribution of genotypes in affected individuals with that seen in controls.

31
Q

Why is parametric linkage analysis more difficult for genetically heterogeneous diseases?

A

We can’t assume that all families share the same genetic cause, and therefore the same disease locus.

32
Q

How do you test for association (correlation) between genotype and presence/absence of disease in a case/control study?

A

Using a standard χ² test for independence on 2 df (a 2 × 3 table of case/control status by genotype).

33
Q

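What is the χ² test for independence?

A

χ² = Σ (Observed − Expected)² / Expected, summed over all cells of the table. The statistic is then compared with the χ² distribution to obtain a p value.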
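
A short sketch of the test on a 2 × 3 table of genotype counts. The counts and the function name `chi_square_2x3` are hypothetical. The p value uses the fact that, for 2 degrees of freedom, the upper tail of the χ² distribution is exp(−x / 2), so no statistics library is needed here.

```python
import math

def chi_square_2x3(cases, controls):
    """Chi-square test of independence for genotype counts (AA, Aa, aa)."""
    table = [cases, controls]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square tail probability is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value

# Hypothetical genotype counts in the order (AA, Aa, aa).
stat, p = chi_square_2x3(cases=[30, 50, 20], controls=[50, 40, 10])
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

The expected counts come from the row and column totals, which is what "independence" between genotype and disease status means in this test.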

34
Q

What are more sophisticated ways to perform an association test?

A

Rearrange your data to test specifically for dominant or recessive effects.

Use linear regression for quantitative outcomes

Use an x variable defined according to genotype (e.g. the number of copies, 0, 1 or 2, of a given allele)

35
Q

What is the null hypothesis for linear regression of an association test?

A

Slope = 0
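
A minimal sketch of the regression approach for a quantitative outcome, assuming the x variable counts copies of one allele (0, 1 or 2). The data and the function name `genotype_regression` are hypothetical, and the code assumes the fit is not perfect (a residual sum of squares of zero would make the standard error zero).

```python
import math

def genotype_regression(genotypes, trait):
    """Least-squares fit of trait = intercept + slope * genotype."""
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, trait))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - intercept - slope * x) ** 2
                      for x, y in zip(genotypes, trait))
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    # Under the null hypothesis (slope = 0) this follows a t distribution on n - 2 df.
    t_statistic = slope / std_error
    return slope, intercept, t_statistic

# Hypothetical data: allele counts and a quantitative trait for nine people.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
trait = [1.0, 1.2, 0.9, 1.6, 1.4, 1.5, 2.1, 1.9, 2.0]
slope, intercept, t_stat = genotype_regression(genotypes, trait)
print(f"slope = {slope:.3f}, t = {t_stat:.2f}")
```

A slope far from zero relative to its standard error is evidence against the null hypothesis of no association.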

36
Q

What are FBATs?

A

Family-based association tests

37
Q

What is the TDT?

A

The transmission disequilibrium test.

38
Q

What are LMMs?

A

Linear mixed models

39
Q

How do you analyse family-based data?

A

Use family-based association tests (FBATs).

40
Q

What kinds of family-based association tests (FBATs) are there?

A

The transmission disequilibrium test (TDT)

Linear mixed models (LMMs)

41
Q

What kinds of software are used to analyse GWAS data?

A

PLINK, SNPTEST, GCTA

42
Q

Why are stringent significance levels required in GWAS?

A

To overcome the multiple testing problem incurred when we test many SNPs throughout the genome.
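
To make the multiple testing problem concrete, the sketch below uses a Bonferroni correction, which is one common way of setting a stringent threshold; the cards do not say which correction is used, so treat this as an assumption. The number of SNPs and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def expected_false_positives(per_test_alpha: float, n_tests: int) -> float:
    """Expected number of null SNPs passing the threshold purely by chance."""
    return per_test_alpha * n_tests

n_snps = 1_000_000  # hypothetical number of (effectively independent) tests

# Without correction, many SNPs pass at the 0.05 level by chance alone.
print(expected_false_positives(0.05, n_snps))

# With correction, each SNP must reach a much smaller p value.
print(bonferroni_threshold(0.05, n_snps))
```

With a million tests the corrected level is in the region of 5 × 10⁻⁸, which matches the threshold commonly quoted for genome-wide significance.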

43
Q

What quality control is required when using GWAS?

A

Discard samples (people) deemed unreliable

Discard data from SNPs deemed unreliable

44
Q

What could make a sample be deemed unreliable?

A

Low genotype call rates (unsuccessful genotyping)

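Excess heterozygosity (may indicate a mix of samples, i.e. contamination)

Sex or ethnicity that does not match the recorded information or the rest of the sample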

45
Q

What could make a SNP be deemed unreliable?

A

Low genotype call rates

Mendelian inheritance errors (in family data)

Departure from Hardy-Weinberg equilibrium

Low minor allele frequency (MAF): such SNPs are usually excluded because, with so few copies of the minor allele, it is hard to compare cases with controls.
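
The SNP-level checks above can be sketched as a simple filter on genotype counts. All thresholds here are illustrative defaults, not recommendations, and the function name `snp_qc` is hypothetical. The Hardy-Weinberg check uses a 1-df χ² test, whose tail probability is erfc(√(x / 2)); Mendelian inheritance errors are not covered because they need family data. The code assumes at least one genotype was called.

```python
import math

def snp_qc(n_aa, n_ab, n_bb, n_missing,
           min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Return the reasons a SNP fails QC (an empty list means it passes)."""
    reasons = []
    n_called = n_aa + n_ab + n_bb
    call_rate = n_called / (n_called + n_missing)
    if call_rate < min_call_rate:
        reasons.append("low call rate")

    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)
    if maf < min_maf:
        reasons.append("low MAF")

    if 0 < freq_a < 1:
        # Expected genotype counts under Hardy-Weinberg proportions.
        expected = (n_called * freq_a ** 2,
                    2 * n_called * freq_a * (1 - freq_a),
                    n_called * (1 - freq_a) ** 2)
        chi2 = sum((o - e) ** 2 / e
                   for o, e in zip((n_aa, n_ab, n_bb), expected))
        hwe_p = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 df
        if hwe_p < min_hwe_p:
            reasons.append("Hardy-Weinberg departure")
    return reasons

# Hypothetical genotype counts (AA, AB, BB, missing) for four SNPs.
print(snp_qc(250, 500, 250, 0))
print(snp_qc(200, 400, 200, 200))
print(snp_qc(990, 10, 0, 0))
print(snp_qc(500, 0, 500, 0))
```

In a case/control study the Hardy-Weinberg check is often applied to controls only, since a true association can itself distort genotype proportions in cases.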