Lecture 9 Flashcards
What is a type I error?
(Alpha, related to the significance level) When H0 is true but is rejected, giving a false positive.
What is a type II error?
(Beta) When H0 is false but is not rejected, giving a false negative.
How can we control the number of type I errors?
By controlling our significance level alpha!
What are the two main groups of methods for accounting for multiple tests?
Adjustment of p-values and determination of p-values
What is involved with adjustment of p-values?
Controlling the family-wise type I error rate (FWER)
Controlling the false discovery rate (FDR)
What is involved with determination of p-values?
Permutation test
Define Per-comparison error rate in equation form.
PCER=E(V)/m
Define family-wise error rate in equation form.
FWER=P(V≥1)= 1-P(V=0)
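For intuition about why FWER grows with the number of tests, here is a minimal Python sketch. It assumes m independent tests, every null hypothesis true, and each test run at level alpha (assumptions not stated on the card), so that P(V=0) = (1-alpha)^m. The function name fwer_independent is illustrative only.

```python
def fwer_independent(alpha, m):
    """FWER = 1 - P(V = 0), assuming m independent tests with all nulls true."""
    # Each test avoids a false positive with probability (1 - alpha),
    # so all m tests avoid one with probability (1 - alpha) ** m.
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    for m in (1, 10, 100):
        print(m, round(fwer_independent(0.05, m), 4))
```

Under these assumptions, the chance of at least one false positive rises quickly with m, which is the motivation for the adjustments covered in the later cards.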
Define False discovery rate in equation form.
FDR=E(V/R|R>0)*P(R>0)
Define Proportion of false positives
PFP=E(V)/E(R)
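The cards above use V and R without defining them. A small Python sketch, assuming the usual meaning (V = number of true null hypotheses that were rejected, R = total number of rejections; m = number of tests). The helper count_errors and its inputs are hypothetical, and in real data the truth labels are of course unknown.

```python
def count_errors(null_is_true, rejected):
    """Return (V, R, V/R) for one set of test outcomes.

    null_is_true[i] is True when H0 really holds for test i (assumed known here);
    rejected[i] is True when test i rejected H0.
    """
    V = sum(1 for truth, rej in zip(null_is_true, rejected) if truth and rej)
    R = sum(1 for rej in rejected if rej)
    # V/R is taken as 0 when nothing is rejected, matching the P(R>0) factor in FDR.
    proportion = V / R if R > 0 else 0.0
    return V, R, proportion


if __name__ == "__main__":
    print(count_errors([True, True, False, False], [True, False, True, True]))
```

Averaging V over many repetitions and dividing by m would estimate PCER, the fraction of repetitions with V ≥ 1 would estimate FWER, and the average of the third value would estimate FDR.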
Which ‘rate’ is related to all tests?
FWER
What ‘rate’ is related to only rejected hypotheses?
FDR
Define single step procedures.
Equivalent adjustments are performed for all hypotheses
What are examples of single step procedures?
Bonferroni and Sidak adjustments
Define stepwise procedures
Adjustments based not only on m but also on the outcomes of all the tests
Give examples of stepwise procedures
Benjamini and Hochberg adjustment
What procedures/methods control FWER?
Single-step procedures: Bonferroni and Sidak
What procedures/methods control FDR?
Stepwise procedures: Benjamini and Hochberg
Explain the Bonferroni correction.
Rejects any null hypothesis Hj whose p-value is less than or equal to alpha/m, where m is the number of tests.
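A minimal Python sketch of the Bonferroni rule stated above. The function name bonferroni_reject and the example p-values are made up for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one True/False per test: True when p <= alpha / m."""
    m = len(p_values)
    threshold = alpha / m  # the same cutoff is applied to every hypothesis
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # With m = 4 and alpha = 0.05 the cutoff is 0.0125.
    print(bonferroni_reject([0.001, 0.02, 0.04, 0.3]))  # [True, False, False, False]
```

The cutoff depends only on m, not on the observed p-values, which is what makes this a single-step procedure.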
What are some characteristics of the Bonferroni correction?
Strong control of FWER at level alpha
Suitable for situations where no type I error is tolerated
Not well suited when several H0 are not true (or several discoveries are expected)
Low power for detection
Finish this statement: The more you control type I error….
The less power you will have.
Explain the Sidak correction.
Rejects any hypothesis Hj with p-value less than or equal to 1-(1-alpha)^(1/m)
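A minimal Python sketch of the Sidak cutoff, compared with the Bonferroni cutoff alpha/m. The function name sidak_threshold is illustrative. The cutoff comes from solving 1-(1-t)^m = alpha for t, which assumes independent tests.

```python
def sidak_threshold(alpha, m):
    """Per-test cutoff 1 - (1 - alpha) ** (1 / m)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


if __name__ == "__main__":
    alpha, m = 0.05, 10
    print("Sidak cutoff:     ", sidak_threshold(alpha, m))
    print("Bonferroni cutoff:", alpha / m)
```

For m > 1 the Sidak cutoff is slightly larger than alpha/m, so the two corrections give very similar decisions.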
What are some characteristics of the Sidak correction?
Very similar to the Bonferroni adjustment
Both are too conservative for our application (mapping)
These methods do not take into account dependence between tests (linked markers or correlated traits)
What is a disadvantage of the Bonferroni and Sidak corrections?
These methods do not take into account dependence between tests (linked markers or correlated traits)
What are methods that control FDR based on?
Number of rejections, not only number of tests
What are methods that control FWER based on?
The number of tests only
What are the general characteristics of FDR controlling methods? What does that mean for this course?
They provide weak control of FWER but have a higher power, while controlling FDR. This means we can use them for mapping!
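What did Benjamini and Hochberg prove in 1995?
That the FDR can be controlled at some level q by determining the largest i for which P(i) ≤ (i/m)q, where P(1) ≤ … ≤ P(m) are the ordered p-values, and rejecting the hypotheses H(1), …, H(i).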
What does the Benjamini and Hochberg adjustment assume?
That tests are independent
Explain how the Benjamin and Hochberg adjustment works.
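Consider testing m null hypotheses Hi (i=1,…,m) with corresponding computed p-values P1, P2, …, Pm sorted in ascending order, and denote by Hi the null hypothesis corresponding to Pi. Find the largest i for which Pi ≤ (i/m)q and reject H1, …, Hi. If no such i exists, reject nothing.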
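A minimal Python sketch of the step-up rule described above: sort the p-values, find the largest rank i with P(i) ≤ (i/m)q, and reject every hypothesis up to that rank. The function name benjamini_hochberg_reject and the example p-values are illustrative only.

```python
def benjamini_hochberg_reject(p_values, q=0.05):
    """Return one True/False per test, in the original order of p_values."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank (1-based) whose p-value meets its own cutoff (rank / m) * q.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank

    # Reject every hypothesis ranked at or below largest_rank.
    reject = [False] * m
    for idx in order[:largest_rank]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    print(benjamini_hochberg_reject([0.01, 0.04, 0.03, 0.02], q=0.05))
```

In this example all four hypotheses are rejected, whereas a Bonferroni cutoff of 0.05/4 = 0.0125 would reject only the first. The cutoff for each rank depends on the ordered p-values, which is why the procedure counts as stepwise.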
Who proposed permutation tests? What was the method for?
Churchill and Doerge in 1994
Proposed a method to empirically estimate FWER rejection thresholds
What is Churchill and Doerge’s method based on?
Permutation tests for simulated data
What assumptions do Churchill & Doerge's permutation tests make about the distribution of the test statistics under the null hypothesis?
None. The distribution is evaluated empirically, so its form is not important.
What are the three steps of the permutation test algorithm?
1: Randomly shuffle the observed phenotypes over individuals (marker genotypes)
2: Repeat #1 many times (resampling)
3: Evaluate the empirical distribution of the test statistics under the null hypothesis generated above to determine the threshold levels for CWER and FWER
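The three steps above can be sketched in Python as follows. Several choices here are assumptions for illustration and are not taken from the cards: genotypes are coded 0/1, the test statistic is the absolute difference in mean phenotype between the two genotype classes, the CWER threshold is taken per marker, and the FWER threshold is taken from the maximum statistic across markers in each shuffle. All function names are hypothetical.

```python
import random
from statistics import mean


def marker_statistic(genotypes, phenotypes):
    """Absolute difference in mean phenotype between genotype classes 0 and 1."""
    group0 = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    group1 = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    if not group0 or not group1:
        return 0.0
    return abs(mean(group1) - mean(group0))


def upper_quantile(values, alpha):
    """Approximate empirical (1 - alpha) quantile of a list of numbers."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(round((1 - alpha) * len(ordered))))
    return ordered[index]


def permutation_thresholds(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Return (per-marker CWER thresholds, single FWER threshold)."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)  # copy, so the caller's data is untouched
    per_marker = [[] for _ in markers]
    max_stats = []
    for _ in range(n_perm):  # step 2: repeat many times
        rng.shuffle(shuffled)  # step 1: shuffle phenotypes over individuals
        stats = [marker_statistic(g, shuffled) for g in markers]
        for history, s in zip(per_marker, stats):
            history.append(s)
        max_stats.append(max(stats))
    # Step 3: thresholds from the empirical null distributions.
    cwer = [upper_quantile(history, alpha) for history in per_marker]
    fwer = upper_quantile(max_stats, alpha)
    return cwer, fwer


if __name__ == "__main__":
    markers = [[0, 1] * 10, [0] * 10 + [1] * 10]
    phenotypes = [float(i) for i in range(20)]
    print(permutation_thresholds(markers, phenotypes, n_perm=200))
```

An observed statistic larger than the FWER threshold would be declared significant across all markers, while the per-marker thresholds apply to a single comparison.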
Explain the first step of the permutation test algorithm in detail. (like a short answer)
Randomly shuffle the observed phenotypes over individuals (marker genotypes). This gives a sample with the original marker genotypes but with the phenotypic values randomly assigned, which provides a sample of the test statistics under the null hypothesis of no marker-trait association. Shuffling breaks the association between markers and phenotypes (there is no expected difference between the means of marker alleles).
What are the three steps of a permutation test algorithm for models with polygenic effect?
1: Randomly shuffle the observed GENOTYPES over individuals (keeping the phenotype-polygenes relationship intact). This gives a sample with the original phenotypic values but with the genotypes randomly assigned.
2: Repeat #1 many times (resampling)
3: Evaluate the empirical distribution of the test statistics under the null hypothesis generated above to determine the threshold levels for CWER and FWER
What is the difference between a normal permutation test algorithm and one for models with polygenic effects?
Normal:
-Shuffles phenotypes over marker genotypes
-Sample is original marker genotypes with phenotypic values randomly assigned
Polygenic:
-Shuffles genotypes over individuals, keeping the phenotype-polygenes relationship intact
-Sample is original phenotypic values with genotypes randomly assigned
What number of resamplings are suggested by Churchill and Doerge to be sufficient at the 5% and 1% significance levels, respectively?
1,000 resamplings for 5%
10,000 may be needed for 1%
What do permutation tests account for?
Missing markers and differences in density of markers
What is a disadvantage to the permutation test?
It is very time-consuming.