Multiple testing Flashcards
Causes of false positives (3)
Publication bias
Confirmation bias
Optimism bias (we only get to see selected results; "it won't happen to ME")
Main question: how selected is the result we see?
Why is there so much unreproducible research?
Variability + selection = optimism bias
Variability in 1) statistical estimates 2) confidence intervals 3) p-values
1) individual estimates are sometimes too high or too low
2) a 95% confidence interval captures the true population value 95% of the time, so it misses 5% of the time
3) a small p-value may also occur when a true null hypothesis is tested (a 5% chance of p < 0.05 every time)
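A minimal simulation sketch of this variability, assuming normally distributed data with known standard deviation 1, a true mean of 0 (so the null hypothesis is true), and a two-sided z-test. The function names, sample size, and number of studies are illustrative choices, not part of the notes.

```python
import math
import random


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF evaluated at |z| via the error function.
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)


def simulate_null_studies(n_studies=10000, n=30, seed=1):
    """Repeat a study in which the null hypothesis is true (assumed setup)."""
    rng = random.Random(seed)
    false_positives = 0
    covered = 0
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < 0.05:
            false_positives += 1
        # 95% confidence interval for the mean with known sigma = 1.
        mean = sum(sample) / n
        half_width = 1.96 / math.sqrt(n)
        if mean - half_width <= 0.0 <= mean + half_width:
            covered += 1
    return false_positives / n_studies, covered / n_studies


false_positive_rate, coverage = simulate_null_studies()
print(f"share of p < 0.05 under a true null: {false_positive_rate:.3f}")
print(f"share of 95% intervals covering the true mean: {coverage:.3f}")
```

Under these assumptions the first share should land near 0.05 and the second near 0.95.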
why is there selection in research?
results of multiple analyses are not presented equally: the more interesting results are singled out, and the more interesting results are typically the more extreme ones
selection in 1) estimates 2) confidence intervals 3) p-values
1) selected estimates are often extreme
2) interesting (surprising) intervals are more likely not to cover the true value
3) selected results are likely to have small p-values (even when the effect is not real)
variability + selection = optimism bias
var: results of statistical methods sometimes look better or worse than they should
sel: researchers tend to emphasize good results
OB: selected results are often the results that look better than they are
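The combination can be sketched with a small simulation. It assumes 20 independent analyses per study, each producing a standardized estimate (z-score) with no real effect behind it, and compares a pre-specified analysis with the most extreme one. All names and numbers here are hypothetical.

```python
import random


def simulate_selection(n_studies=5000, n_analyses=20, seed=2):
    """Compare a pre-specified analysis with the most extreme analysis.

    Every z-score is drawn under the null hypothesis (assumed setup).
    """
    rng = random.Random(seed)
    prespecified, selected = [], []
    for _ in range(n_studies):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_analyses)]
        prespecified.append(abs(z[0]))            # chosen before seeing data
        selected.append(max(abs(v) for v in z))   # chosen because it looks best

    def summary(values):
        mean_abs = sum(values) / len(values)
        share_significant = sum(v > 1.96 for v in values) / len(values)
        return mean_abs, share_significant

    return summary(prespecified), summary(selected)


(pre_mean, pre_sig), (sel_mean, sel_sig) = simulate_selection()
print(f"pre-specified: mean |z| = {pre_mean:.2f}, share significant = {pre_sig:.2f}")
print(f"selected:      mean |z| = {sel_mean:.2f}, share significant = {sel_sig:.2f}")
```

In this setup the selected result is both more extreme and far more often "significant" than the pre-specified one, although no analysis has a real effect.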
Double use of data is wrong because...
Selection is made on the basis of the same data that are then used for the analysis
Invalidates assumptions of statistical methods
3 remedies for optimism bias
protocolling (a pre-specified statistical analysis plan)
training sets
statistical corrections
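A rough sketch of the training-set idea, assuming each of 20 analyses has no real effect and yields one standardized estimate on a training half and an independent one on a held-out half. The selection is made on the training half only. The function name and all settings are illustrative.

```python
import random


def split_sample_demo(n_studies=5000, n_analyses=20, seed=3):
    """Select the most extreme analysis on training data, then re-estimate it
    on independent held-out data (all effects are truly zero here)."""
    rng = random.Random(seed)
    train_total = 0.0
    test_total = 0.0
    for _ in range(n_studies):
        train = [rng.gauss(0.0, 1.0) for _ in range(n_analyses)]
        test = [rng.gauss(0.0, 1.0) for _ in range(n_analyses)]
        # Selection uses the training half only.
        best = max(range(n_analyses), key=lambda i: abs(train[i]))
        train_total += abs(train[best])
        test_total += abs(test[best])
    return train_total / n_studies, test_total / n_studies


train_mean, test_mean = split_sample_demo()
print(f"selected analysis, training half: mean |z| = {train_mean:.2f}")
print(f"same analysis, held-out half:     mean |z| = {test_mean:.2f}")
```

The held-out estimate of the selected analysis behaves like an ordinary unselected estimate, because the data used to evaluate it played no part in choosing it.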
3 methods for familywise error control
Bonferroni, Holm, Shaffer (all correction methods for p-values)
adjust p-values to counteract the downward bias due to selection
advantage: easy
disadvantage: only for p-values
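As a minimal sketch, the Bonferroni and Holm adjustments can be written in a few lines. The function names and the example p-values are made up for illustration. Shaffer's method, which additionally exploits logical relations between the hypotheses, is not sketched here.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def holm(p_values):
    """Holm (step-down) adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted p-values may not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


example = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
print("Bonferroni:", bonferroni(example))
print("Holm:      ", holm(example))
```

For these four p-values Bonferroni gives roughly 0.04, 0.16, 0.12, 0.02 and Holm gives roughly 0.03, 0.06, 0.06, 0.02. Holm is never more conservative than Bonferroni and controls the same familywise error rate.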
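familywise error?
When p-values are computed for many null hypotheses, at least one of them will probably be a false positive; the familywise error rate is the probability of at least one false positive in the whole family of tests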
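A short worked calculation, assuming the tests are independent and all null hypotheses are true: the chance of at least one false positive among m tests at level alpha is then 1 - (1 - alpha)^m. The function name is illustrative.

```python
def familywise_error_rate(m, alpha=0.05):
    """Probability of at least one false positive among m independent tests
    of true null hypotheses, each performed at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


for m in (1, 5, 20, 100):
    print(f"{m:>3} tests: familywise error rate = {familywise_error_rate(m):.3f}")
```

With 20 such tests the rate is already about 0.64, which is why uncorrected p-values become unreliable once many hypotheses are tested.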
data types with repeated measures (3)
Dose response data = repeated measurements for each patient at all doses of interest
Paired data = measurements on two or more body parts of the same patient
Clustered data = multiple measurements within a short period in one patient (for higher accuracy)
incorrect analyses of repeated measurements (3)
ignoring dependence (treating repeated measurements as independent), ignoring grouping, separate test for each repetition
simple solutions for repeated measurements:
- derived summaries, for example:
- single endpoint (lung function at 9 months)
- change score (last - first measure)
- average score
- individual trend (individual regression model, or two step model)
- time to particular level
- area under the curve
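The summaries listed above can be sketched for a single patient as follows. The data are hypothetical, the area under the curve uses the trapezoidal rule, and "time to particular level" is taken as the first observed time at or above the level, without interpolation. These are assumptions made for the example.

```python
def derived_summaries(times, values, level):
    """Reduce one patient's repeated measurements to single numbers."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    # Least-squares slope of value against time (the individual trend).
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    # Trapezoidal rule for the area under the curve.
    points = list(zip(times, values))
    auc = sum((t1 - t0) * (y0 + y1) / 2
              for (t0, y0), (t1, y1) in zip(points, points[1:]))
    # First observed time at which the value reaches the level (no interpolation).
    time_to_level = next((t for t, y in points if y >= level), None)
    return {
        "endpoint": values[-1],
        "change": values[-1] - values[0],
        "average": mean_y,
        "slope": sxy / sxx,
        "time_to_level": time_to_level,
        "auc": auc,
    }


# Hypothetical lung function measured at 0, 3, 6 and 9 months.
summary = derived_summaries([0, 3, 6, 9], [60, 66, 70, 76], level=70)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Each patient then contributes one number per summary, so a standard test for independent observations can be applied to those numbers.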
individual trend regression: simple vs two-step
- simple: regression line per patient, interpretation by eyeballing
- two-step: regression line for each individual patient, then analysis of the intercepts and slopes (average trend, different intercepts and slopes per group); regression analysis for explanation
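A minimal sketch of the two-step approach, using made-up data for four patients in two hypothetical groups. Step 1 fits a least-squares line per patient; step 2 summarizes the fitted intercepts and slopes per group. A formal comparison of the groups (for example a t-test on the slopes) would follow and is left out here.

```python
from statistics import mean


def fit_line(times, values):
    """Step 1: least-squares intercept and slope for one patient."""
    mean_t, mean_y = mean(times), mean(values)
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values)) / sxx
    return mean_y - slope * mean_t, slope


def two_step(patients):
    """Step 2: average the per-patient intercepts and slopes within each group.

    patients is a list of (group, times, values) tuples.
    """
    fits = {}
    for group, times, values in patients:
        fits.setdefault(group, []).append(fit_line(times, values))
    return {
        group: {
            "n": len(group_fits),
            "mean_intercept": mean([f[0] for f in group_fits]),
            "mean_slope": mean([f[1] for f in group_fits]),
        }
        for group, group_fits in fits.items()
    }


# Hypothetical data: two patients per group, measured at times 0, 1 and 2.
patients = [
    ("treatment", [0, 1, 2], [1, 2, 3]),
    ("treatment", [0, 1, 2], [2, 4, 6]),
    ("control", [0, 1, 2], [5, 5, 5]),
    ("control", [0, 1, 2], [4, 5, 6]),
]
for group, result in two_step(patients).items():
    print(group, result)
```

Because each patient is reduced to one intercept and one slope, the second step works with independent observations.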
long vs wide format
wide format: convenient when the individual time points are the same for all patients
long format: more convenient when the time points differ per patient
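The two layouts can be sketched with plain Python structures. The representation is an assumption made for the example: long format as a list of (patient, time, value) rows, wide format as one list of values per patient with None for a time point that was not measured.

```python
def long_to_wide(long_rows):
    """Turn (patient, time, value) rows into one row of values per patient."""
    time_points = sorted({t for _, t, _ in long_rows})
    wide = {}
    for patient, t, value in long_rows:
        row = wide.setdefault(patient, [None] * len(time_points))
        row[time_points.index(t)] = value
    return time_points, wide


def wide_to_long(time_points, wide):
    """Turn one row per patient back into (patient, time, value) rows,
    skipping time points that were not measured."""
    return [
        (patient, t, value)
        for patient, row in wide.items()
        for t, value in zip(time_points, row)
        if value is not None
    ]


# Hypothetical data: the two patients were measured at different times.
long_rows = [("p1", 0, 60), ("p1", 3, 66), ("p2", 1, 55)]
time_points, wide = long_to_wide(long_rows)
print("time points:", time_points)
print("wide:", wide)
print("long again:", wide_to_long(time_points, wide))
```

With differing time points the wide layout fills up with empty cells, while the long layout stores only the measurements that exist.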