Interpretation & Causality Flashcards
P-value criticism
- The sharp cutoff at 0.05 is arbitrary
- p-hacking
- publication bias
- model selection with p-values -> model selection bias
-> p-values are often misunderstood
-> NOT probability that H0 is true
Definition: probability, assuming H0 is true, of observing a test statistic (e.g. an average) at least as extreme as the one actually observed
But: low p-value ≠ important effect
Absence of evidence ≠ evidence of absence
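Example: the definition can be made concrete with a two-sided z-test for a mean. This is only a minimal sketch; the data, mu0 and the assumption that sigma is known are made up for illustration and are not part of the flashcards.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_sided_p_value(sample, mu0, sigma):
    """Two-sided z-test p-value for H0: population mean == mu0.

    Assumes the population standard deviation sigma is known.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability, under H0, of a |z| at least as large as the observed one.
    return 2 * (1 - NormalDist().cdf(abs(z)))

sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7]  # made-up measurements
p = two_sided_p_value(sample, mu0=5.0, sigma=0.5)
print(f"p = {p:.4f}")
```

The result is a statement about the data given H0, not about the probability that H0 is true.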
Reasons for large p-values
- Low sample size (→ low power).
- The truth is not “far” from H0 (e.g. small effect sizes in regression models).
- Collinear explanatory variables.
- Incorrect model fit (e.g. a non-linear relationship modeled as linear).
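Example: the first two reasons can be illustrated with a z-test for a mean shift. The sketch below assumes a known sigma and uses made-up numbers (effect 0.2, sigma 1.0); the function names are hypothetical helpers, not from any course material.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def p_value(observed_effect, sigma, n):
    """Two-sided z-test p-value for an observed mean shift."""
    z = observed_effect / (sigma / sqrt(n))
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))

def power(true_effect, sigma, n, alpha=0.05):
    """Probability that the two-sided z-test rejects H0 at level alpha."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = true_effect / (sigma / sqrt(n))
    return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)

# Same small effect, different sample sizes.
for n in (10, 50, 200):
    print(n, round(p_value(0.2, 1.0, n), 4), round(power(0.2, 1.0, n), 3))
```

With the same small effect, the p-value shrinks and the power grows as n increases, so a large p-value at small n says little about whether an effect exists.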
Suggestions for the use of p-values
- Use p-values, but use them properly and do not over-interpret them. -> graded interpretation of p-values
- Also look at effect sizes and confidence intervals (but the confidence level is also arbitrary)
- Also look at the relative importance of explanatory variables. -> measures the proportion of the response's variability explained by each variable -> R^2 for each variable individually -> In R: calc.relimp() (package relaimpo) -> but this also depends on the other variables in the model
- NEVER use p-values for model selection.
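Example: one common way to split R^2 between correlated variables is to average each variable's R^2 contribution over all orders of entering the model (often called LMG). The sketch below does this by hand for exactly two explanatory variables; it is an illustration of the idea under made-up data, not a reimplementation of calc.relimp().

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def relative_importance_two_predictors(y, x1, x2):
    """Return (R^2 of full model, share of x1, share of x2)."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # R^2 of y ~ x1 + x2 expressed through pairwise correlations.
    r2_full = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    # Average of "entered first" and "entered last" contributions.
    share1 = 0.5 * (r1 ** 2 + (r2_full - r2 ** 2))
    share2 = 0.5 * (r2 ** 2 + (r2_full - r1 ** 2))
    return r2_full, share1, share2

# Made-up data with correlated explanatory variables.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.2, 15.3, 13.9]

r2_full, share1, share2 = relative_importance_two_predictors(y, x1, x2)
print(round(r2_full, 3), round(share1, 3), round(share2, 3))
```

The two shares add up to the full-model R^2, and each share changes if another variable is added or removed, which is the dependence on other variables noted above.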
Bradford Hill criteria for causal inference
- Strength: A causal relationship is likely when the observed association is strong.
- Consistency: A causal relationship is likely if multiple independent studies show similar associations.
- Specificity: A causal relationship is likely when an explanatory variable x is associated only with one potential outcome y and not with other outcomes.
- Temporality: The effect has to occur after the cause.
- Biological gradient: Greater exposure should generally lead to greater incidence of the effect.
- Plausibility: A plausible mechanism is helpful.
- Coherence: Coherence between findings in the lab and in the field / population increases the likelihood of an effect.
- Analogy: Similar factors have a similar effect.
- Experiment: Evidence from an experiment is valuable.
Experimental vs observational studies
Observational study (survey):
- Observation of subjects / objects in a real-world (existing) situation.
- Variables are usually correlated.
- Often more variables than can be included in the model.
- Examples: Influence of pollutants (mercury) on humans, studies of wild animal populations, epidemiological studies,…
Experimental study:
- Observation of subjects / objects in a constructed (experimental) situation.
- Variables are controlled and uncorrelated (given a good study design!).
- Usually all variables enter the model, no model selection.
- Examples: Field experiments; clinical studies; psychological or pedagogical experiments,…
-> Remember: Avoid including explanatory variables in your model that are caused by the outcome! (Collider)
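Example: a small simulation of what goes wrong when a variable caused by the outcome is added to the model. This is a sketch under assumed data-generating rules (true effect of x on y is 1, z = y + noise); the helper slope_of_x is hypothetical and only handles one or two explanatory variables.

```python
import random

def slope_of_x(y, x, z=None):
    """OLS coefficient of x in y ~ x (z is None) or in y ~ x + z."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if z is None:
        return sxy / sxx
    mz = sum(z) / n
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    # Solution of the 2x2 normal equations for the coefficient of x.
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)

random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]   # true effect of x on y is 1
z = [yi + random.gauss(0, 1) for yi in y]   # z is caused by the outcome y

b_without_z = slope_of_x(y, x)
b_with_z = slope_of_x(y, x, z)
print(round(b_without_z, 2), round(b_with_z, 2))
```

Under these assumptions the coefficient of x is close to the true value 1 without z, but is pulled toward about 0.5 once z is included, although z has no influence on y.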
Changing the unit of an explanatory variable
Slope and standard error of this variable change
P-values, R^2, t-values, slopes of the other variables, intercept, residuals -> all stay the same
-> Do not compare such slopes across variables -> they depend on the unit
-> Rescale (standardize) the variables instead
Then: slopes, their standard errors, and the intercept (including its t-value) change
P-values of the slopes, R^2, residuals -> still stay the same
-> Standardization leads to more comparable coefficients and an easier interpretation of the intercept
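Example: the effect of a unit change and of standardization can be checked on a simple regression with one explanatory variable. The data below (height, weight) are made up, and simple_ols is a hypothetical helper written only for this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def simple_ols(x, y):
    """Fit y ~ x by least squares and return the main summary numbers."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - my) ** 2 for yi in y)
    se_slope = sqrt(rss / (n - 2) / sxx)
    return {"slope": slope, "se": se_slope, "t": slope / se_slope,
            "intercept": intercept, "r2": 1 - rss / tss}

height_m = [1.60, 1.65, 1.70, 1.75, 1.80, 1.85]   # made-up data
weight_kg = [55.0, 61.0, 63.0, 70.0, 72.0, 80.0]

# Same variable in another unit (centimetres instead of metres).
height_cm = [100 * h for h in height_m]
# Standardized variable: mean 0, standard deviation 1.
height_std = [(h - mean(height_m)) / stdev(height_m) for h in height_m]

fit_m = simple_ols(height_m, weight_kg)
fit_cm = simple_ols(height_cm, weight_kg)
fit_std = simple_ols(height_std, weight_kg)

for name, fit in (("m", fit_m), ("cm", fit_cm), ("std", fit_std)):
    print(name, {k: round(v, 4) for k, v in fit.items()})
```

Slope and standard error scale with the unit, while the slope's t-value and R^2 are identical in all three fits. After standardization the intercept equals the mean of the response, and the slope is the change in the response per one standard deviation of the explanatory variable.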