Evidence Base + DiD & RD Flashcards
What are the three components in the classification of data science tasks?
Description: Using data to provide a quantitative summary of certain features of the world
Prediction: Using data to map some features of the world to other features of the world
Counterfactual prediction: Using data to predict features of the world as if the world had been different
What is often required for description, prediction, and counterfactual prediction?
Statistical inference
What is the regression equation?
Y = alpha + beta(x) + epsilon
What does epsilon represent in the regression equation?
Random error term (captures the “other stuff” that affects Y but is not in the model)
What does beta represent in the regression?
The change in y associated with a one-unit change in x (duh)
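A minimal sketch of how alpha and beta could be estimated by least squares, using only the Python standard library. The function name fit_simple_regression and the simulated "true" values (alpha = 2.0, beta = 0.5) are assumptions made up for illustration, not part of the course material.

```python
import random

def fit_simple_regression(xs, ys):
    """Estimate alpha and beta in y = alpha + beta*x + epsilon by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # beta = sample covariance of (x, y) divided by sample variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    # alpha makes the fitted line pass through the point of means
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Simulated data: assumed true alpha = 2.0, beta = 0.5, epsilon ~ N(0, 1)
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

alpha_hat, beta_hat = fit_simple_regression(xs, ys)
print(f"alpha_hat={alpha_hat:.3f}, beta_hat={beta_hat:.3f}")
```

With a sample this large the estimates should land close to the assumed true values; epsilon is the part of y that the line cannot explain.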
What are common regression models?
- linear regressions (super common)
- logistic regression (binary outcomes; nonlinear relationship between x and the probability of y)
- two-part model
Does a regression prove causation?
No…duh
What are some of the shortcomings of regressions?
- can be subject to reverse causation, simultaneity, or third factor causation
- a regression estimator is inconsistent if its estimates do not converge to the true value as the sample grows
- bias and inconsistency lead to false positives and/or false negatives
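A rough simulation of third factor causation, assuming a made-up setup where a third factor c drives both x and y while x has no effect on y at all. The helper names (slope, residuals) and all coefficients are hypothetical. The naive regression finds a slope anyway (a false positive); adjusting for c removes it.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys on xs (intercept included)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def residuals(xs, ys):
    """What is left of ys after regressing it on xs."""
    n = len(xs)
    b = slope(xs, ys)
    a = sum(ys) / n - b * sum(xs) / n
    return [y - (a + b * x) for x, y in zip(xs, ys)]

rng = random.Random(1)
n = 5000
c = [rng.gauss(0, 1) for _ in range(n)]        # third factor
x = [ci + rng.gauss(0, 1) for ci in c]         # c causes x
y = [2.0 * ci + rng.gauss(0, 1) for ci in c]   # c causes y; x plays no role

# Naive regression of y on x picks up the shared dependence on c
naive_slope = slope(x, y)

# Adjusting for c: strip c out of both x and y, then regress what remains
adjusted_slope = slope(residuals(c, x), residuals(c, y))

print(f"naive slope:    {naive_slope:.3f}")
print(f"adjusted slope: {adjusted_slope:.3f}")
```

Regressing the residuals on each other gives the same coefficient on x as a multiple regression of y on x and c, which is why the adjusted slope sits near zero here.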
What is a CDAG (Causal Directed Acyclic Graph)?
Way to guide regression analysis as it shows the direction of hypothesized causal effects
What can DAGs include?
Exposure (E): independent variable
Outcomes (O): dependent variable
Mediator (M): mechanism that transmits effect from E to O
Confounder (C): causer of both E and O
Collider (S): something caused by both E and O
EOMCS - everyone owes me cash sir
What is done with mediators and colliders?
Mediators - usually not controlled for when estimating the total effect (controlling blocks part of it); analyzed w/ careful mediation analysis
Colliders - hard to deal with but often used to help think of limitations to the study
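One way to see why colliders are hard to deal with, sketched as a simulation under made-up assumptions: E and O are generated independently, S is caused by both, and the "study" only keeps units with S above an arbitrary cutoff of 1. The variable names and numbers are hypothetical.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys on xs (intercept included)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(2)
n = 20000
e = [rng.gauss(0, 1) for _ in range(n)]   # exposure
o = [rng.gauss(0, 1) for _ in range(n)]   # outcome, independent of e
# collider: caused by both e and o, plus some noise
s = [ei + oi + rng.gauss(0, 0.5) for ei, oi in zip(e, o)]

# Everyone: no relationship between e and o
full_slope = slope(e, o)

# Selecting on the collider (only units with s > 1 make it into the sample)
kept = [(ei, oi) for ei, oi, si in zip(e, o, s) if si > 1.0]
selected_slope = slope([k[0] for k in kept], [k[1] for k in kept])

print(f"slope in full sample:     {full_slope:.3f}")
print(f"slope in selected sample: {selected_slope:.3f}")
```

In the selected sample a negative association shows up even though E never affects O, which is the kind of limitation a collider in the DAG should prompt you to think about.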
What are ways to deal with data missing at random?
- complete case analysis
- last observation carried forward
- mean value imputation
- random imputation
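The four approaches above can be sketched on a single series of repeated measurements, with None standing in for a missing value. This is a simplified illustration: the function names are hypothetical, and real complete case analysis usually drops whole rows (cases) rather than single values.

```python
import random

def complete_case(values):
    """Keep only the observed values (drop anything missing)."""
    return [v for v in values if v is not None]

def last_observation_carried_forward(values):
    """Fill each gap with the most recent observed value before it."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None if nothing has been observed yet
    return filled

def mean_imputation(values):
    """Fill each gap with the mean of the observed values."""
    observed = complete_case(values)
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def random_imputation(values, rng):
    """Fill each gap with a random draw from the observed values."""
    observed = complete_case(values)
    return [rng.choice(observed) if v is None else v for v in values]

measurements = [1.0, None, 3.0, None, 5.0]
rng = random.Random(3)

print(complete_case(measurements))
print(last_observation_carried_forward(measurements))
print(mean_imputation(measurements))
print(random_imputation(measurements, rng))
```

Mean imputation keeps the average intact but shrinks the spread of the data, while random imputation preserves more of the spread at the cost of adding noise.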
What is internal validity?
Stat concept that requires any inferences about causal effects to be valid in the population studied
English: Degree to which findings can be attributed to an independent variable
What is external validity?
That findings can be generalized from the population and setting to other populations and settings
How randomized is health care policy?
Like never lol (20% of studies are)