Evidence Base + DiD & RD Flashcards

1
Q

What are the three components in the classification of data science tasks?

A

Description: Using data to provide a quantitative summary of certain features of the world

Prediction: Using data to map some features of the world to other features of the world

Counterfactual prediction: Using data to predict features of the world as if the world had been different

2
Q

What is often required for description, prediction, and counterfactual prediction?

A

Statistical inference

3
Q

What is the regression equation?

A

Y = alpha + beta * X + epsilon

4
Q

What does epsilon represent in the regression equation?

A

Random error term (captures the “other stuff” that affects Y but is not in the model)

5
Q

What does beta represent in the regression?

A

The change in Y associated with a one-unit change in X (the slope)
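
A minimal sketch of how alpha and beta can be estimated by ordinary least squares when there is a single X. The data are made up for illustration, and the function name `fit_simple_ols` is hypothetical, not something from the course.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Return (alpha, beta) for Y = alpha + beta * X + epsilon via least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # beta = covariance(X, Y) / variance(X)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    # alpha makes the fitted line pass through the point of means
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Made-up data that roughly follow Y = 1 + 2 * X, plus some "other stuff"
x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

alpha, beta = fit_simple_ols(x, y)
# The residuals are the estimated epsilon for each observation
residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
```

Here beta comes out close to 2 and alpha close to 1, and the residuals sum to roughly zero, which is a property of least squares with an intercept.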

6
Q

What are common regression models?

A
  • linear regressions (super common)
  • logistic regression (binary outcomes; nonlinear relationship)
  • two-part model
7
Q

Does a regression prove causation?

A

No…duh

8
Q

What are some of the shortcomings of regressions?

A
  • can be subject to reverse causation, simultaneity, or third-factor causation (confounding)
  • an estimator is inconsistent if the distribution of its estimates does not converge to the true value as the sample grows
  • bias and inconsistency can lead to false positives and/or false negatives
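
Third-factor causation can be illustrated with a small simulation. This is a sketch on made-up data, assuming a third factor drives both X and Y while X has no effect on Y at all; the helper names `slope` and `residuals` are hypothetical.

```python
import random
from statistics import mean

def slope(x, y):
    """OLS slope of y on x: covariance(x, y) / variance(x)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Part of y left over after removing its linear relationship with x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

rng = random.Random(42)
n = 5000
third = [rng.gauss(0, 1) for _ in range(n)]   # the third factor
x = [c + rng.gauss(0, 1) for c in third]      # X is driven by the third factor
y = [2 * c + rng.gauss(0, 1) for c in third]  # Y is driven by it too, but NOT by X

# Naive regression of Y on X picks up the third factor (slope near 1, not 0)
naive_beta = slope(x, y)

# Adjusting for the third factor: strip it out of X and Y, then regress
adjusted_beta = slope(residuals(third, x), residuals(third, y))
```

The naive slope lands near 1 even though the true effect of X is 0, which is the kind of false positive the card warns about. Once the third factor is adjusted for, the slope falls to roughly 0.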
9
Q

What is a CDAG (Causal Directed Acyclic Graph)?

A

A way to guide regression analysis, as it shows the direction of hypothesized causal effects

10
Q

What can DAGs include?

A

Exposure (E): independent variable
Outcomes (O): dependent variable
Mediator (M): mechanism that transmits effect from E to O
Confounder (C): causer of both E and O
Collider (S): something caused by both E and O

EOMCS - everyone owes me cash sir

11
Q

What is done with mediators and colliders?

A

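Mediators - usually not controlled for when estimating the total effect (adjusting for them blocks part of the effect); analyzed w/ careful mediation analysis
Colliders - hard to deal with (controlling for them can create a spurious association) but often used to help think of limitations to the study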
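
A small simulation of why a collider is tricky, using made-up data. It assumes E and O are truly independent and that S is caused by both; restricting the sample to high values of S (one way of conditioning on the collider) makes E and O look related. The helper name `slope` is hypothetical.

```python
import random
from statistics import mean

def slope(x, y):
    """OLS slope of y on x: covariance(x, y) / variance(x)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(3)
n = 5000
exposure = [rng.gauss(0, 1) for _ in range(n)]
outcome = [rng.gauss(0, 1) for _ in range(n)]          # independent of exposure
collider = [e + o for e, o in zip(exposure, outcome)]  # caused by both E and O

# Full sample: no relationship between E and O (slope near 0)
slope_all = slope(exposure, outcome)

# Only units with a high collider value, e.g. only those who end up in the study
selected = [(e, o) for e, o, s in zip(exposure, outcome, collider) if s > 1]
slope_selected = slope([e for e, _ in selected], [o for _, o in selected])
```

In the selected sample the slope turns clearly negative even though E has no effect on O, which is why a collider is a useful prompt for thinking about a study's limitations.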

12
Q

What are ways to deal with data missing at random?

A
  • complete case analysis
  • last observation carried forward
  • mean value imputation
  • random imputation
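
The four approaches above can be sketched on a tiny made-up series. The variable names are hypothetical, and random imputation is shown in its simplest form (drawing from the observed values), which is an assumption about what the card means.

```python
import random
from statistics import mean

values = [5.0, None, 7.0, None, 9.0]  # made-up series; None marks a missing value

# Complete case analysis: drop every observation with a missing value
complete_cases = [v for v in values if v is not None]

# Last observation carried forward: reuse the most recent observed value
# (a missing value at the very start would stay None)
locf, last = [], None
for v in values:
    if v is not None:
        last = v
    locf.append(last)

# Mean value imputation: fill each gap with the mean of the observed values
observed_mean = mean(complete_cases)
mean_imputed = [observed_mean if v is None else v for v in values]

# Random imputation: fill each gap with a randomly drawn observed value
rng = random.Random(0)
random_imputed = [rng.choice(complete_cases) if v is None else v for v in values]
```

Complete case analysis shrinks the sample, while the three imputation methods keep all five observations but fill the gaps in different ways.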
13
Q

What is internal validity?

A

Stat concept that requires any inferences about causal effects to be valid in the population studied

Plain English: Degree to which findings can be attributed to the independent variable rather than to something else

14
Q

What is external validity?

A

Degree to which findings can be generalized from the study population and setting to other populations and settings

15
Q

How randomized is health care policy?

A

Rarely (only about 20% of studies are randomized)

16
Q

Do health policies usually make an evaluation plan before rolling out?

A

Nope

17
Q

What are two key changes that need to be made to improve data quality in healthcare?

A
  • reducing measurement burden by developing systems that automatically collect data
  • favor electronic health records
18
Q

What do randomized controlled trials (RCTs) get rid of that makes them so great?

A

selection bias

19
Q

What was the RAND Health Insurance Experiment?

A

An experiment from 1974-1982 in which individuals were randomized to various insurance plans with different levels of cost sharing

Free care lowered blood pressure, improved hypertension control, and reduced serious symptoms among poor patients

20
Q

What was the Oregon Medicaid Lottery?

A

In 2008, Oregon used a lottery to grant adults the opportunity to apply for Medicaid. Finkelstein & Baicker found that health care utilization and self-reported health increased on average. However, ED visits also increased

21
Q

Why aren’t RCTs more common in health studies?

A

They can be infeasible, invalid, or not generalizable

22
Q

What kind of experiments are great when randomization isn’t possible or ethical?

A

Natural or Quasi Experiments!

23
Q

What are the 3 Quasi-Experimental Approaches?

A
  • DiD
  • Regression Discontinuity
  • Instrumental variables (use an exogenous element, like a lottery, that nudges some subjects toward receiving a certain treatment)
25
Q

What are some examples of things that were breakthroughs that didn’t come out of RCTs?

A
  • hand washing reduces infant mortality
  • smoking increases the risk of lung cancer
  • repeated head injury may cause CTE
26
Q

RCTs are the best approach in terms of ___ validity, but ___ validity is another matter

A

Internal, external

27
Q

Describe the “gold standard” RCT

A
  • trial: experimenter controls elements of study design
  • control: one group receives the treatment and the other doesn't
  • randomized: no selection bias; because observed difference in outcomes = treatment effect + selection bias, randomization leaves only the treatment effect
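
The identity "observed difference = treatment effect + selection bias" can be checked with a simulation. This sketch uses made-up health scores and assumes a constant treatment effect of 5 and a self-selection rule in which sicker people (lower untreated score) opt into treatment; the names `EFFECT` and `compare` are hypothetical.

```python
import random
from statistics import mean

EFFECT = 5.0  # assumed constant treatment effect on the health score

def compare(y0, treated):
    """Return (observed difference, selection bias) for a given assignment."""
    y0_treated = [y for y, t in zip(y0, treated) if t]
    y0_control = [y for y, t in zip(y0, treated) if not t]
    # What we observe: treated outcomes (y0 + effect) vs. control outcomes (y0)
    observed = mean([y + EFFECT for y in y0_treated]) - mean(y0_control)
    # Selection bias: how different the groups were before any treatment
    selection_bias = mean(y0_treated) - mean(y0_control)
    return observed, selection_bias

rng = random.Random(1)
y0 = [rng.gauss(50, 10) for _ in range(10000)]  # untreated health scores

# Non-randomized: sicker people choose treatment
self_selected = [y < 50 for y in y0]
obs_self, bias_self = compare(y0, self_selected)

# Randomized: a coin flip decides who is treated
randomized = [rng.random() < 0.5 for _ in y0]
obs_rct, bias_rct = compare(y0, randomized)
```

Under self-selection the observed difference is negative (treatment looks harmful) because the selection bias swamps the true effect of 5. Under randomization the selection bias is close to 0, so the observed difference is close to 5.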
28
Q

Describe a quasi-experimental design (non-randomized control studies)

A
  • trial turns into study: experimenter no longer has direct control of the setting
  • control: still a treatment group and a control group
  • non-randomized: selection bias is a potential issue
29
Q

What settings are common for the use of Difference in Differences?

A

Estimating the impact of a policy change on some outcome of interest, where the policy change occurs at some specific point in time

30
Q

Why can’t you just compare health status before and after a health policy?

A

Other things could have happened that affected health besides just the policy

31
Q

What is the actual difference in differences?

A

The change in the treatment group minus the change it would have seen in an alternate reality where the policy didn't happen (approximated by the change in the control group)

32
Q

What kind of data is required for DiD?

A

Panel data (units over time)

33
Q

What is the key idea that allows DiD to find the effect of policies on healthcare?

A

You can isolate the treatment effect by taking the difference between the change in the outcome for the treatment group and the change for the control group. This removes the effect of the other stuff, as it affects both groups' outcomes
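
A worked example of the arithmetic, using made-up group averages (say, the percent of adults with an annual checkup). The function name `diff_in_diff` and all numbers are hypothetical.

```python
# Made-up average outcomes for illustration only
outcomes = {
    ("treatment", "before"): 60.0,
    ("treatment", "after"): 72.0,
    ("control", "before"): 55.0,
    ("control", "after"): 59.0,
}

def diff_in_diff(means):
    """(change in treatment group) minus (change in control group)."""
    change_treatment = means[("treatment", "after")] - means[("treatment", "before")]
    change_control = means[("control", "after")] - means[("control", "before")]
    return change_treatment - change_control

# Simple before/after comparison in the treatment group: 72 - 60 = 12
naive_before_after = outcomes[("treatment", "after")] - outcomes[("treatment", "before")]

# DiD: (72 - 60) - (59 - 55) = 8
did_estimate = diff_in_diff(outcomes)
```

The before/after comparison credits the policy with 12 points, but the control group also rose by 4 points over the same period. DiD attributes only the remaining 8 points to the policy, which is valid only if both groups would otherwise have moved in parallel.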

34
Q

What is the assumption of DiD?

A

The “other stuff” is assumed to affect the treatment and control groups identically, so without the policy their outcomes would have moved in parallel (called the parallel trends assumption)

35
Q

When is RD applied?

A

Estimating the impact of some treatment on some outcome of interest when treatment is assigned based on whether a unit falls above or below a threshold

36
Q

What do the treatment and control groups look like for an RD experiment?

A

Treatment: units that receive the treatment because they are just barely above the threshold
Control: units that do not receive the treatment because they are just barely below the threshold

37
Q

What are the assumptions of RD?

A

Assumption 1: the difference in outcomes at the threshold is entirely due to the treatment of interest
Assumption 2: the difference in the “other stuff” between units just above and just below the threshold is likely to be minimal
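
A minimal sketch of an RD estimate on simulated data. It assumes a made-up score from 0 to 100, a cutoff of 50, a true jump of 4 at the cutoff, and a bandwidth of 10; fitting a separate line on each side of the cutoff is one common way to estimate the jump, not necessarily the method used in the course. The names `fit_line`, `CUTOFF`, `BANDWIDTH`, and `TRUE_JUMP` are hypothetical.

```python
import random
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    beta = sxy / sxx
    return y_bar - beta * x_bar, beta

rng = random.Random(7)
CUTOFF, BANDWIDTH, TRUE_JUMP = 50.0, 10.0, 4.0

# Units at or above the cutoff are treated; the outcome also rises with the score
scores = [rng.uniform(0, 100) for _ in range(4000)]
outcomes = [
    10 + 0.2 * s + (TRUE_JUMP if s >= CUTOFF else 0.0) + rng.gauss(0, 1)
    for s in scores
]

# Keep only units close to the threshold on each side
above = [(s, y) for s, y in zip(scores, outcomes) if CUTOFF <= s < CUTOFF + BANDWIDTH]
below = [(s, y) for s, y in zip(scores, outcomes) if CUTOFF - BANDWIDTH <= s < CUTOFF]

# Fit a line on each side and compare the two predictions at the cutoff
a_hi, b_hi = fit_line(*zip(*above))
a_lo, b_lo = fit_line(*zip(*below))
rd_estimate = (a_hi + b_hi * CUTOFF) - (a_lo + b_lo * CUTOFF)

# Comparing ALL treated vs. ALL untreated units mixes in the effect of the score
naive_gap = mean([y for s, y in zip(scores, outcomes) if s >= CUTOFF]) - mean(
    [y for s, y in zip(scores, outcomes) if s < CUTOFF]
)
```

The RD estimate lands near the true jump of 4, while the naive gap is much larger because units far above the cutoff differ from units far below it in more than treatment status. That is why the comparison is restricted to units just barely on either side.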