Brazy's Exam Hints? Flashcards
What is external validity?
The knowledge we gain IN the study can be applied OUTSIDE the study.
What is internal validity?
How well a study is conducted and how accurately its results reflect the studied group
How does validity apply to Experimental causal design?
In experimental causal designs, internal validity is strong when treatment is randomised. External validity depends on how representative our sample is of the broader population that we want to make inferences about.
Define MIDA elements
Model: a representation of an event generating process, it’s essentially a theory - “A set of logically related symbols that represent what we think happens in the world”
Inquiry: The research question in the context of our model
Data strategy: How we measure and operationalise the elements of our model and inquiry
Answer strategy: How we summarise and explain the data
What is the function of the model?
We use models to justify a causal relationship. Models identify units, condition/treatment, and potential outcomes.
What is the function of an inquiry?
The inquiry tries to find a theoretical answer, “an estimand,” using information from our model. Inquiries can be causal or descriptive.
What is the function of a data strategy?
The function of a data strategy is to select the units of study (sampling), assign or observe conditions (treatment), and measure the outcomes. This is the method of our study.
Conditions can be observed or assigned, what does this mean?
Observed conditions (treatment) are natural variation. There is no manipulation; we simply observe what happens.
Assigned conditions are experimental variation. We manipulate the conditions to see whether the outcome changes.
What is the function of an answer strategy?
The answer strategy cleans the data we get from our data strategy, then summarises and interprets it to produce an answer to the inquiry.
What is the difference between an estimate and an estimand?
The estimand is what we seek: the true answer to the inquiry. The estimate is what we get: the value our answer strategy computes from the data.
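To make the estimate/estimand distinction concrete, here is a minimal sketch using only Python's standard library. The population, its mean of 50, and the sample size of 200 are made-up assumptions for illustration; in real research the estimand is unknown, which is why we need an estimate.

```python
import random
import statistics

random.seed(1)

# Hypothetical population of 10,000 units (assumed values, for illustration only).
population = [random.gauss(50, 10) for _ in range(10_000)]

# Estimand: the true population mean. We only know it because we simulated the world.
estimand = statistics.mean(population)

# Data strategy: draw a simple random sample of 200 units.
sample = random.sample(population, 200)

# Answer strategy: the sample mean is the estimator; the number it returns is the estimate.
estimate = statistics.mean(sample)

print(f"estimand = {estimand:.2f}, estimate = {estimate:.2f}")
```

The two numbers will be close but not identical; the gap between them is the estimation error.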
Answer strategies can be statistical or qualitative, give examples of each.
Statistical = statistical estimators
Qualitative = case study approach
Mixed methods = text or network analysis tools
What are the research design principles?
Design Holistically: All parts must work together to get a good result.
Design Agnostically: A good design should work well even when the world is different from what we expect
Design for purpose: The design should be aligned with the specified purpose of the research, as captured by the diagnosands
Design early: Design first because it’s hard to go back once data strategies are implemented
Design often: update your designs as circumstances change
Design to share: Replicability, other researchers can run your design and question the logic of your research
What are exogenous variables?
Variables that are not caused by others, can be randomly assigned (eg, treatment)
What are endogenous variables?
Endogenous variables are caused by others (eg, outcome, covariates, etc)
What are outcome (Y) variables?
Variables we want to understand, dependent variables, are affected by other variables
What are treatment (D) variables?
Variables of interest that explain outcome (Y), independent variables, can be randomly assigned
What are Moderators (X2) variables?
Variables that affect the outcome (Y) and can change the size of the effect of the treatment (D) on Y, but are not themselves caused by D
What are Confounders (X) variables?
Variables that introduce a non-causal relationship between treatment (D) and outcome (Y). They affect both D and Y
What are Collider variables?
Collider variables are caused by both treatment (D) and outcome (Y); conditioning on them introduces bias
What are Mediator (M) variables?
Mediator variables are variables on the causal path from treatment (D) to outcome (Y)
What are Instrumental (Z) variables?
Instrumental variables are exogenous variables that affect the treatment (D) and affect the outcome (Y) only through D.
The exclusion restriction means that there should be no path from Z to Y other than the one running through D.
What is a latent variable (Y*)?
These are underlying concepts that cannot be measured directly (like sensitivity bias). We use proxy indicators to measure them, but each proxy has its own limitations.
OBSERVATIONAL DESCRIPTIVE
What is index creation for latent variables? Why would you use it?
What are the potential pitfalls?
An index combines multiple proxy indicators into a single score to represent a latent variable (Y*)
It tries to measure a variable that’s difficult to measure (latent variable) by combining different indicators into one index. (eg, trying to measure happiness by counting frequency of smiles, laughs, etc)
Main pitfalls include:
- Unknown scale (don’t know if indicators are accurate measures of latent variable)
- Combining measures (Just because variables are correlated doesn’t mean they measure the same latent variable)
OBSERVATIONAL DESCRIPTIVE
There are three main approaches to index creation: Scale and Average, Scaled Average and Principal Component Analysis (PCA). Explain them.
Scale and Average: Standardise each proxy indicator (scale) and then take the average of the standardised scores (average)
Scaled Average: Same as Scale and Average but adjust for a covariate
PCA: Identifies the first factor (principal component) that captures the most variance shared by all proxy indicators => index score
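A rough sketch of two of these approaches (Scale and Average, and PCA), using only Python's standard library. The latent variable, the three proxies, and their scales are invented for illustration, and the first principal component is found by simple power iteration on the correlation matrix rather than by a statistics package.

```python
import random
import statistics

random.seed(3)
n = 500

# Hypothetical latent variable Y* and three noisy proxies on different scales.
latent = [random.gauss(0, 1) for _ in range(n)]
proxies = [
    [10 + 5 * l + random.gauss(0, 3) for l in latent],
    [100 + 20 * l + random.gauss(0, 15) for l in latent],
    [0.5 * l + random.gauss(0, 0.4) for l in latent],
]

def standardise(values):
    """Rescale to mean 0 and standard deviation 1."""
    m, s = statistics.mean(values), statistics.pstdev(values)
    return [(v - m) / s for v in values]

def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    za, zb = standardise(a), standardise(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)

z = [standardise(p) for p in proxies]

# Scale and Average: average the standardised proxies for each unit.
scale_avg = [sum(unit) / len(unit) for unit in zip(*z)]

# PCA: first principal component of the correlation matrix, via power iteration.
k = len(z)
corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)] for i in range(k)]
w = [1.0] * k
for _ in range(100):
    w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
pca_index = [sum(wi * zi for wi, zi in zip(w, unit)) for unit in zip(*z)]

print(f"corr(scale_avg, latent) = {correlation(scale_avg, latent):.2f}")
print(f"corr(pca_index, latent) = {correlation(pca_index, latent):.2f}")
```

In this simulation both indexes track the latent variable closely, but only because the proxies were built from it. With real data we cannot check this, which is exactly the "unknown scale" pitfall.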
What is a functional relationship?
How endogenous variables are produced; studying the cause (D) and effect (Y)
What is the difference between parametric and non-parametric functional forms?
Parametric models contain more assumptions about the nature of the relationship between cause and effect.
Non-Parametric models simply state that there is a relationship, without specifying its form (how strong it is or what shape it takes). DAGs conceptualise this.
What are DAGs trying to do?
Represent causal relationships between variables. Graphical representation of models informed by theory.
What are backdoor paths?
How do you close them?
Backdoor paths are non-causal paths between our treatment and outcome. If left open, they bias our estimate.
Two ways a backdoor path can be closed:
1. If the path runs through a confounder (X), condition on X by ‘adding’ it as a control.
2. If the path runs through a collider, it is already closed. NEVER control for colliders, because conditioning on one opens the path.
If all backdoor paths have been closed, then we have met the backdoor criterion and can credibly argue for causal inference.
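The simulation below is a sketch of why controlling for a confounder helps and controlling for a collider hurts. Everything in it is an assumption made for illustration: the true effect of 1.0, the linear relationships, and the helper names `cov`, `slope` and `slope_with_control`. It uses plain least-squares formulas so that it runs on the standard library alone.

```python
import random

random.seed(2)
n = 20_000
TRUE_EFFECT = 1.0  # assumed causal effect of D on Y in both scenarios

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def slope(y, d):
    """OLS slope of y on d with no controls."""
    return cov(d, y) / cov(d, d)

def slope_with_control(y, d, c):
    """OLS coefficient on d when c is added as a control."""
    num = cov(d, y) * cov(c, c) - cov(c, y) * cov(d, c)
    den = cov(d, d) * cov(c, c) - cov(d, c) ** 2
    return num / den

# Scenario 1: confounder X causes both D and Y (open backdoor D <- X -> Y).
x = [random.gauss(0, 1) for _ in range(n)]
d1 = [xi + random.gauss(0, 1) for xi in x]
y1 = [TRUE_EFFECT * di + 2 * xi + random.gauss(0, 1) for di, xi in zip(d1, x)]
naive_confounded = slope(y1, d1)           # biased: backdoor is open
adjusted = slope_with_control(y1, d1, x)   # backdoor closed by controlling for X

# Scenario 2: D is randomly assigned; collider C is caused by both D and Y.
d2 = [random.gauss(0, 1) for _ in range(n)]
y2 = [TRUE_EFFECT * di + random.gauss(0, 1) for di in d2]
c = [di + yi + random.gauss(0, 1) for di, yi in zip(d2, y2)]
naive_randomised = slope(y2, d2)                  # unbiased: no open backdoor
collider_controlled = slope_with_control(y2, d2, c)  # biased: conditioning on a collider

print(f"confounded, no control:  {naive_confounded:.2f}")
print(f"confounded, control X:   {adjusted:.2f}")
print(f"randomised, no control:  {naive_randomised:.2f}")
print(f"randomised, control C:   {collider_controlled:.2f}")
```

Under these assumptions the uncontrolled confounded estimate is roughly double the true effect, controlling for X recovers it, and controlling for the collider pushes an otherwise correct estimate towards zero.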
OBSERVATIONAL CAUSAL
What are the assumptions for instrumental variables?
Exogeneity: there is no confounder between Z and Y
Excludability (No other direct effect): Z only affects Y through D
Monotonicity: The effect of the instrument on treatment is 0 or positive for all units.
Assumptions should be based on the theory/model.
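As a hedged illustration of what an instrument buys us, the sketch below simulates a world where all three assumptions hold by construction: Z is randomly assigned, Z affects Y only through D, and Z raises D for every unit. The effect sizes, the unobserved confounder U, and the helper `cov` are assumptions made for the example. The IV estimate is computed as the ratio cov(Z, Y) / cov(Z, D).

```python
import random

random.seed(4)
n = 50_000
TRUE_EFFECT = 2.0  # assumed causal effect of D on Y

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# U is an unobserved confounder of D and Y, so we cannot control for it.
u = [random.gauss(0, 1) for _ in range(n)]

# Z is randomly assigned (exogeneity) and raises D for every unit (monotonicity).
z = [random.randint(0, 1) for _ in range(n)]
d = [0.8 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]

# Z does not appear in the equation for Y (excludability): it works only through D.
y = [TRUE_EFFECT * di + 3 * ui + random.gauss(0, 1) for di, ui in zip(d, u)]

naive = cov(d, y) / cov(d, d)        # biased by the open backdoor through U
iv_estimate = cov(z, y) / cov(z, d)  # uses only the variation in D caused by Z

print(f"naive = {naive:.2f}, IV = {iv_estimate:.2f}, truth = {TRUE_EFFECT}")
```

The simulation cannot tell us whether the assumptions hold in a real study; that has to be argued from the theory/model.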
ANSWER STRATEGY
In hypothesis testing, there is always uncertainty and error. What two types are there?
TYPE I ERROR: false positive, we reject the null hypothesis when it is actually true.
(Eg, lump declared cancer (H1). You go through chemo, it wasn’t cancer, you die)
TYPE II ERROR: false negative, we fail to reject the null hypothesis when it is actually false.
(Eg, lump wasn’t declared cancer (H0), you don’t go through chemo, you die)
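Both error rates can be estimated by simulating many studies. The sketch below assumes a two-sided z-test at the 5% level with a known standard deviation of 1, a sample size of 50, and a true effect of 0.3 when the null is false; all of these numbers are illustrative choices, not values from the course.

```python
import random

random.seed(5)
CRITICAL_Z = 1.96  # two-sided test at alpha = 0.05
N = 50             # units per simulated study
SIMS = 2000        # number of simulated studies per scenario

def rejects_null(true_mean):
    """Run one study and test H0: mean = 0, assuming a known sd of 1."""
    sample = [random.gauss(true_mean, 1) for _ in range(N)]
    z = (sum(sample) / N) / (1 / N ** 0.5)
    return abs(z) > CRITICAL_Z

# Type I error rate: H0 is true (mean really is 0) but we reject it.
type_1_rate = sum(rejects_null(0.0) for _ in range(SIMS)) / SIMS

# Type II error rate: H0 is false (mean is 0.3) but we fail to reject it.
type_2_rate = sum(not rejects_null(0.3) for _ in range(SIMS)) / SIMS

print(f"Type I rate  = {type_1_rate:.3f}")
print(f"Type II rate = {type_2_rate:.3f}")
```

The Type I rate should sit near the chosen 5% level, while the Type II rate depends on the sample size and the size of the true effect.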
What is the difference between causal and descriptive research design?
Causal research: identify cause and effect relationship between variables
Descriptive research: identify relationships that are not necessarily causal.