Brazy's Exam Hints? Flashcards

1
Q

What is external validity?

A

The knowledge we gain IN the study can be applied OUTSIDE the study.

2
Q

What is internal validity?

A

How well a study is conducted and how accurately its results reflect the studied group

3
Q

How does validity apply to Experimental causal design?

A

In experimental causal design, internal validity is strong when treatment is randomly assigned. External validity depends on how representative our sample is of the broader population that we want to make inferences about.

4
Q

Define MIDA elements

A

Model: a representation of an event generating process, it’s essentially a theory - “A set of logically related symbols that represent what we think happens in the world”

Inquiry: The research question in the context of our model

Data strategy: How we measure and operationalise the elements of our model and inquiry

Answer strategy: How we summarise and explain the data

5
Q

What is the function of the model?

A

We use models to justify a causal relationship. Models identify units, condition/treatment, and potential outcomes.

6
Q

What is the function of an inquiry?

A

The inquiry tries to find a theoretical answer, “an estimand,” using information from our model. Inquiries can be causal or descriptive.

7
Q

What is the function of a data strategy?

A

The function of a data strategy is to select the units of study (sampling), assign conditions (applying treatment) and measure the outcomes. This is the method of our study.

8
Q

Conditions can be observed or assigned, what does this mean?

A

Conditions (treatment) that are observed are natural variations. There is no manipulation, we are simply observing it.

Assigned conditions are experimental variation. We have manipulated the conditions to see whether this changes the outcome.

9
Q

What is the function of an answer strategy?

A

The answer strategy cleans the data we get from our data strategy and presents and interprets them

10
Q

What is the difference between an estimate and an estimand?

A

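The estimand is the true quantity we seek (the answer to the inquiry). The estimate is the value our answer strategy produces from the data as our best guess at the estimand.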

11
Q

Answer strategies can be statistical or qualitative, give examples of each.

A

Statistical = statistical estimators

Qualitative = case study approach

Mixed methods = text or network analysis tools

12
Q

What are the research design principles?

A

Design Holistically: All parts must work together to get a good result.

Design Agnostically: A good design should work well even when the world is different from what we expect

Design for purpose: The design should be aligned with the specified purpose of the research, as captured by the diagnosands

Design early: Design first because it’s hard to go back once data strategies are implemented

Design often: update your designs as circumstances change

Design to share: Replicability, other researchers can run your design and question the logic of your research

13
Q

What are exogenous variables?

A

Variables that are not caused by others, can be randomly assigned (eg, treatment)

14
Q

What are endogenous variables?

A

Endogenous variables are caused by others (eg, outcome, covariates, etc)

15
Q

What are outcome (Y) variables?

A

Variables we want to understand (dependent variables); they are affected by other variables

16
Q

What are treatment (D) variables?

A

Variables of interest that explain outcome (Y), independent variables, can be randomly assigned

17
Q

What are Moderators (X2) variables?

A

Variables that affect the outcome (Y) but are unrelated to the treatment (D); the effect of D on Y can differ across their values

18
Q

What are Confounders (X) variables?

A

Variables that introduce a non causal relationship between treatment (D) and outcome (Y). Affects both D and Y

19
Q

What are Collider variables?

A

Collider variables are caused by both treatment (D) and outcome (Y); conditioning on them introduces bias

20
Q

What are Mediator (M) variables?

A

Mediator variables are variables on the causal path from treatment (D) to outcome (Y)

21
Q

What are Instrumental (Z) variables?

A

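Instrumental variables are exogenous variables that affect the treatment (D) and, only through the treatment, the outcome (Y).

Exclusion restriction means that there should be no path from the instrument (Z) to the outcome (Y) other than the one that runs through the treatment (D).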

22
Q

What is a latent variable (Y*)?

A

These are underlying concepts that cannot be measured (like sensitivity bias). We use proxy indicators to measure them but each proxy has its own limitations.

23
Q

OBSERVATIONAL DESCRIPTIVE

What is an index creation for latent variables. Why would you use them?
What are the potential pitfalls?

A

Indexes combine multiple proxy indicators into a single score to represent a latent variable (Y*)

It tries to measure a variable that’s difficult to measure (latent variable) by combining different indicators into one index. (eg, trying to measure happiness by counting frequency of smiles, laughs, etc)

Main pitfalls include:
- Unknown scale (don’t know if indicators are accurate measures of latent variable)
- Combining measures (Just because variables are correlated doesn’t mean they measure the same latent variable)

24
Q

OBSERVATIONAL DESCRIPTIVE

There are three main approaches to index creation: Scale and Average, Scaled Average and Principal Component Analysis (PCA). Explain them.

A

Scale and Average: Standardise each proxy indicator (scale) and then take the average of the standardised scores (average)

Scaled Average: Same as Scale and Average but adjust for a covariate

PCA: Identifies the first factor (principal component) that captures the most variance shared by all proxy indicators => index score
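
The Scale and Average approach can be sketched in a few lines. This is a minimal illustration only: the indicator names and values are made up, and the helper names `standardise` and `scale_and_average` are hypothetical, not part of any package.

```python
import statistics

def standardise(values):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def scale_and_average(indicators):
    """Standardise each proxy indicator, then average them unit by unit."""
    scaled = [standardise(column) for column in indicators]
    return [statistics.fmean(unit_scores) for unit_scores in zip(*scaled)]

# Hypothetical proxies for the latent variable "happiness", measured on five people.
smiles = [2, 4, 6, 8, 10]
laughs = [1, 2, 3, 4, 5]

index = scale_and_average([smiles, laughs])
print([round(score, 2) for score in index])
```

Standardising first stops an indicator with a large raw scale (smiles) from dominating one with a small raw scale (laughs).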

25
Q

What is a functional relationship?

A

How endogenous variables are produced; studying the cause (D) and effect (Y)

26
Q

What is the difference between parametric and non-parametric functional forms?

A

Parametric models contain more assumptions about the nature of the relationship between cause and effect.

Non-Parametric models simply state that there is a relationship but we don’t know why or how. DAGs conceptualise this.

27
Q

What are DAGs trying to do?

A

Represent causal relationships between variables. Graphical representation of models informed by theory.

28
Q

What are backdoor paths?
How do you close them?

A

Backdoor paths are non-causal paths between our treatment and outcome (a source of bias).

Two things to know when closing open backdoors:

1. If the path runs through a confounder (X), condition on it by ‘adding’ a control.

2. If the path runs through a collider, the backdoor path is already closed. NEVER control for colliders, as conditioning on a collider opens the path.

If all backdoor paths have been closed, then we have met the backdoor criterion and you can credibly argue for causal inference.

29
Q

OBSERVATIONAL CAUSAL

What are the assumptions for instrumental variables?

A

Exogeneity: there is no confounder between Z and Y

Excludability (No other direct effect): Z only affects Y through D

Monotonicity: The effect of the instrument on treatment is 0 or positive for all units.

Assumptions should be based on the theory/model.

30
Q

ANSWER STRATEGY

In hypothesis testing, there is always uncertainty and error. What two types are there?

A

TYPE I ERROR: false positive, we rejected the null hypothesis when it is actually true.
(Eg, lump declared cancer (H1). You go through chemo, it wasn’t cancer, you die)

TYPE II ERROR: false negative, we fail to reject the null hypothesis when it is actually false.
(Eg, lump wasn’t declared cancer (H0), you don’t go through chemo, you die)

31
Q

What is the difference between causal and descriptive research design?

A

Causal research: identify cause and effect relationship between variables

Descriptive research: identify relationships that are not necessarily causal.

32
Q

What is the difference between experimental and observational research design?

A

Experimental: Conditions have been manipulated by us to understand the relationship between two variables

Observational: We observe naturally occurring variation to understand the relationship between two variables

33
Q

EXPERIMENTAL DESCRIPTIVE

What are they?
What is the treatment assignment?
Give examples.

A

The inquiry is descriptive, but researcher assigns the treatment/manipulates conditions.

So, how is it not causal? Well, we aren’t looking to identify causal relationships, we are trying to find a latent variable that might explain the relationship between two variables (like discriminatory views, ideological support, etc)

Examples: Audit, List, Conjoint experiments and Behavioural games

34
Q

EXPERIMENTAL CAUSAL

What are they?
What is the treatment assignment?
Give examples.

A

Experimental causal research design tries to understand causal relationships by comparing what actually happened with what would have happened if the conditions were different (comparing counterfactual states). Random assignment is used to manipulate conditions.

Examples:
Two Arm Randomised experiments

Two Arm design with pre treatment covariates

Block Randomised experiments

Cluster randomised experiments

Subgroup designs

Factorial experiments

Encouragement design

Placebo-controlled experiment

Stepped wedge experiments

Randomised saturation experiments

35
Q

OBSERVATIONAL DESCRIPTIVE

What are they?
What is the treatment assignment?
Give examples.

A

The inquiry is descriptive and no treatment is assigned, because we only want to describe what we observe.

Examples: Surveys, official statistics, etc.

What matters is our sampling method as it will dictate how accurate our observations are.

36
Q

Why is sampling important in observational descriptive designs? Give examples of sampling methods.

A

Our sampling method will dictate how accurate our observations are.

Examples:
Cluster sampling

Intra-Cluster Correlation (ICC)

Multilevel regression and Post stratification

Partial pooling

Index Creation for latent variables

37
Q

OBSERVATIONAL CAUSAL

What are they?
What is the treatment assignment?
Give examples.

A

Our inquiry is causal but we use data that already exists. Researchers do not assign treatment, the ‘natural world’ does, be it random or not.

Example:
Process Tracing

Selection on Observables (multivariate regression)

Difference in Difference (DiD)

Instrumental Variables ‘two stage process’

38
Q

What is the fundamental problem of causal inference?

A

We want to understand the difference between the potential outcomes of each unit (treated vs. untreated), but we can only ever observe one of those outcomes per unit.
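
A small made-up example may help here. The sketch below assumes we could see both potential outcomes (Y0 if untreated, Y1 if treated) for four hypothetical units, which is never possible in real data, and compares the true average effect with what the observed data alone would suggest.

```python
# Hypothetical potential outcomes: Y0 = outcome if untreated, Y1 = outcome if treated.
# D records which condition each unit actually ended up in (not randomly assigned here).
units = [
    {"id": 1, "Y0": 3, "Y1": 5, "D": 1},
    {"id": 2, "Y0": 2, "Y1": 2, "D": 0},
    {"id": 3, "Y0": 4, "Y1": 7, "D": 1},
    {"id": 4, "Y0": 1, "Y1": 2, "D": 0},
]

# The true average treatment effect needs BOTH potential outcomes for every unit.
true_ate = sum(u["Y1"] - u["Y0"] for u in units) / len(units)

# In practice we only observe the outcome under the condition a unit received.
for u in units:
    u["Y_obs"] = u["Y1"] if u["D"] == 1 else u["Y0"]

treated = [u["Y_obs"] for u in units if u["D"] == 1]
control = [u["Y_obs"] for u in units if u["D"] == 0]
naive_difference = sum(treated) / len(treated) - sum(control) / len(control)

print(true_ate, naive_difference)
```

In this invented data the treated units would have had higher outcomes even without treatment, so the observed difference in means overstates the true average effect.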

39
Q

What is the potential outcome/counterfactual concept? How does it relate to research design?

A

Potential outcomes are the outcomes of our counterfactuals.
‘What if’ scenario outcomes.

40
Q

Explain Figure 8.2 (Sampling)

A

Figure 8.2: Nine kinds of random sampling. In the first row individuals are the sampling units, in the second row clusters are sampled, in the third clusters are sampled and then individuals within these clusters are sampled. In the first column units are sampled independently, in the second units are sampled to hit a target, in the third units are sampled to hit targets within strata.

41
Q

Explain Figure 8.6 (Stepped-Wedge Random Assignment)

A

Stepped-Wedge Random Assignment Procedure: Imagine you’re doing an experiment where you want to give some people a treatment, like information on how to vote, over several time periods. Instead of giving everyone the treatment at once, you divide them into groups. In the first period, only some get the treatment. Then in each following period, more people get the treatment until everyone has it.
The figure shows how the treatment is assigned over these three periods. For example, in the first period, one third of the units receive the treatment. In the second period, another third receive the treatment, and in the final period, the remaining units are treated.

42
Q

Explain random sampling

A

Every unit has an equal chance of being selected. Design based inference.

Example:

Simple random - ‘every citizen has a 10% chance of being picked’

Stratified Random - Every unit within a group has the same chance of being sampled ‘every citizen under 18 has 10% chance of being picked’

Cluster - randomly pick subgroups within your population, then sample all individuals within each picked group ‘every school in the country has a chance of being picked, 5 were picked and every student in each school was sampled’

Multistage - Take a cluster sample of your population, then take a cluster sample of the first sample, then cluster sample from the second sample. This is your representative sample.
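
The first three procedures can be sketched with Python's standard `random` module. The population, the field names (`school`, `under_18`) and the helper functions are all hypothetical and only meant to show how the procedures differ.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 citizens spread evenly over 10 schools, half under 18.
population = [
    {"id": i, "school": i % 10, "under_18": i < 50}
    for i in range(100)
]

def simple_random_sample(units, n):
    """Every unit has the same chance of being picked."""
    return rng.sample(units, n)

def stratified_sample(units, key, n_per_stratum):
    """Split units into strata by `key`, then draw the same number from each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(units, key, n_clusters):
    """Randomly pick whole clusters, then keep every unit inside the picked clusters."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in chosen]

srs = simple_random_sample(population, 10)
strat = stratified_sample(population, "under_18", 5)
clus = cluster_sample(population, "school", 2)
print(len(srs), len(strat), len(clus))
```

A multistage sample could be built by calling `cluster_sample` first and then `simple_random_sample` on its result.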

43
Q

Explain non random sampling, why would you use it?

A

Selecting individuals for sample. Used for targeting specific groups for study. Model based inference.

Examples

Convenience sampling: Selecting a sample based on what’s available at the moment. Bad external validity, potential bias. Can only analyse Sample Average Treatment Effect (SATE)

Purposive Sampling: hand picking the participants of your study based on specific criteria.

Respondent driven ‘Snowball’ sampling: ask respondents to recommend others.

44
Q

Explain treatment assignment

A

Who gets the treatment and how that is chosen. Similar to sampling. Only used in experimental designs; in observational designs, the treatment has already been given.

45
Q

EXPERIMENTAL CAUSAL

Explain Two Arm Randomised Experiment design with random assignment

A

There are only two conditions: treated and untreated/control.

Examples

Simple random assignment: Everyone has the same chance of getting treatment

Complete random assignment: Everyone has the same chance of getting treatment, but a fixed number of units is assigned to treatment

Block Random assignment: divide population into ‘blocks’ or groups based on specific characteristics (same gender) important to the study; then randomly assign within blocks. Good internal validity

Cluster Random assignment: group people into ‘clusters’ (same neighbourhood), randomly assign whole clusters to treatment, so all individuals in a cluster receive the same condition.

Block and Cluster Assignment: Units are assigned as clusters and clusters are nested within blocks.

Saturation Random Assignment: Different clusters are assigned a saturation level of treatment, you then randomly assign individuals within the clusters to receive the prestated saturation level of treatment.
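
The first four procedures can be sketched as follows. This is a rough illustration using only the standard library; the function names and the example blocks and clusters are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

def simple_assignment(n, prob=0.5):
    """Each unit is treated independently with probability `prob` (a coin flip)."""
    return [1 if rng.random() < prob else 0 for _ in range(n)]

def complete_assignment(n, m):
    """Exactly m of the n units are treated."""
    treated = set(rng.sample(range(n), m))
    return [1 if i in treated else 0 for i in range(n)]

def block_assignment(blocks):
    """Complete assignment inside each block: half of every block is treated."""
    assignment = [0] * len(blocks)
    for label in sorted(set(blocks)):
        members = [i for i, b in enumerate(blocks) if b == label]
        for i in rng.sample(members, len(members) // 2):
            assignment[i] = 1
    return assignment

def cluster_assignment(clusters, n_treated_clusters):
    """Whole clusters are assigned; every unit in a cluster shares its condition."""
    labels = sorted(set(clusters))
    treated = set(rng.sample(labels, n_treated_clusters))
    return [1 if c in treated else 0 for c in clusters]

# Hypothetical sample of eight people.
gender = ["f", "f", "f", "f", "m", "m", "m", "m"]
neighbourhood = ["a", "a", "b", "b", "c", "c", "d", "d"]

simple = simple_assignment(8)
complete = complete_assignment(8, 4)
blocked = block_assignment(gender)
clustered = cluster_assignment(neighbourhood, 2)
print(simple, complete, blocked, clustered)
```

Note that simple assignment can leave the groups unbalanced by chance, while complete and block assignment fix the number treated in advance.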

46
Q

EXPERIMENTAL CAUSAL

Explain Multi Arm design treatment

A

There are more than two conditions: treatment 1, treatment 2, treatment 3, untreated/control

You randomly assign a fixed number of participants to each condition.

47
Q

EXPERIMENTAL CAUSAL

Explain Factorial design treatment

A

A factorial design is when you study the effects of multiple treatments or factors simultaneously. It allows you to see how each treatment affects the outcome on its own and how they interact with each other.

For example, it means studying a group that doesn’t watch a propaganda video but talks with Trump, studying a group that watches the video but doesn’t talk with Trump, studying a group that does both, and studying a group that does neither.

48
Q

EXPERIMENTAL CAUSAL

Explain over time design treatment

A

Over time designs give treatment to units over multiple time periods. Instead of only comparing treated and untreated units, you look at how outcomes change within the same units over time (before and after effects).

Examples

Stepped Wedge design:
Assign treatment to different units in different time periods, gradually increasing until all units have received the treatment.

Crossover design:
Units are initially assigned one treatment, then in a later period, they ‘cross over’ to the opposite treatment

49
Q

What is Non-Randomised Treatment Assignment? Give examples

A

Researchers assign treatment to chosen units without randomisation.

Alternating assignment: Even numbered participants receive treatment, Odd numbered participants don’t receive treatment.

Regression discontinuity/cutoff: A cut off ‘score’ is assigned to the sample. Those above the cutoff receive treatment, those below do not. Researchers compare the outcomes of the two groups.

Bayesian: The Bayesian approach in non randomised assignment uses prior information to make educated guesses about the treatment effectiveness for different individuals. It then ‘optimally’ assigns treatments based on these predictions to maximise the effects of the treatment for each person.

50
Q

What is standard error?

A

The standard deviation of the sampling distribution of an estimate: how far estimates typically fall from their expected value across repeated samples

51
Q

ANSWER STRATEGIES

What are the four different types of answer strategies?

A
  1. Point Estimation: estimate a single value that represents a parameter of interest using descriptive statistics (mean) or a regression coefficient. Then calculate the standard error.
  2. Hypothesis testing: They are quantitative or qualitative tests. We state there is NO relationship (the null hypothesis) and, to get our answer, we hope to reject the null hypothesis.
    We use p-values to give the probability of seeing an estimate at least as extreme as ours if the null hypothesis were true.
  3. Bayesian Formulations: Prior beliefs are incorporated. Bayesian answers are the probabilities of answers being right (ie, the estimand)
  4. Interval Estimation: Rather than a point or probability, we estimate a range of answers where we think the estimand lies with some degree of confidence.
52
Q

ANSWER STRATEGIES

Interval estimation is how we interpret answers in Bayesian, Frequentist and Extreme Bounds designs. Explain how.

A

In Frequentist designs, we use a 95% confidence interval (across repeated samples, 95% of the intervals built this way would contain the estimand)

In Bayesian designs, we use a 95% credible interval: given the prior and the data, there is a 95% chance the estimand is in this range. Because Bayesian designs use prior information, the interval may be narrower.

In Extreme Bounds, it is the best and worst case scenario in our sample - on each end.

53
Q

ANSWER STRATEGIES

What is the difference between Bayesian and Frequentist answer strategies?

A

The main difference is that Bayesian formulations use informed priors while frequentist ones don’t.

54
Q

What is the Linear Regression Equation (OLS)?
Identify all the elements and explain why we might use it.

A

y = β0 + β1x + u

y = outcome variable (dependent)

x = explanatory variable (independent)

u = error term
(difference between observed values and values predicted by the regression; represents all other factors that influence the outcome variable but are not included in the model)

β0 = intercept parameter
(value of the outcome variable when the explanatory variable is zero)

β1 = slope parameter
(the change in the outcome variable for a one unit change in the explanatory variable. Indicates slope strength and direction)

Why do we use it?

  1. We use the linear regression model to minimise the amount of (squared) error in our fitted equation.
  2. It can also help us control for confounding variables.
  3. Measure significance of relationships.
  4. Make predictions about future outcomes from what we know about the relationship and provide a basis for future models.
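
For a single explanatory variable, the OLS intercept and slope have a simple closed form. The sketch below computes them by hand on made-up data; `ols_fit` is a hypothetical helper name, and in practice a statistics package would be used instead.

```python
def ols_fit(x, y):
    """Return (intercept, slope) that minimise the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data that roughly follow y = 1 + 2x.
x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

beta0, beta1 = ols_fit(x, y)

# Residuals: observed values minus the values predicted by the fitted line.
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
print(round(beta0, 2), round(beta1, 2))  # prints 1.06 1.97
```

With an intercept in the model, the residuals always sum to zero, which is a quick check that the fit is right.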
55
Q

ANSWER STRATEGY

How do you choose answer strategies?

A

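Plug in principle: Estimate the estimand by applying to the sample data the same function that defines it in the population (eg, estimate a population mean with the sample mean)

Analyse as you randomise: Match the analysis method (answer strategy) to the way units were actually sampled and assigned (data strategy), eg, by accounting for blocks, clusters or unequal probabilities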

56
Q

DECLARE DESIGN

What is bias?

A

Systematic error in estimation; average estimate consistently differs from true value.

Because of other factors affecting outcome that have not been addressed

57
Q

DECLARE DESIGN

What is power?

A

The probability of a statistically significant result.

We want more ‘power’ to reject the null hypothesis which means we need a higher chance to get a statistically significant result.

58
Q

DECLARE DESIGN

What is RMSE?

A

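Root Mean Squared Error is a diagnosand that tells us how far estimates typically fall from the estimand: the square root of the average squared difference between estimate and estimand. It reflects both bias and variance.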
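
Bias, power and RMSE can all be approximated by simulating a design many times and summarising the estimates. The sketch below does this in plain Python rather than with DeclareDesign itself. The true effect, sample size and number of simulations are assumptions chosen for illustration, and the significance test uses a rough normal approximation.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the sketch is reproducible

TRUE_ATE = 0.5   # the estimand, set by assumption in this simulation
N_UNITS = 100
N_SIMS = 2000

def run_once():
    """Simulate one two-arm experiment; return (estimate, is_significant)."""
    treated_ids = set(rng.sample(range(N_UNITS), N_UNITS // 2))
    treated, control = [], []
    for i in range(N_UNITS):
        noise = rng.gauss(0, 1)
        if i in treated_ids:
            treated.append(TRUE_ATE + noise)
        else:
            control.append(noise)
    estimate = statistics.fmean(treated) - statistics.fmean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return estimate, abs(estimate / se) > 1.96

results = [run_once() for _ in range(N_SIMS)]
estimates = [est for est, _ in results]

# Bias: average estimate minus the estimand.
bias = statistics.fmean(estimates) - TRUE_ATE
# RMSE: square root of the average squared distance from the estimand.
rmse = statistics.fmean([(est - TRUE_ATE) ** 2 for est in estimates]) ** 0.5
# Power: share of simulations with a statistically significant result.
power = statistics.fmean([1.0 if sig else 0.0 for _, sig in results])

print(round(bias, 3), round(rmse, 3), round(power, 3))
```

Because treatment is randomly assigned, the bias should be close to zero, so here the RMSE mostly reflects sampling variability.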

59
Q

Explain the elements in the DeclareDesign package

A
  1. declare_model()
    Includes, number of units, unit characteristics, potential outcomes, treatment effect sizes, effect heterogeneity
  2. declare_inquiry()
    Specifies the research question and the answer you’re seeking (the estimand). This could be a causal or descriptive inquiry. Can declare estimands for groups (CATEs, difference in CATEs)
  3. declare_sampling()
    Outlines the procedures used to select your sample units from the population.
  4. declare_assignment()
    Defines how treatments or conditions are assigned to the units in your study.
  5. declare_measurement()
    Details the procedures used to measure the variables in your research (eg, index creation)
  6. declare_estimator()
    Specifies the method for estimating the answer (e.g., linear regression, point estimate, interval estimate) based on your data and inquiry.
  7. declare_test()
    Defines the statistical tests used to assess the significance of your findings. Once declared, all these steps are put together into one design.
60
Q

OBSERVATIONAL DESCRIPTIVE

What is ICC? What does it mean if the coefficient is high, what does it mean if coefficient is low, what is the range?

A

Intra-cluster correlation (ICC) is the share of the total variation in outcomes that is due to differences between clusters rather than within them. It ranges from 0 to 1.

If ICC is high (close to 1) then it means that most variation in outcomes is due to differences between clusters

If ICC is low (close to 0) then it means that most of the variation in outcomes is due to differences between individuals within clusters.

Example:
Imagine we’re studying political preferences in neighborhoods (clusters) within a city.
We randomly select several neighborhoods and survey people in each one about their political views.

Where ICC is high, as in a city of tight-knit neighborhoods, most people within a neighborhood have similar political preferences. So, if ICC = 0.8, it means 80% of the variation in political preferences is due to differences between neighborhoods, and only 20% is due to differences between individuals within the same neighborhood.

On the other hand, where ICC is low, as in a city of diverse or transient neighborhoods, people within a neighborhood have very different political preferences. So, if ICC = 0.2, it means only 20% of the variation in political preferences is due to differences between neighborhoods, and 80% is due to differences between individuals within the same neighborhood.
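
The variance-share reading of ICC can be sketched as below. This is a simple descriptive version (between-cluster variance divided by between plus within), not the ANOVA or mixed-model estimator a statistics package would report, and the neighborhood scores are invented.

```python
import statistics

def icc(clusters):
    """Share of total variance that lies between clusters (simple descriptive version)."""
    cluster_means = [statistics.fmean(c) for c in clusters]
    between = statistics.pvariance(cluster_means)
    within = statistics.fmean([statistics.pvariance(c) for c in clusters])
    return between / (between + within)

# Hypothetical political-preference scores, three people in each of three neighborhoods.
tight_knit = [[1, 1, 2], [5, 5, 6], [9, 9, 10]]   # neighbors agree with each other
diverse = [[1, 5, 9], [2, 6, 10], [1, 6, 9]]      # neighbors disagree with each other

print(round(icc(tight_knit), 2), round(icc(diverse), 2))
```

The first city gives an ICC close to 1 and the second an ICC close to 0, matching the two cases described above.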

61
Q

OBSERVATIONAL DESCRIPTIVE

What is partial pooling research design approach? What does it do?

A

A balance between full pooling and no pooling. It produces both shared and group-specific estimates within a hierarchical model, borrowing strength from both the pooled data and each group’s own data. It is used in MRP.

It is a statistical technique used to make better estimates by borrowing information from similar groups or clusters when we have limited data from each individual group.

For example, you’re trying to measure public opinion in different states, but don’t have enough respondents from each state. You combine the information from all states and, using a 2 step process (MRP), you get a better estimate for each individual state.

You fit a multilevel regression model and then adjust its estimates based on what we know about the population of each state.

62
Q

OBSERVATIONAL DESCRIPTIVE

What is multi-level regression and post stratification (MRP)?

A

MRP is used to estimate population level quantities for smaller subgroups

For example, we want to analyse public opinion of high school graduates and non graduates across the US.

First, multilevel regression is used to estimate average opinions of these two groups within each state. Because estimates are partially pooled, states with little data (Wyoming) can borrow information from states with a lot (California).

Second, poststratification is used to ‘reweight’ these estimates to match the proportions of each group within each state. So if HS graduates make up a larger share of the population in California than in Wyoming, their estimate gets more weight in California.
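
The poststratification step is just a weighted average. In the sketch below, the cell estimates stand in for the output of the multilevel regression and the population counts stand in for census figures; every number is invented for illustration.

```python
# Hypothetical estimates of support in each state-by-education cell
# (these would come from the multilevel regression step).
cell_estimates = {
    ("California", "graduate"): 0.60,
    ("California", "non-graduate"): 0.40,
    ("Wyoming", "graduate"): 0.50,
    ("Wyoming", "non-graduate"): 0.30,
}

# Hypothetical population counts for the same cells (e.g. from a census).
cell_counts = {
    ("California", "graduate"): 800,
    ("California", "non-graduate"): 200,
    ("Wyoming", "graduate"): 30,
    ("Wyoming", "non-graduate"): 70,
}

def poststratify(state):
    """Weight each cell estimate by its share of the state's population."""
    cells = [cell for cell in cell_estimates if cell[0] == state]
    total = sum(cell_counts[cell] for cell in cells)
    return sum(cell_estimates[cell] * cell_counts[cell] for cell in cells) / total

california = poststratify("California")
wyoming = poststratify("Wyoming")
print(round(california, 2), round(wyoming, 2))  # prints 0.56 0.36
```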

63
Q

Understand treatment assignment in an observational design (quasi experimental approach)

A

In observational descriptive, no treatments are allocated by the researcher.

In observational causal, the natural world is responsible for the observed treatment conditions.
(Causal inference imagines the outcomes had they not been treated)

64
Q

OBSERVATIONAL CAUSAL

Challenges of observational causal treatment assignment

A

Since the natural world provides the treated conditions and researchers simply observe the treated conditions and try to make causal inferences on counterfactuals, it is difficult to infer causality because it relies on specific circumstances to occur in the natural world.

Researchers don’t have the ability to control the treatment assignment at all.

Process tracing necessitates finding the right clues to understand causality.

Selection-on-observables requires measuring variables that eliminate alternative explanations for the observed outcomes.

Difference-in-difference designs need stable patterns over time to make valid comparisons.

Instrumental variables designs rely on nature randomly assigning a variable that can be measured.

Regression discontinuity designs hinge on a clear cutoff point that determines treatment.

65
Q

OBSERVATIONAL CAUSAL

What is process tracing? What is the hoop clue and smoking gun?

A

Process tracing is a qualitative research method where we try to observe data (usually in a case study) to see whether there is a causal relationship or not.
(eg, trade preference = increase in exports?)

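A ‘hoop clue’ is a piece of evidence that must be present if the causal claim is true. Finding it only keeps the claim alive; not finding it seriously undermines the claim.
(eg, customs declaration forms that used trade preference)

A ‘smoking gun’ is a piece of evidence that is very unlikely to be seen unless the causal claim is true. Finding it strongly confirms the causal relationship; not finding it does not rule it out.
(eg, exporters using trade preference leads to higher exports)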

66
Q

OBSERVATIONAL CAUSAL

What is DiD?
What are the assumptions of DiD? (parallel trends assumption)

A

DiD evaluates the causal effect of a treatment. It estimates the ATT (average treatment effect on the treated) by comparing the before and after change in the treated group to the before and after change in a control group.

The DiD measures the difference between pre and post treatment outcomes in the treated group and then subtracts the same difference in the control group.

An important assumption of the DiD design is the ‘parallel trends assumption’. This assumption states that the changes in outcomes over time would have been the same for the treated and untreated groups if the treatment had not been implemented.

DiD is often used in settings with two time periods (before and after) and two groups (treated and untreated). For more complex studies with multiple periods/groups, you can use panel data analysis but the parallel trends assumption becomes harder to justify.
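
A minimal sketch of the two-period, two-group DiD arithmetic, using made-up group means:

```python
# Hypothetical average outcomes for each group and period.
means = {
    ("treated", "before"): 10.0,
    ("treated", "after"): 16.0,
    ("control", "before"): 8.0,
    ("control", "after"): 11.0,
}

# Change over time within each group.
change_treated = means[("treated", "after")] - means[("treated", "before")]
change_control = means[("control", "after")] - means[("control", "before")]

# Under parallel trends, the control group's change stands in for what would
# have happened to the treated group without treatment.
did_estimate = change_treated - change_control
print(change_treated, change_control, did_estimate)  # prints 6.0 3.0 3.0
```

The treated group improved by 6, but 3 of that would have happened anyway if the parallel trends assumption holds, leaving an estimated ATT of 3.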

67
Q

OBSERVATIONAL CAUSAL

Discuss the instrumental variable (IV) approach.
What kind of treatment effect does it focus on?

A

The IV approach aims to study the effect of a treatment on an outcome. But confounding variables can influence both treatment and outcome. The IV approach gets around this by using only the variation in treatment that is caused by an exogenous instrument, which the confounders do not affect.

For example, you want to see how studying affects good grades. But you cannot tell 50% of the class to study harder (assign treatment). So you might look (observe) at naturally talented (instrument) students who tend to study more (treatment) than others. Ideally, being talented shouldn’t affect your grades (outcome) other than through studying more.

It’s a two step approach.
1. Identify the effects of instrument (Z) on treatment (D)
See how talent affects studying: talented students generally study more than non talented students - talent ‘nudges’ student to study more

2. Isolate the effect of treatment (D) on outcome (Y) by using only the part of the treatment that is caused by Z.
Assuming talent only affects grades through studying, we can use the difference in studying habits between talented and non talented students to estimate the true effect of studying on grades.

The effect of the treatment on outcome is only measurable for compliers = Local Average Treatment Effect (LATE)

LATE:
In an IV approach there are four groups:
Compliers (Instrument nudges treatment in the right direction)
Never takers (Population that doesn’t take treatment regardless of instrument)
Always takers (Population that always takes treatment regardless of instrument)
Defiers (Instrument nudges treatment in wrong direction).
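
With a binary instrument, the two steps reduce to a ratio: the effect of Z on Y divided by the effect of Z on D (often called the Wald estimator). The simulation below is a sketch under assumed numbers: the true effect, the strength of the instrument and the unobserved confounder are all invented, and there are no defiers by construction.

```python
import random
import statistics

rng = random.Random(11)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0   # effect of D on Y, set by assumption
N = 40000

rows = []
for _ in range(N):
    z = rng.randint(0, 1)   # instrument, as-if randomly assigned
    u = rng.gauss(0, 1)     # unobserved confounder affecting both D and Y
    # Treatment is more likely when z = 1 (the nudge) and when u is high.
    d = 1 if 0.8 * z + 0.5 * u + rng.gauss(0, 1) > 0.4 else 0
    # Z affects Y only through D (exclusion restriction holds by construction).
    y = TRUE_EFFECT * d + 1.5 * u + rng.gauss(0, 1)
    rows.append((z, d, y))

def mean_by(column, condition_column, value):
    """Average of one column among rows where another column equals `value`."""
    return statistics.fmean([r[column] for r in rows if r[condition_column] == value])

# Step 1: effect of the instrument on treatment.
first_stage = mean_by(1, 0, 1) - mean_by(1, 0, 0)
# Effect of the instrument on the outcome.
reduced_form = mean_by(2, 0, 1) - mean_by(2, 0, 0)
# Step 2: scale the outcome difference by the treatment difference.
iv_estimate = reduced_form / first_stage

# Naive comparison of treated and untreated units, biased by the confounder.
naive_estimate = mean_by(2, 1, 1) - mean_by(2, 1, 0)

print(round(first_stage, 2), round(iv_estimate, 2), round(naive_estimate, 2))
```

The naive comparison should overstate the effect because the confounder raises both treatment and outcome, while the IV ratio should land near the assumed true effect.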

68
Q

OBSERVATIONAL CAUSAL

What are Regression Discontinuity Designs (RDD)?
What kind of treatment effects do you get from an RDD?

A

RDD designs assign treatment based on a cutoff point of a certain characteristic (running variable). If you are above the cut off point you get treatment (D) and if not, you don’t receive it.
The study focuses on those slightly below the threshold (control) that didn’t get the treatment, and those slightly above the threshold (received treatment, but not that much different from those that didn’t).
Therefore it only estimates a LATE: the effect for units close to the cutoff.
It assumes that the potential outcomes are smooth across the threshold (ie, if there was no treatment, there would be no sudden jump in Y at the threshold)

69
Q

EXPERIMENTAL CAUSAL

What are Two Arm Randomised Experiments?
What kind of treatment effects do you get from a Two Arm design

A

Two arm randomised experiment designs are when subjects are randomly assigned to the treatment or the control. The researcher then measures the difference in outcomes.
There are different strategies of random assignment (simple, block, cluster, etc)
We assume the Stable Unit Treatment Value Assumption (SUTVA): a unit’s outcome depends only on its own treatment status and doesn’t change just because other units got the treatment.

It can only make inferences to the sample, therefore it can only measure the Sample Average Treatment Effect (SATE). Internal validity is good, external validity is dependent on how representative the sample is of the broader population.

70
Q

EXPERIMENTAL CAUSAL

What are Subgroup Designs? What kind of treatment effect do you get from subgroup designs

A

Subgroup designs are used to study how the effect of treatment might differ between groups of people (CATEs)

They are interested in measuring the Conditional Average Treatment Effect (CATE) between subgroups.

CATE is basically the average effect of the treatment for a specific group of people.

Non random sampling + treatment assignment: needs to be representative, subgroups are naturally occurring
Subgroup designs provide a descriptive difference between subgroups, but cannot establish a causal link between being in a subgroup and the size of the treatment effect.

Similar to factorial designs, except that in a factorial design the second factor is also randomly assigned, whereas subgroup membership is not.
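
A toy sketch of how CATEs could be compared across subgroups, using made-up records. The subgroup labels, outcomes and the `cate` helper are hypothetical. Each CATE is just a difference in means computed inside one subgroup.

```python
from statistics import mean

# Hypothetical records: (subgroup, treated, outcome).
records = [
    ("young", 1, 7), ("young", 1, 9), ("young", 0, 4), ("young", 0, 6),
    ("old", 1, 5), ("old", 1, 5), ("old", 0, 4), ("old", 0, 4),
]

def cate(rows, subgroup):
    """Difference in mean outcomes (treated minus control) within one subgroup."""
    treated = [y for g, d, y in rows if g == subgroup and d == 1]
    control = [y for g, d, y in rows if g == subgroup and d == 0]
    return mean(treated) - mean(control)

cate_young = cate(records, "young")   # 8 - 5 = 3
cate_old = cate(records, "old")       # 5 - 4 = 1
cate_gap = cate_young - cate_old      # descriptive difference between subgroups
print(cate_young, cate_old, cate_gap)
```

The gap between the two CATEs is descriptive: subgroup membership was not randomised, so it does not show that being young causes the larger effect.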

71
Q

EXPERIMENTAL CAUSAL

What are Cluster randomised experiments? What kind of treatment effect do you get?

A

Use cluster random assignment (whole clusters are assigned to treatment or control together) to understand the causal link between treatment and outcome.

It measures the CATE (conditional average treatment effect), as it only measures the treatment effect for the people belonging to the clusters.

Remember: high ICC means units within a cluster are similar (most of the variance is from cluster to cluster); low ICC means most of the variance is within the clusters.
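
A rough sketch of the intra-cluster correlation (ICC), which is commonly defined as the share of total outcome variance that lies between clusters. The estimator below is the one-way ANOVA version for equally sized clusters, floored at 0. The function name and the two toy datasets are assumptions for illustration.

```python
from statistics import mean

def icc_anova(clusters):
    """One-way ANOVA estimate of the ICC for equally sized clusters, floored at 0."""
    k = len(clusters[0])   # units per cluster
    g = len(clusters)      # number of clusters
    grand = mean(y for c in clusters for y in c)
    # Mean square between clusters and mean square within clusters.
    msb = k * sum((mean(c) - grand) ** 2 for c in clusters) / (g - 1)
    msw = sum((y - mean(c)) ** 2 for c in clusters for y in c) / (g * (k - 1))
    return max(0.0, (msb - msw) / (msb + (k - 1) * msw))

# Units inside each cluster look alike; clusters differ a lot from each other.
tight_clusters = [[1.0, 1.1, 0.9], [5.0, 5.1, 4.9], [9.0, 9.1, 8.9]]
# Units inside each cluster differ a lot; cluster averages are nearly identical.
mixed_clusters = [[1.0, 5.0, 9.0], [1.1, 5.1, 9.1], [0.9, 4.9, 8.9]]

icc_high = icc_anova(tight_clusters)
icc_low = icc_anova(mixed_clusters)
print(round(icc_high, 3), round(icc_low, 3))
```

With a high ICC, adding more people from the same cluster adds little new information, which is why cluster designs tend to be less precise than individually randomised ones.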

72
Q

EXPERIMENTAL DESCRIPTIVE

Which MIDA elements are in audit experiments?

A

MODEL: units, assumptions about behaviours

INQUIRY: descriptive, summary of unit characteristics in the population, and not as a function of potential outcomes. Just wants to see proportion of sample that responded to treatment (not trying to infer a causal link, not ATE)

DATA STRATEGY: Random assignment

ANSWER STRATEGY: Difference in means to estimate the inquiry

No counterfactuals. We define the inquiry as a summary of unit characteristics in the population, and not as a function of potential outcomes.

73
Q

EXPERIMENTAL DESCRIPTIVE

Which MIDA elements are in list experiments?

A

MODEL: Units, assumptions about behaviours

INQUIRY: descriptive, prevalence rate (not ATE), summary of unit characteristics in the population, and not as a function of potential outcomes.

DATA STRATEGY: Random assignment

ANSWER STRATEGY: Difference in means to estimate the inquiry

No counterfactuals. We define the inquiry as a summary of unit characteristics in the population, and not as a function of potential outcomes.
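
A simulation sketch of why a difference in means can estimate a prevalence rate here, assuming the usual item-count setup: the control group counts how many of 3 non-sensitive items apply to them, and the treatment group sees the same list plus one sensitive item. The 30% prevalence, the item probabilities and the sample size are made-up values.

```python
import random
from statistics import mean

rng = random.Random(7)
n = 4000
true_prevalence = 0.3  # hypothetical share of people holding the sensitive trait

def baseline_count():
    # Number of "yes" answers to 3 non-sensitive items (assumed 50% likely each).
    return sum(rng.random() < 0.5 for _ in range(3))

# Random assignment to the short list (control) or the long list (treatment).
control = [baseline_count() for _ in range(n // 2)]
treatment = [baseline_count() + (rng.random() < true_prevalence) for _ in range(n // 2)]

# The extra item is the only difference between lists, so the gap in mean counts
# estimates the share of respondents for whom the sensitive item applies.
prevalence_estimate = mean(treatment) - mean(control)
print(round(prevalence_estimate, 3))
```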

74
Q

EXPERIMENTAL DESCRIPTIVE

Which MIDA elements are in conjoint experiments?

A

MODEL: units (survey respondents), behavioural assumptions

INQUIRY: Descriptive. The specific inquiry (AMCE) depends on the researcher’s choices about the attribute levels and randomisation scheme used in the experiment design.

DATA STRATEGY: Design choices, randomisation of attribute levels

ANSWER STRATEGY: descriptive statistics of AMCE

No counterfactuals. We define the inquiry as a summary of unit characteristics in the population, and not as a function of potential outcomes.

75
Q

EXPERIMENTAL DESCRIPTIVE

What are Conjoint experiments? What is satisficing and what issues does it create?

A

Conjoint experiments aim to describe preferences in a hypothetical scenario (no counterfactual). Researchers create profiles with different features, show them side by side, and then ask people to rate or choose between them.
They study the Average Marginal Component Effect (AMCE): the average difference in the outcome from changing one attribute, averaging over all the levels of the other attributes.

Satisficing means that respondents choose the first option that meets their minimum requirements, which can lead to biased results.

Masking happens when an important attribute is left out of the experiment, but satisficing will make it worse.
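
A toy sketch of the AMCE calculation, assuming a balanced design in which every combination of two hypothetical attributes (`experience` and `party`) appears equally often. The ratings and the `amce` helper are invented for illustration.

```python
from statistics import mean

# Hypothetical rated profiles; each attribute combination appears once (balanced design).
profiles = [
    {"experience": "low", "party": "A", "rating": 2},
    {"experience": "high", "party": "A", "rating": 5},
    {"experience": "low", "party": "B", "rating": 3},
    {"experience": "high", "party": "B", "rating": 4},
]

def amce(rows, attribute, level, baseline):
    """Mean rating at `level` minus mean rating at `baseline`,
    averaging over whatever values the other attributes take."""
    at_level = [r["rating"] for r in rows if r[attribute] == level]
    at_base = [r["rating"] for r in rows if r[attribute] == baseline]
    return mean(at_level) - mean(at_base)

amce_experience = amce(profiles, "experience", "high", "low")
print(amce_experience)  # (5 + 4) / 2 - (2 + 3) / 2 = 2.0
```

The effect of experience is 3 for party A and 1 for party B; the AMCE of 2 averages over both, so it depends on which levels of the other attribute the researcher chose to include.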

76
Q

What is covariate adjustment in two arm randomised experiment design?

A

Covariates are measurable characteristics of the participants that are related to the outcome but not caused by the treatment.
(e.g. age (covariate) affects voter turnout (outcome), but we want to measure the effect of phone campaigns (treatment) on voter turnout)

It can help account for pre-existing differences between treatment and control that might influence the outcome.

Covariate adjustment in two arm randomised experiments is used to increase the precision of the estimated treatment effect, not to reduce bias.
(e.g. covariate adjustment can help account for natural variation in voter turnout due to age, so that you can get a more precise estimate of the true effect of the phone banking campaign)

Two types of covariate adjustment methods:
1. Stratified random assignment
2. Statistical adjustment (regression analysis)

The Lin estimator is a specific way to adjust for covariates (a regression that interacts the treatment with the covariates) that is guaranteed, in large samples, to be at least as precise as the unadjusted estimate.
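
A simulation sketch contrasting the unadjusted difference in means with a Lin-style adjusted estimate, assuming one covariate (age), a made-up outcome scale and a constant treatment effect of 2. With a single covariate, the interacted regression can be computed as two separate line fits (one per arm) compared at the overall mean of the covariate, which is the shortcut used here. All names and numbers are illustrative.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(1)
n = 1000
age = [rng.uniform(18, 80) for _ in range(n)]
treated = set(rng.sample(range(n), n // 2))  # complete random assignment

# Hypothetical outcome: rises with age, plus a constant treatment effect of 2.
outcome = [0.5 * age[i] + (2.0 if i in treated else 0.0) + rng.gauss(0, 1)
           for i in range(n)]

t_idx = [i for i in range(n) if i in treated]
c_idx = [i for i in range(n) if i not in treated]

# Unadjusted estimate: plain difference in means.
unadjusted = mean(outcome[i] for i in t_idx) - mean(outcome[i] for i in c_idx)

# Lin-style adjustment: one line per arm, compared at the overall mean age.
a1, b1 = fit_line([age[i] for i in t_idx], [outcome[i] for i in t_idx])
a0, b0 = fit_line([age[i] for i in c_idx], [outcome[i] for i in c_idx])
mean_age = mean(age)
lin_estimate = (a1 + b1 * mean_age) - (a0 + b0 * mean_age)

print(round(unadjusted, 2), round(lin_estimate, 2))
```

Both estimators target the same effect because assignment is random; the adjusted one removes outcome variation explained by age, so it tends to vary less from one randomisation to the next.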

77
Q

What are the four compliance types?

A

Complier (takes treatment if assigned to it, does not take it if not assigned)
Defier (does the opposite of the assignment: refuses treatment if assigned, takes it if not assigned)
Always taker (takes treatment whether assigned or not)
Never taker (never takes treatment, whether assigned or not)
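
The four types can be read off from what a unit would do under each assignment. A minimal sketch, with a hypothetical function name and boolean inputs; in real data only one of the two behaviours is observed per unit, so types cannot be identified individually.

```python
def compliance_type(takes_if_control: bool, takes_if_assigned: bool) -> str:
    """Classify a unit from its (hypothetical) treatment uptake under each assignment."""
    if takes_if_assigned and not takes_if_control:
        return "complier"
    if takes_if_control and not takes_if_assigned:
        return "defier"
    if takes_if_control and takes_if_assigned:
        return "always taker"
    return "never taker"

for control_behaviour in (False, True):
    for assigned_behaviour in (False, True):
        print(control_behaviour, assigned_behaviour,
              compliance_type(control_behaviour, assigned_behaviour))
```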

78
Q

ELEMENTS OF INQUIRY

Explain ATE, CATE, LATE, SATE, ATT, ATU, ITT
Which design uses which?

A

ATE = Average Treatment Effect
Average difference in outcomes between treatment and control. Considers all units in the study
Used in
- experimental causal (block random, stepped wedge, factorial)
- experimental descriptive (audit, conjoint)
- observational causal (instrumental variable)

CATE = Conditional Average Treatment Effect
Refers to the average treatment effect for a specific subgroup which is defined by some condition
Used in
- Experimental causal (subgroup, factorial, randomised saturation)

LATE = Local Average Treatment Effect
Refers to the average treatment effect of compliers only
Used in
- Observational causal (Instrumental variables)
- Experimental causal (Encouragement design)

ATT = Average Treatment Effect of the Treated
Refers to the average treatment effect of units who received the treatment
Used in
- Observational causal (DiD)

SATE = Sample Average Treatment Effect
Refers to the average difference in outcomes limited to the sample only (internal validity)

ATU = Average Treatment Effect on the Untreated
Refers to the average treatment effect on units who did not receive the treatment

ITT = Intention-to-Treat effect
Refers to the average difference in outcomes between units assigned to treatment and units assigned to control, regardless of whether they ACTUALLY received the treatment. Reflects only the effect of being assigned to a treatment group, not the effect of the treatment itself.
Used in
- Observational causal (instrumental variables)
- Experimental causal (encouragement design)
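
A toy sketch of how ATE, ATT and ATU differ, assuming we could see both potential outcomes for every unit (which is never possible in real data). The four units and their values are made up.

```python
from statistics import mean

# Hypothetical units: (outcome if untreated, outcome if treated, actually treated?).
units = [
    (1, 3, True),
    (2, 5, True),
    (0, 1, False),
    (4, 4, False),
]

effects = [(y1 - y0, treated) for y0, y1, treated in units]

ate = mean(e for e, _ in effects)                  # all units
att = mean(e for e, treated in effects if treated)      # units that received treatment
atu = mean(e for e, treated in effects if not treated)  # units that did not

print(ate, att, atu)  # 1.5 2.5 0.5
```

The three estimands only coincide when the treated and untreated units have the same average effect, which random assignment makes true in expectation.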

79
Q

EXPERIMENTAL CAUSAL

What are stepped wedge designs? How do they differ from other designs?

A

Treatment is assigned gradually in multiple stages.
Units are divided into groups, and treatment is randomly assigned to one group at a time. In each period, a portion of units receives the treatment for the first time. Everyone eventually gets the treatment, and researchers measure the outcomes for each group.

Unlike two-arm designs, stepped wedge designs roll out treatment gradually.
Unlike DiD, stepped wedge designs use randomisation to mitigate bias.
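
A minimal sketch of a stepped wedge rollout, assuming 12 hypothetical units dealt at random into 3 equally sized waves, where wave k starts treatment in period k and stays treated afterwards. Function names and the number of waves are illustrative.

```python
import random

def stepped_wedge_schedule(units, n_waves, seed=0):
    """Randomly order the units and deal them into waves 1..n_waves.
    Returns a dict mapping each unit to the period in which its treatment starts."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: (i % n_waves) + 1 for i, unit in enumerate(shuffled)}

def is_treated(schedule, unit, period):
    # Once a unit's wave has started, it remains treated in every later period.
    return period >= schedule[unit]

schedule = stepped_wedge_schedule(range(12), n_waves=3)
treated_per_period = [
    sum(is_treated(schedule, unit, period) for unit in range(12))
    for period in (1, 2, 3)
]
print(treated_per_period)  # [4, 8, 12]
```

Which units land in which wave is random, but the counts are fixed: a new group of 4 starts treatment each period until everyone is treated.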