Frosi Flashcards

1
Q

Which are the 3 Information Content levels of analysis?

A

Correlations: This level involves identifying relationships where variables move together, either in the same or opposite directions. An example given is the correlation between ice cream consumption and crime rates.

Causation: At this level, variables are linked through cause-and-effect relationships, providing actionable information. This type of analysis often involves interventions, randomized controlled trials (RCTs), or A/B tests. An example is the causal relationship between aspirin use and the reduction of migraine pain.

Mechanisms: The most in-depth analysis involves experiments that reveal the underlying mechanisms of observed phenomena, leading to a broader understanding of laws or regularities across different contexts. For instance, it’s not just that aspirin reduces migraines, but the mechanism involves the dilation of blood vessels, which increases blood flow and affects various types of pain.

2
Q

Which are the 3 types of experiments we can have?

A

Randomized Experiments: These are controlled, transparent experiments where subjects are randomly assigned to treatment or control groups. This randomization ensures that the treatment is uncorrelated with other potential confounding variables. Examples include clinical trials in life sciences and field experiments in social sciences.

Natural Experiments: These rely on naturally occurring events or treatments that are effectively random, such as natural disasters or policies randomly assigned by institutions. Examples include the cholera outbreak in London and the Vietnam draft lottery.

Quasi-Experiments: These are experiments where the treatment is assigned by near-random processes, but not through an explicit random assignment by the researcher. Examples include changes in institutional rules that affect subjects differently, like U.S. congressional districts where a candidate narrowly wins.

3
Q

Which are the 3 critical assumptions necessary for the causal inference of treatment effects?

A

Random Assignment: This assumption ensures that the assignment of individuals to treatment or control groups is independent of any potential outcomes. This helps establish the causality by eliminating selection bias.

Excludability: Also known as the “exclusion restriction,” this assumption posits that only the treatment influences the outcomes, and all other factors are excluded. This assumption can fail if there are confounding factors (other variables that affect the outcome and are associated with the treatment) or asymmetries in measurement (differences in how outcomes are measured across participants).

Non-interference: Also referred to as the Stable Unit Treatment Value Assumption (SUTVA), it means that the treatment of one unit does not affect the outcomes of another unit. There should be no spillover effects or interactions between units that could confound the treatment effect.

4
Q

In which of the 3 types of studies can the treatment variable D be considered exogenous?

Recall the 3 assumptions

A

Recall the 3 assumptions: random assignment, excludability, SUTVA.

A) In a Randomized Experiment, these three assumptions are typically satisfied by design.

B) In Natural or Quasi-Natural Experiments, these assumptions may hold due to the random-like nature of the treatment assignment, such as when using instrumental variables.

C) With Observational Data, these assumptions do not naturally hold, and researchers must employ robust identification strategies (like matching, difference-in-differences, etc.) to claim causality.

5
Q

1) Which are the 4 types of random assignment?

2) Why is randomization necessary?

3) How do we check for covariate balance?

A

1)
1. Simple randomization: divide the sample into 2 groups based on a straightforward decision rule (heads or tails, random number generator)
2. Stratified randomization: divide the sample by strata (e.g., young vs. old) and then randomly allocate individuals into treated and control groups within each stratum
3. Paired randomization: pair units together and randomize within the pairs (an extreme case of stratification – e.g., 1 control + 1 treatment per pair)
4. Clustered randomization: partition the population into clusters (as in stratified) but assign treatment to random clusters rather than to random units within clusters (e.g., classrooms)

2)
The randomized assignment of experimental units to conditions ensures that groups are balanced on covariates that could affect the outcome. Our two groups should thus have the same average level of every other variable we can measure: same average age, same gender distribution, and so on.

Why do we want balanced groups?
Because our aim is to “isolate” the average effect of the treatment: the only thing that differs between the two groups is whether the units received the treatment or not.

3)
To check for covariate balance, we usually compare the group averages of each covariate with a t-test.
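A small sketch of such a balance check in Python (hypothetical data; the covariate names `age` and `female` are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical experiment: 200 units randomly assigned to treatment (D=1) or control (D=0)
D = rng.integers(0, 2, size=200)
age = rng.normal(35, 10, size=200)     # covariate 1
female = rng.integers(0, 2, size=200)  # covariate 2

# Compare covariate means across groups; large p-values suggest balance
for name, x in [("age", age), ("female", female)]:
    t, p = stats.ttest_ind(x[D == 1], x[D == 0], equal_var=False)
    print(f"{name}: t = {t:.2f}, p = {p:.3f}")
```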

6
Q

What are the advantages of a Random Assignment?

A
  • Groups are equal in expectation on relevant variables at pre-test (i.e., before treatment) -> run some tests to check whether randomization has worked!
  • Alternative causes are not confounded with the treatment
  • Confounding variables are unlikely to be correlated with the treatment
  • Error terms are uncorrelated with treatment variables: in an experimental design, random assignment distributes both observed and unobserved variables evenly between the treatment and control groups
  • The selection process is known and can be modeled
7
Q

How can we establish causality with Natural experiments?

List the techniques to create a valid control group

A

To establish causality, it’s necessary to construct a reasonable control group using methods such as:
* Matching: Pairing subjects in the treatment group with similar subjects in the control group.
* Synthetic Controls: Creating a composite control group that approximates the characteristics of the treatment group.
* Differences-in-Differences (DiD): Comparing the changes in outcomes over time between the treatment and control groups.
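As an illustration of the DiD logic, a small sketch in Python with invented group means (all numbers hypothetical):

```python
import pandas as pd

# Hypothetical mean outcomes before (post=0) and after (post=1) an event
df = pd.DataFrame({
    "treated": [0, 0, 1, 1],
    "post":    [0, 1, 0, 1],
    "y":       [10.0, 11.0, 9.0, 13.0],
})

# DiD = (treated post - treated pre) - (control post - control pre)
m = df.pivot(index="treated", columns="post", values="y")
did = (m.loc[1, 1] - m.loc[1, 0]) - (m.loc[0, 1] - m.loc[0, 0])
print(did)  # 3.0: the treated group gained 4, the control group gained 1
```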

8
Q

How can we establish causality with Quasi-experiments?

A

These experiments involve conditions where the assignment is not random. There are two main types of assignment:
* Self-selection: Subjects choose for themselves whether to be in the treatment group (e.g., enrolling in a program).
* Non-random selection: An administrator makes the selection (e.g., a school principal assigning teachers to classes).

Because selection is not random, it’s not guaranteed that the treatment and control groups are equivalent, and the observed effects might have alternative explanations.

To address these issues, researchers need to apply theory and logic, along with econometric techniques, to rule out these alternative explanations and attempt to isolate the effect of the treatment.

9
Q

Which are the 2 types of validity?

A

With regard to experiments, there are two main validity types (Shadish et al., 2002):
* Internal Validity: is the effect we find really causal (i.e., does the observed covariation between the treatment and the outcome result from a causal relationship)?
* External Validity: would the results hold in other settings (i.e., would the causal relationship between X and Y hold for different people, treatments, outcome measures, and settings)?
10
Q

Even though randomization solves selection bias, which are the 8 other issues that could arise even after a correct randomization?

A
  1. History
    External events occurring concurrently with treatment could cause the observed effect
    Example: we run a study to understand consumer preferences within a specific market segment at the time a new product comes out on the market -> our results will likely be influenced by this new entry (even if mitigated by randomization)
  2. Maturation
    Naturally occurring changes over time could be confused with a treatment effect
    Example: people get better over time not because of a treatment (medicine), rather because they get better by themselves. This is also mitigated by randomization
  3. Attrition and Mortality
    Loss of respondents to treatment or to measurement can produce artificial effects if this loss is systematically correlated with certain traits.
    Example: if one treatment (e.g., a course) is too difficult, we might see a higher dropout rate than in the control group for people with lower pre-entry scores -> this means that we might end up overestimating the effect of the course when we analyze the post-course results, since we only have data about people who stayed in the program!
  4. Testing
    Exposure to a test can affect scores on subsequent exposures to that test, an occurrence that can be confused with a treatment effect
    Example: taking GMAT multiple times.
  5. Instrumentation
    The nature of a measure may change over time or across conditions in a way that could be confused with a treatment effect
  6. Spillovers/ Contamination
    When the treatment has some side effects also on the control group
  7. Partial Compliance
    Only a fraction of individuals who were offered the treatment might actually take it or “absorb” it (or some members of the control group might manage to get the treatment)
  8. The Hawthorne and John Henry Effects
    The mere fact of being under evaluation forces subjects to change their behavior
11
Q

When can we claim external validity?

A

To claim that our results have external validity, meaning that they can be generalized to the whole population, we ideally need a random sample from that population

12
Q

What is the difference between Lab Experiments and Field Experiments?

What are the implications for validity in the two cases?

A

Lab Experiments:
* Characterized by high control over variables with the only variation being the treatment itself.
* Common in natural and life sciences, e.g., experiments with mice, where all conditions are kept constant except for the treatment. In social sciences, participants are often students or people who know they are part of a study.
* Individuals are aware that they are being observed by researchers.

Field Experiments:
* Occur in less controlled but more realistic environments.
* Derived from agricultural sciences but also used in social sciences with participants like legislators, managers, entrepreneurs who continue their daily activities during the study.
* Designed to be as unobtrusive as possible, often with participants unaware they are part of a study.

Implications for Validity:
Internal Validity Concerns: In field experiments, it is challenging to control for all external variables that participants might encounter in their normal environments. This makes it harder to establish a causal link between the treatment and the outcome.
External Validity Concerns: Lab experiments raise questions about how well the results can be generalized to real-world settings since the conditions are highly controlled and may not reflect the complexity of the outside world.

13
Q

Should we use pre-treatment values of covariates or post-treatment values of covariates?

A

Ideally, we add pre-treatment values of covariates: we do not want factors that can be affected by the treatment!

14
Q

What is the Key Assumption behind Difference in Differences?

A

The key assumption we are making is that treated units would have experienced the same change in mean outcomes over time as that actually observed among the untreated units

In cases of no-randomization, it is easy to include controls that help mitigate concerns of omitted variables. Be sure not to include variables that are themselves an outcome of the treatment: ideally, include pre-treatment (stable) characteristics

15
Q

How do fixed effects work in DiD?

A

Fixed Effects are nothing more than dummies that we add to our regression when we are in a panel setting (i.e., repeated observations over time for multiple units).
We control for time and individual differences by adding individual and time dummies.
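A small sketch of a two-way fixed-effects DiD regression in Python with statsmodels (hypothetical panel; the column names `unit`, `year`, and `treated_post` are invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical panel: 50 units over 6 years; units 0-24 become treated from year 3 on
df = pd.DataFrame([(u, t) for u in range(50) for t in range(6)], columns=["unit", "year"])
df["treated_post"] = ((df["unit"] < 25) & (df["year"] >= 3)).astype(int)
df["y"] = 2.0 * df["treated_post"] + rng.normal(size=len(df))  # true effect = 2

# C(unit) and C(year) add the individual and time dummies (the fixed effects)
fit = smf.ols("y ~ treated_post + C(unit) + C(year)", data=df).fit()
print(fit.params["treated_post"])  # should be close to 2
```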

16
Q

How can we reduce the standard errors of the ATE?

A
  • Precisely measure outcomes.
  • Conduct pre-post (before and after treatment) experiments rather than post-only, as changes over time (delta scores) usually exhibit less variance.
  • Increase the number of subjects, especially in groups where high variability is expected.
17
Q

Can we assess the uncertainty around the ATE from a single experiment with one randomization through Randomization Inference?

A

Yes

18
Q

What is Randomization Inference and how does it work?

A

Randomization Inference is the calculation of p-values based on an inventory of datasets coming from a simulation of all possible randomizations.

Randomization Inference allows for testing the sharp null hypothesis, which states that there is no treatment effect for all observations (i.e., the outcome would be the same with or without the treatment).

Simulation of Randomizations: It involves simulating all possible randomizations of the treatment across observations to create a sampling distribution of the Average Treatment Effect (ATE) under the sharp null hypothesis.

Probability Calculation: From the simulated distribution, the probability of obtaining an estimated ATE as large as the observed one can be calculated, assuming that the true treatment effect is zero (ITE_i = 0).

Large Number of Randomizations: When there are many possible ways to randomize, the sampling distribution can be approximated.

Calculation of P-Values: The p-values, which help determine the significance of the results, can be calculated based on this simulated distribution. This process is termed Randomization Inference.
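A small sketch of randomization inference in Python (hypothetical outcomes; with many units the full inventory of randomizations is approximated by drawing, say, 10,000 random re-assignments):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical experiment: outcomes y and actual assignment d (1 = treated)
y = np.array([3.1, 2.4, 4.0, 1.8, 3.5, 2.2, 4.4, 2.9])
d = np.array([1, 0, 1, 0, 1, 0, 1, 0])

observed_ate = y[d == 1].mean() - y[d == 0].mean()

# Under the sharp null (no effect for any unit), y is fixed and only the labels change:
# re-randomize the assignment many times and recompute the ATE each time
simulated = []
for _ in range(10_000):
    d_sim = rng.permutation(d)
    simulated.append(y[d_sim == 1].mean() - y[d_sim == 0].mean())

# Two-sided p-value: share of simulated ATEs at least as extreme as the observed one
p = np.mean(np.abs(simulated) >= abs(observed_ate))
print(observed_ate, p)
```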

19
Q

What is a mediation analysis (related to a mechanism, the step after causal inference)?

A

Mediation analysis aims to identify the pathways—the mediators—through which the treatment affects the outcome. Thus, understanding the mediators is crucial for unpacking the causal chain from treatment to outcome, and for possibly finding more efficient or targeted interventions based on those mediators.

It focuses on two key questions:
* Did the treatment cause a change in the mediator?
* Did this change in the mediator lead to a change in the outcome?

Example, from the sailor study:
* Treatment: Lime-based diet
* Mediator: Increase in Vitamin C intake
* Outcome: Decrease in scurvy-induced deaths

20
Q

How can mediators be analyzed with a regression?

A

Mediation Components: It breaks down the total effect of a treatment into two parts: the direct effect and the indirect (mediated) effect. The total treatment effect is the sum of these two components.

  • Direct Effect: The effect of the treatment variable on the outcome variable that is not transmitted through the mediator.
  • Indirect Effect: The effect that is transmitted from the treatment to the outcome through the mediator.

Regression Equations: It describes a three-equation system used to quantify these effects:
* The first equation models the mediator as a function of the treatment.
* The second equation models the outcome as a function of the treatment.
* The third equation models the outcome as a function of both the treatment and the mediator.
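A small sketch of this three-equation system in Python with statsmodels (hypothetical data; `d` is the treatment, `m` the mediator, `y` the outcome):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical data: the treatment shifts the mediator, which shifts the outcome
n = 1_000
d = rng.integers(0, 2, size=n)
m = 1.5 * d + rng.normal(size=n)            # mediator as a function of treatment
y = 0.5 * d + 2.0 * m + rng.normal(size=n)  # outcome: direct effect + mediated effect
df = pd.DataFrame({"d": d, "m": m, "y": y})

eq1 = smf.ols("m ~ d", data=df).fit()      # 1) mediator on treatment
eq2 = smf.ols("y ~ d", data=df).fit()      # 2) outcome on treatment (total effect)
eq3 = smf.ols("y ~ d + m", data=df).fit()  # 3) outcome on treatment + mediator

total = eq2.params["d"]
direct = eq3.params["d"]
indirect = eq1.params["d"] * eq3.params["m"]  # effect transmitted through the mediator
print(total, direct + indirect)               # total effect ≈ direct + indirect
```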

21
Q

What are the main challenges and considerations in establishing causal mediation?

A

1) Problem with Regression-Based Analysis: a significant issue with such analysis is that the mediator variable M_i is often not manipulated independently through a randomized intervention, which raises concerns about the validity of the estimated mediation effect.
* Solution: manipulate the treatment variable D_i and the mediator M_i through separate random assignments. For example, in a nutritional study, D_i could be the presence or absence of limes in the diet, while M_i could be the presence or absence of vitamin C, regardless of its source.

2) Challenges in Implementation: Despite this theoretical approach, the slide acknowledges the difficulty of implementing such designs in real economic settings due to the complexity and multi-dimensional nature of economic behaviors and outcomes.

3) Warnings:
* Specificity of Manipulation: Ensuring that the treatment only manipulates the mediator of interest and not other potential mediators is challenging.
* Random Allocation: Often, mediators are not randomly allocated, which can bias the estimation of mediation effects.

22
Q

What is a possible solution to the challenges of causal mediation analysis?

A

The proposed solution is “Implicit Mediation Analysis,” where researchers adjust elements of the treatment to indirectly measure the effects of mediators. Instead of directly testing how changes in the mediator influence the outcome, researchers look at how different aspects of the treatment influence one or more mediators. In other words, the focus is not on studying how the D_i-induced change in M_i influences Y_i, but rather on the relative effectiveness of different classes of treatments whose attributes affect one or more mediators along the way.

Example:

Research Question: how do conditional cash transfers (D_i) impact schooling enrollment (Y_i) of low-income children?
* Conditional cash transfers = government payments to low-income families who agree to keep their children enrolled in school
* Experimental Evidence: conditional cash transfers (D_i) lead to improved educational outcomes (Y_i) for children in developing economies

Potential Mediators (M_i)?
* Increased cash (M_1i): by providing cash to families, they can invest it in their children’s education
* Increased conditions (M_2i): by conditioning the cash payment on the schooling requirements, families exert greater effort in the children’s schooling

These two mediators have been investigated by assigning families to one of 3 experimental groups:
1. Control group: no cash or instructions from the government
2. Treated group 1: cash without conditions
3. Treated group 2: cash with conditions

23
Q

Why is it difficult to establish causal mediation in social sciences?

A

Because it is difficult (if not impossible) to independently allocate both the treatment and the mediator to units.

24
Q

What is the definition of compliance and non-compliance?

A

The term “compliance” is used to describe whether the actual treatment coincides with the assigned treatment, i.e:
* Actual Treatment = (Randomly) Assigned Treatment

Non-compliance: experimental units do not follow the randomized treatment assignment:
* Some units assigned to the treatment group refuse to take the treatment
* Some units in the control group manage, in some ways, to receive the treatment

25
Q

What are the 2 types of non-compliance, and why can non-compliance happen?

A

Two situations:
1. One-sided non-compliance (or Failure-To-Treat): some units that should be treated fail to receive the treatment (but control units do not receive the treatment)
2. Two-sided non-compliance: we have both cases, with some treated units not receiving the treatment and some control units instead receiving it

Failure-To-Treat can happen for a number of reasons:
1. Logistical issues (e.g., miscommunication, transportation problems…)
2. Treated subjects difficult to reach (e.g., subjects not showing up)
3. Treated subjects refuse the treatment (e.g., subjects refuse to attend a training or to take a pill)

26
Q

How can we analyze data coming from experiments with issues of non-compliance?

A
  1. Ignore the actual compliance status and analyze the units given the assigned treatment -> Intention-to-Treat (ITT)
  2. Analyze the results given the actual treatment -> wrong!
  3. Analyze the subpopulation of “compliers” -> IV strategy
27
Q

What is the difference between straightforward and less clear non-compliance? Give 2 examples.

A

Non-compliance can be:
* Straightforward in some cases: in a study designed to test the efficacy of a vaccine, subjects are either injected with the vaccine or not
* Less clear in other cases: what if the treatment is a year-long training program, but some subjects only attended a few months?

28
Q

Explain the ITT (Intention To Treat) and when it is equal to the ATE

A

ITT shows the change in the outcomes given the original (intended) treatment assignment.

ITT = ATE when we have full compliance (100%)

  • If the focus is on whether the program “made a difference”, non-compliance is irrelevant: the focus is on whether the program generated a change in the outcomes, regardless of how many units were treated
  • If we want to know the magnitude of this effect, then we may end up under- or over-estimating it if we disregard compliance status

Most of the time, we want to estimate the Average Treatment Effect, NOT the Average Effect of Assignment to Treatment (we want to know the average effect of D_i on Y_i, not the average effect of Z_i on Y_i)

29
Q

What is the LATE (or CACE) and what are the 2 methods available to compute it?

A

Compliers Average Causal Effect (CACE) = Local Average Treatment Effect (LATE) on compliers

E[Y_i(1) - Y_i(0) | D_i(1) - D_i(0) = 1]

  • In principle, we can retrieve the CACE / LATE by dividing the ITT (the effect of assignment on the outcome) by the proportion of compliers in the sample:

    CACE = ITT / (num. compliers / num. total subjects)

  • More commonly, it is estimated with an Instrumental Variable (given the IV assumptions + randomization of Z), via a 2SLS (two-stage least squares) estimation. Idea:

    Treated_i = α_0 + α_1 Assigned_i + e_i
    Y_i = β_0 + β_1 Treated_i + u_i
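A small sketch of both approaches in Python with statsmodels (hypothetical data; `z` is the random assignment, `d` the actual treatment; a dedicated IV routine, e.g. in the linearmodels package, would also produce correct standard errors):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Hypothetical experiment with one-sided non-compliance:
# z is randomly assigned, but only ~60% of those assigned actually take the treatment
n = 5_000
z = rng.integers(0, 2, size=n)
complier = rng.random(n) < 0.6
d = (z * complier).astype(int)

y = 2.0 * d + rng.normal(size=n)  # true effect of the treatment = 2
df = pd.DataFrame({"z": z, "d": d, "y": y})

# Method 1: ITT divided by the proportion of compliers (first-stage coefficient)
itt = smf.ols("y ~ z", data=df).fit().params["z"]
first_stage = smf.ols("d ~ z", data=df).fit().params["z"]
print(itt / first_stage)  # CACE / LATE, close to 2

# Method 2: manual 2SLS (first stage, then outcome on the fitted treatment)
df["d_hat"] = smf.ols("d ~ z", data=df).fit().fittedvalues
print(smf.ols("y ~ d_hat", data=df).fit().params["d_hat"])  # same point estimate
```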
30
Q

With non-compliance, what if we end up with a lot of non-compliers and they are very different from compliers?

Can I move non-compliers to the control group?

there are 2 methods

A
  1. If non-compliers are many, the estimates might be quite imprecise (denominator of CACE becomes very small)
  2. No! For 2 reasons:
    This can lead to a severe over-estimation of the treatment effect;
    Moreover, it invalidates the randomization (because non-compliers may systematically be different from compliers);
31
Q

What if some subjects are only “partially treated”? What are the 3 possible solutions?

A

We would have: 1) Compliers & Partial Compliers in the treated group and 2) Never-Takers in the control group -> we cannot estimate 3 effects if we only have 2 randomly assigned groups

Solution: To estimate the average causal effect of partial treatment, we need an experimental design that varies whether subjects are encouraged to receive partial or full treatment -> have multiple treatment groups

In the absence of this:
* Consider partially treated subjects as fully treated. Usually, we would expect the effect of partial treatment to be smaller than the effect of full treatment
* Scale the treatment according to how much is delivered and assign “partial credit” for partial treatments: a person who attends 3/4 of the sessions might be classified as D_i = 0.75

32
Q

What is the assumption that helps us manage two-sided non-compliance?

A

The monotonicity assumption rules out the presence of Defiers (units assigned to the control group but treated, OR assigned to the treated group but untreated).

33
Q

What is the difference between the ATE and HTEs (heterogeneous treatment effects), and why are HTEs sometimes necessary instead of the ATE?

A
  • The ATE is a single average effect for the whole sample, while HTEs allow the effect of the treatment to differ across subgroups of subjects
  • HTE analysis is sometimes necessary because it is not plausible to believe that every observation (subject) responds to the intervention in the exact same way
  • We can analyze this heterogeneity by adding an interaction to our regression model: Y_i = α + ρ D_i + λ F_i + δ (F_i × D_i) + ε_i
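A small sketch of this interaction model in Python with statsmodels (hypothetical data; `f` is an invented binary subject attribute):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)

# Hypothetical data: the treatment effect is 1 when f=0 and 3 when f=1
n = 4_000
d = rng.integers(0, 2, size=n)
f = rng.integers(0, 2, size=n)
y = 1.0 * d + 0.5 * f + 2.0 * d * f + rng.normal(size=n)
df = pd.DataFrame({"d": d, "f": f, "y": y})

fit = smf.ols("y ~ d + f + d:f", data=df).fit()
print(fit.params["d"])    # rho: effect for the f=0 subgroup, close to 1
print(fit.params["d:f"])  # delta: extra effect for the f=1 subgroup, close to 2
```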
34
Q

Can we explore whether there is heterogeneity ex-post?

A

We could obviously run a dozen regressions with interactions between the treatment and our covariates of interest; however, this could become quite unstable (due to model misspecification) and time-consuming.

High probability of finding something significant simply by chance!

35
Q

What are the two main ways to study HTEs? Explain how they work.

A

Treatment-by-covariate interactions: often ex-post, EXPLORATORY and NOT causal (or “subgroup analysis”): subjects are partitioned into sub-groups and variation in ATEs from subgroup to subgroup is analyzed

Covariates partition subjects into groups based on subjects’ individual attributes (e.g., age) or on attributes of the context in which an experiment occurs (e.g., season):
BodyTemperature_i = β_0 + β_1 Age_i + β_2 Treated_i + β_3 (Treated_i × Age_i) + u_i
We can retrieve two CATEs: one for young individuals and one for older ones

NOTE: subgroup analysis should be thought of as an exploratory or descriptive exercise (not as a causal one…)

Treatment-by-treatment interactions: ex-ante design (2x2 in the case we saw), can be causal (dedicated design): introduce additional interventions in order to assess the variation in ATEs across other randomly assigned treatment conditions.

Causality can be drawn because subjects are randomly assigned to every combination of experimental conditions

Example: Ethnicity Condition and Good/Bad grammar

Factorial Design: This is a setup where more than one treatment is tested in a single experiment. In the provided example, a 2x2 factorial design is used. This means there are two independent variables, each with two levels. The independent variables are the ethnicity of the e-mail author (Caucasian or Hispanic) and the quality of the e-mail’s grammar (good or bad).

Subjects: The subjects in the study are state legislators. They are the ones who are being observed for their reaction to the treatments.

Treatment Variations: There are two treatment variations or independent variables:
* The ethnicity of the e-mail author, operationalized through the names Colin Smith (to represent a Caucasian) and José Ramirez (to represent a Hispanic).
* The grammar quality of the e-mail (good vs. bad).

Random Assignment: Subjects are randomly assigned to each combination of experimental conditions. This is crucial for causal inference, as it helps ensure that the observed effects are due to the treatments rather than some other confounding factors.
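A small sketch of analyzing such a 2x2 factorial design in Python (entirely hypothetical data; the dummies `hispanic` and `bad_grammar` encode the two randomized factors):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)

# Hypothetical 2x2 factorial: both factors randomized independently
n = 2_000
hispanic = rng.integers(0, 2, size=n)
bad_grammar = rng.integers(0, 2, size=n)
# Invented response model: reply = 1 if the legislator answers the e-mail
p = 0.5 - 0.10 * hispanic - 0.05 * bad_grammar - 0.05 * hispanic * bad_grammar
reply = (rng.random(n) < p).astype(int)
df = pd.DataFrame({"hispanic": hispanic, "bad_grammar": bad_grammar, "reply": reply})

# The interaction tests whether the ethnicity effect varies with grammar quality
fit = smf.ols("reply ~ hispanic + bad_grammar + hispanic:bad_grammar", data=df).fit()
print(fit.params)
```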

36
Q

What are Power Analysis and Sensitivity analysis?

Describe the difference between an ex-ante and an ex-post Power Analysis.

A

Power analysis: helps determine the appropriate number of experimental units in a sample to detect a treatment effect, if present. The power is the probability of detecting a treatment effect if it exists.

  • A-priori Power Analysis: This type of power analysis is done before the experiment begins (a-priori) to estimate the necessary sample size.
  • Post-hoc Power analysis: These are conducted after the experiment to calculate the achieved power, given the observed effect size, the sample size used, and the significance level. This helps to understand the likelihood that the experiment correctly rejected the null hypothesis.

Sensitivity analyses: These determine the smallest effect size that could have been reliably detected with the given sample size and power. This type of analysis helps to assess the robustness of the experimental findings to different assumptions about the effect size and other parameters.
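A small sketch of both analyses in Python with statsmodels (assumed inputs: effect size d = 0.5, alpha = 0.05, power = 0.8):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# A-priori: sample size per group needed to detect d = 0.5 with 80% power
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))  # roughly 64 subjects per group

# Sensitivity: smallest effect reliably detectable with 50 subjects per group
min_effect = analysis.solve_power(nobs1=50, alpha=0.05, power=0.8)
print(round(min_effect, 2))
```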

37
Q

Which are Type I and Type II errors?

A
  • Type I error (false positive): Rejecting a true null hypothesis. Its probability is the significance level (α), often set at 0.05.
  • Type II error (false negative): Failing to reject a false null hypothesis.
38
Q

What is the Effect Size and how can it be estimated?

A

Effect Size: This is the observed difference between the treatment and control groups, quantified by measures like **Cohen’s d**.
The effect size reflects the signal (the actual difference due to the treatment) relative to the noise (the variability within the experimental material).

Cohen’s d: This is the ratio between signal and noise, which provides a standardized measure of effect size. It can be calculated based on treatment theory or previous evidence.
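A small sketch of the computation (pooled-standard-deviation version of Cohen’s d; the numbers are invented):

```python
import numpy as np

def cohens_d(treated, control):
    """Signal (difference in means) divided by noise (pooled standard deviation)."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * np.var(treated, ddof=1) +
                  (n2 - 1) * np.var(control, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(treated) - np.mean(control)) / np.sqrt(pooled_var)

print(cohens_d([6.1, 5.8, 7.0, 6.4], [5.2, 5.0, 5.9, 5.5]))
```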

39
Q

Which are the variables necessary to determine the correct Sample Size?

A

A-priori Power Analysis: This type of power analysis is done before the experiment begins (a-priori) to estimate the necessary sample size.

Elements Considered:
* The significance level (α), which is the probability of making a Type I error (false positive).
* The desired statistical power (1 - β), which is the probability of correctly rejecting the null hypothesis when it is false, thereby detecting an effect if there is one.
* The estimated effect size, which is the expected difference between the groups that the experiment aims to detect.
* The number of groups involved in the experiment.

Outcome: The information from these elements can be used to calculate the “optimal” sample size for the experiment, ensuring that it is large enough to detect the expected effect but not larger than necessary. This optimization helps in resource efficiency and practical feasibility while maintaining the statistical integrity of the experiment.

40
Q

Describe the key points of Natural Experiments

A

Unlike randomized controlled experiments, natural experiments do not have a treatment assigned in a controlled manner. It is assumed that the assignment in a natural experiment is as-if it was randomized, even though the researcher does not control the assignment process.

Requirements for Natural Experiments:
* An exogenous shock, which is an event that is external to the subjects of the experiment and not influenced by their characteristics. The researcher must convince the reader or audience that the shock was truly exogenous (i.e., it was not influenced by the subjects of the experiment and was not anticipated).
* A proper control group, which is not affected by the exogenous shock, to compare against the treatment group.

41
Q

Describe Quasi-Experiments

A

Do not involve an explicit random assignment procedure -> the causal inferences they produce are subject to greater uncertainty

42
Q

What is the key Assumption in DID analysis?

A

Parallel trends assumption: prior to the treatment, the trend of the outcome variable (not the level, the trend) has to be the same for the control and treated units

43
Q

What is Matching and which are the 2 types of matching?

A

Matching is a technique for creating control groups when clear counterfactuals are absent. It is a method used to generate a control group by selecting untreated units that are similar to the treated units. The goal is to ensure that any observed differences in outcomes are attributable to the treatment alone.

Types of Matching:
* Covariate Matching: Matching based on observable characteristics.
* Propensity Score Matching (PSM): Matching based on the propensity score, which is the probability of being treated given a set of covariates.

44
Q

How does Propensity Score Matching (PSM) work?

A

(For the example, see the ‘MATCHING’ sheet in ChatGPT.)

Propensity Score Matching (PSM) is a statistical technique used to estimate the effect of a treatment or intervention by accounting for the covariates that predict receiving the treatment.

The Basics of PSM:
In observational studies, you have a group that has received some treatment (the treatment group) and a group that hasn’t (the control group). Unlike randomized controlled trials, these groups may have systematic differences making it hard to attribute differences in outcomes to the treatment.

The propensity score is the probability of a unit (e.g., a person, a school, a city) receiving the treatment given their observed characteristics. It is typically estimated using a logistic regression where the treatment assignment is the dependent variable and the independent variables are the observed characteristics.

Once each unit has a propensity score, units in the treatment group are matched with units in the control group with similar propensity scores. The goal is to make the distribution of observed characteristics similar between the two groups, mimicking a randomized experiment. Several algorithms can be used for matching: nearest neighbor (match to the closest propensity score), caliper matching (match within a specified propensity score range), and stratification (divide scores into blocks and match within blocks).

Assess Match Quality: Check for balance in covariates post-matching. The matched samples should have similar distributions of covariates. Discard units in the treatment group that do not have close matches in the control group.

With the matched sample, compare outcomes between the treatment and control groups to estimate the treatment effect.
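A small sketch of PSM in Python with scikit-learn (hypothetical data; nearest-neighbor matching on the propensity score with an optional caliper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(6)

# Hypothetical observational data: two covariates drive selection into treatment
n = 2_000
X = rng.normal(size=(n, 2))
p_treat = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))
d = (rng.random(n) < p_treat).astype(int)
y = 1.5 * d + X[:, 0] + rng.normal(size=n)  # true treatment effect = 1.5

# 1) Estimate propensity scores with a logistic regression
ps = LogisticRegression().fit(X, d).predict_proba(X)[:, 1]

# 2) Match each treated unit to the nearest control unit by propensity score
treated, control = np.where(d == 1)[0], np.where(d == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[control].reshape(-1, 1))
dist, idx = nn.kneighbors(ps[treated].reshape(-1, 1))

# 3) Optional caliper: discard treated units whose best match is too far away
keep = dist.ravel() < 0.01
matched_controls = control[idx.ravel()[keep]]

# 4) Compare outcomes on the matched sample (an ATT estimate)
print(y[treated[keep]].mean() - y[matched_controls].mean())  # roughly 1.5
```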

45
Q

What are the 7 steps involved in Matching?

A
  • Decide Matching Technique: Choose between covariate matching or propensity score matching.
  • Estimate Propensity Score: If using propensity score matching, employ a probit or logit regression model to estimate the likelihood of each unit receiving the treatment.
  • Choose Matching Algorithm: Select an algorithm for matching units from the treatment and control groups. Common algorithms include:
    Nearest neighbor: Matches a treated unit with the closest control unit based on propensity score.
    Caliper & Radius: Matches treated and control units within a predefined range or “caliper” of propensity scores.
  • Check Common Support: Ensure that there is an overlap in the propensity scores between treated and control units. Units without overlapping scores are typically discarded because they do not have comparable matches.
  • Analysis: After matching, conduct further analysis to evaluate the quality of the match and the effect of the treatment:
    Perform t-tests to check for differences in covariates between the treated and control groups to ensure that the matching process has balanced the covariates across groups.
    Run regression analyses on the matched data to estimate the treatment effect, and conduct sensitivity analyses to assess the robustness of the results.
46
Q

Nearest-neighbor VS Caliper / Radius MATCHING

A

Nearest-neighbor Matching: This method pairs units with the closest propensity scores. It simply finds the closest match for each treated unit within the control group based on the propensity score.

Caliper/Radius:
Nearest-neighbor (NN) matching might result in poor matches if the closest control unit is still far away in terms of propensity score, which could lead to biased estimates.
To prevent this, a caliper or radius can be set, which is a tolerance level for the maximum acceptable distance between the propensity scores of matched units.
With caliper/radius matching, control units are only considered a match if they are within this predefined range of the treated unit’s propensity score and are the closest in terms of propensity scores.

47
Q

What is an alternative to the Matching technique?

A

The use of synthetic controls as a method for creating a control group.

Example (see also the ‘MATCHING’ sheet in ChatGPT for another example):

Objective: To estimate the economic effect of terrorism in the Basque Country, a region for which no clear counterfactual exists (it’s not possible to observe the region both with and without the influence of terrorism at the same time).

Synthetic Control Creation: Instead of using a single other Spanish region as a control group, the method improves upon this by using all other regions to create a synthetic control region. This synthetic control is designed to be as similar as possible to the Basque Country in terms of observable characteristics before the onset of terrorism.

Selection Criteria and Weighting:

The regions are chosen based on their real GDP per capita to ensure economic similarity.
Weights are assigned to each region to create a synthetic Basque Country that closely matches the actual Basque Country before terrorism. In this example, the synthetic Basque Country is a weighted average of the regions of Madrid (with a weight of 0.85) and Catalunya (with a weight of 0.15), while other regions are not weighted (weights = 0).
The approach allows for a more nuanced and potentially accurate estimation of the economic impact of terrorism by synthesizing a control region that closely resembles the treated region in key economic and demographic characteristics.
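A small sketch of the weight-finding step in Python with scipy (hypothetical pre-treatment series; weights constrained to be non-negative and sum to 1):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)

# Hypothetical pre-treatment GDP paths: 5 donor regions over 20 periods,
# and a treated region that in truth resembles a 0.8/0.2 mix of donors 0 and 3
T = 20
donors = rng.normal(10, 1, size=(5, T))
treated = 0.8 * donors[0] + 0.2 * donors[3] + rng.normal(0, 0.05, size=T)

def loss(w):
    # Mean squared pre-treatment gap between treated and synthetic region
    return np.mean((treated - w @ donors) ** 2)

res = minimize(
    loss,
    x0=np.full(5, 0.2),                                        # start from equal weights
    bounds=[(0, 1)] * 5,                                       # no negative weights
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1},  # weights sum to 1
    method="SLSQP",
)
print(res.x.round(2))  # should load mostly on donors 0 and 3
```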

48
Q

When is the LATE equal to the ATE?

A
  • When there is no heterogeneity in treatment effects;
  • When there is no heterogeneity in the first-stage response to the instrument;
  • When there is no correlation between the response to the instrument and the response to the treatment.

EXPLANATION:

The LATE being equal to the ATE under certain conditions can be understood through the lens of these conditions:

No Heterogeneity in Treatment Effects: If every individual experiences the same effect from the treatment, then the average effect for compliers (those who comply with the instrument, i.e., those for whom the instrument changes their treatment status) is the same as the average effect for the entire population. In other words, treatment effect is uniform across individuals.

No Heterogeneity in the First-Stage Response to the Instrument: If every individual has the same probability of complying with the instrument (e.g., everyone has the same likelihood of taking the treatment if they are encouraged to do so by the instrument), then the compliers are representative of the entire population.

No Correlation Between the Response to the Instrument and the Response to the Treatment: This condition is critical because if the individuals who are more likely to comply with the instrument (those who are induced to take the treatment due to the instrument) are also those who would benefit more (or less) from the treatment, then the LATE would only capture the effect on this specific subpopulation and not the average effect across all individuals. When there’s no correlation, it means that being a complier does not systematically relate to the potential outcomes from the treatment. That is, those who are induced to take the treatment by the instrument are not inherently different in terms of how much they benefit from the treatment compared to those who would take the treatment anyway or those who would never take it, regardless of the instrument.

In essence, when there is no correlation between the response to the instrument and the response to the treatment, the group of compliers is a random sample of the entire population in terms of treatment effects. Therefore, the average effect observed for the compliers (LATE) is the same as the effect that would be observed if everyone in the population received the treatment (ATE). This condition assumes that the instrument is as good as random assignment to the treatment for predicting who receives the treatment and does not inherently select a particular type of individual who might have a different treatment effect.

49
Q

Why are Matching and Synthetic Controls useful with experiments?

A

These methods are particularly useful in natural and quasi-natural experiments where randomized control trials are not feasible. They help to approximate the conditions of a randomized experiment by ensuring that the treatment and control groups are as similar as possible, thereby allowing for more accurate estimation of the treatment effects.

50
Q

Describe System 1 and System 2 involved in our decision making process

A
  • System 1 is our intuitive system, which is fast, automatic, implicit, and emotional.
  • System 2 is our rational system, which is slower, conscious, effortful, and logical.

System 1 can be sufficient for simple decisions (like choosing cookies), while System 2 might be better for more crucial decisions.

51
Q

What does “heuristic” mean in decision making? What are its advantages and drawbacks?

A

Heuristics are quick and fast methods that are not guaranteed to be optimal or rational but are sufficient for immediate solutions.

  • On the positive side: Heuristics reduce mental effort, simplify complex questions, and offer fast and sometimes accurate ways to reach a conclusion.
  • On the negative side: Heuristics can lead to inaccurate judgments, especially if one is not aware of their influence.
52
Q

What is the name of one of the first heuristic models?

A

The concept of “satisficing,” a term coined by Simon in 1957, is one of the first models of heuristics.

Bounded rationality is presented, where rationality is limited, and agents make decisions within the constraints of cognition, information, and time. Decision-makers, therefore, act under the satisficing principle.

Satisficing involves agents not seeking the optimal solution but rather a satisfactory one that meets a certain threshold or level. Decision-makers aim to make rational decisions but acknowledge the impracticality of achieving a globally optimal solution, focusing instead on a reasonable or acceptable one. The process of satisficing entails searching for a solution that is sufficiently good, rather than the best possible one in all circumstances.

53
Q

Which are the 4 main heuristics?

A
  1. Availability heuristic
  2. Representativeness heuristic
  3. Confirmation heuristic
  4. Affect heuristic
54
Q

List the most common biases

A

Ease of Recall: people assess the frequency or probability of an event by the ease with which instances of that event come to mind. Events that are easier to retrieve from memory seem more frequent, so this ease of retrieval can lead to an overestimation of their frequency or risk.

Representativeness Heuristic -> Insensitivity to Sample Size: another problem is insensitivity to sample size. People often fail to appreciate the role of sample size when assessing the reliability of sample information, which can lead to erroneous judgments (e.g., the small vs. large hospitals problem).

Bonus bias (the “big data” bias): despite a very large number of votes, a sample may not be representative if the way it was collected is biased; sheer sample size does not guarantee representativeness (e.g., Elon Musk’s election polls).

Misconception of chance: It poses a scenario where you flip a fair coin four times, and the first three flips result in heads. It asks what the best estimate of the probability of getting another head is. The correct answer is 50% because coin flips are independent events

Insensitivity to base rates: when assessing the likelihood of events, individuals tend to ignore base rates if any other descriptive information is provided, even if it is irrelevant. A real-world example of this bias: entrepreneurs may spend too much time imagining success for their startups, despite high base rates of failure, because they focus on overly positive information.

Regression to the Mean: The slide suggests that while extreme performances are memorable, they do not typically predict future results, which are more likely to be average.

**The Conjunction Fallacy**: People tend to rank the probability of two co-occurring events (e.g., Lucy being a bank teller and active in the feminist movement) as more likely than a single broader event (e.g., Lucy being a bank teller), which is a statistical error.

Confirmation Heuristic -> Confirmation Bias/Trap: This bias refers to the tendency of people to favor information that confirms their existing beliefs and to disregard information that contradicts their beliefs

Anchoring: Anchoring is demonstrated as a common cognitive bias where people rely too heavily on the first piece of information they receive (the “anchor”) when making decisions. Subsequent judgments are then influenced by the anchor, even if it is unrelated to the decision at hand. The bias is prevalent in various situations, including salary negotiations and initial price offerings, where the first proposed number sets a psychological benchmark. Individuals tend to make insufficient adjustments from the anchor when converging to a final value, often leading to skewed decision-making.

Conjunctive and Disjunctive Events: The slide explains that people tend to overestimate the probability of conjunctive events (where multiple events need to occur) and underestimate the probability of disjunctive events (where only one of several events needs to occur).

Hindsight: This bias occurs when people use their knowledge of an outcome to influence their belief about the likelihood of events leading up to that outcome. It can negatively affect our ability to learn from past experiences and objectively evaluate decisions. Known as hindsight bias, it refers to the tendency to overestimate one’s ability to have predicted an event after it has already occurred.

Curse of Knowledge: The curse of knowledge bias refers to the difficulty people have in imagining not knowing something that they do know.

The Affect Heuristic: Decision-making is often influenced by affective and emotional evaluations, which can be unconscious. People may rely on affect, or their immediate emotional response to a stimulus, as the basis for their decisions, especially when they do not have the resources or time to reflect.

The Certainty Effect: The Certainty effect is highlighted through gambling scenarios. People tend to prefer a certain outcome over an uncertain one, even if the uncertain outcome has a higher expected value.

55
Q

Which are the 3 key elements when analyzing Causality?

A
  • Causal Variable: Also called intervention or treatment. Can be binary, categorical or even continuous (quantitative treatment). The variable that should have an effect on an outcome.
  • Outcome Variable: The variable that should be affected by the treatment.
  • Subjects: Individuals (or households, teams, companies,…) that might be affected (or not) by the treatment
56
Q

Which are the 3 types of Overconfidence Bias?

A

Overconfidence has been implicated in numerous societal and economic issues, including wars, financial crises, and corporate failures. It is characterized by the belief that one’s views are right, skills are superior, and information is superior to others.

  1. Overprecision - excessive certainty regarding the accuracy of one’s beliefs. A key consequence of overprecision is drawing very narrow confidence intervals and being too certain about knowing the truth
  2. Overestimation - an inflated belief in one’s abilities or the accuracy of one’s thoughts.
  3. Overplacement - the exaggerated belief that one is better than others.
57
Q

How should we approach a situation where only incomplete information is available, and how can we apply economic rationality to make decisions?

A

In the absence of any evidence to favor one color over another, it is rational to assume equal likelihoods and base your decisions on that assumption.

58
Q

What is the Ellsberg Paradox?

A

The paradox explains why people might prefer known probabilities (risk) over unknown probabilities (ambiguity). It suggests that people tend to choose a known chance of winning over an unknown probability of winning, even if the known probability is low, because the uncertainty of the unknown is less comfortable. This preference aligns with the “Certainty Effect” described by Kahneman & Tversky.

59
Q

Risk vs Ambiguity vs Uncertainty

A
  • Risk involves situations with known objective probabilities (like flipping a coin).
  • Uncertainty refers to situations where probabilities are not known and cannot be determined.
  • Ambiguity involves subjective probabilities: objective probabilities are not known, but people feel capable of making assumptions about the likelihoods of different outcomes.

With ambiguity, people have no objective probabilities to rely on and no clear idea of how to form them, so they must fall back on assumptions or gut feeling, as exemplified by the tennis ball example.
LFHI (a theoretical situation) is mentioned where there’s no known state on which to assign probabilities, demonstrating a lack of knowledge about how to attribute chances to different outcomes.
The slides indicate that when faced with such situations, people use logic or causal links to concentrate probabilities and design experiments to “update our priors” (i.e., to update initial beliefs based on new information).
Overall, the slides aim to explain the behavioral tendency of individuals to prefer situations with known risks over those with ambiguous or uncertain outcomes, and they touch on how these preferences can lead to irrational decisions in the face of ambiguous information.