1. Introduction Flashcards
Key terms
What are exogenous and endogenous variables?
Exogenous variables: Variables whose values are determined outside the model and are not affected by other variables within the model - considered independent.
Endogenous variables: Variables whose values are determined within the model and are influenced by other variables within the model - considered dependent.
What is an identification strategy?
The approach or method used to establish a causal relationship between variables.
What is causal identification?
**Causal identification** is the theoretical process of determining whether a specific research design or empirical strategy can correctly estimate a causal effect.
What are randomized experimental designs?
Designs where subjects are randomly assigned to treatment and control groups. Random assignment makes the treatment exogenous: it is independent of the subjects' other characteristics, so systematic differences in outcomes between the groups can be attributed to the treatment rather than to other variables.
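Example (a minimal sketch, not a real study): the simulation below compares random assignment with self-selection into treatment. The variable `ability`, the value `TRUE_EFFECT = 2.0`, and the selection rule are all hypothetical and chosen only for illustration.

```python
import random
import statistics

random.seed(0)
TRUE_EFFECT = 2.0  # hypothetical causal effect of the treatment
N = 20_000

def outcome(ability, treated):
    # The outcome depends on unobserved ability, the treatment, and noise.
    return ability + TRUE_EFFECT * treated + random.gauss(0, 1)

def diff_in_means(rows):
    # Naive estimate: mean outcome of treated minus mean outcome of controls.
    treated = [y for d, y in rows if d == 1]
    control = [y for d, y in rows if d == 0]
    return statistics.mean(treated) - statistics.mean(control)

abilities = [random.gauss(0, 1) for _ in range(N)]

# Randomized design: a coin flip assigns treatment, independent of ability.
randomized = []
for a in abilities:
    d = random.randint(0, 1)
    randomized.append((d, outcome(a, d)))

# Self-selection: high-ability units opt in, so treatment is endogenous.
selected = []
for a in abilities:
    d = 1 if a > 0 else 0
    selected.append((d, outcome(a, d)))

est_randomized = diff_in_means(randomized)
est_selected = diff_in_means(selected)
print(f"randomized estimate:    {est_randomized:.2f}")
print(f"self-selected estimate: {est_selected:.2f}")
```

Under these assumptions the randomized estimate lands close to the true effect, while the self-selected estimate is inflated because it also picks up the ability gap between the two groups.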
What are quasi-randomized/quasi-experimental designs?
Quasi-experimental designs resemble randomized experiments but lack random assignment by the researcher: a process outside the researcher's control determines who receives treatment and who does not.
What are quasi-exogenous variables?
Variables that are not strictly exogenous in the pure theoretical sense but are treated as exogenous under certain conditions or approximations in empirical research.
What is temporal precedence?
Temporal precedence is a criterion for establishing causality, where the cause must occur before the effect in time.
If x is the cause of y, then x must precede y in time.
What is time series data?
Time series data contains observations of a variable over time, typically at regular intervals (daily, monthly, yearly) - shows variation across time.
What is cross-section data?
Cross-section data contains observations at a single point in time across multiple units (individuals, firms, countries) - shows variation across units.
What is panel data?
Panel data contains observations of multiple units over several time periods - shows variation across units and time.
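Example (a small sketch with made-up numbers): the three data types can be seen as different slices of one table. The countries, years, and growth figures below are hypothetical.

```python
# Hypothetical panel: growth (%) for three countries over three years.
panel = [
    {"country": "A", "year": 2020, "growth": 1.2},
    {"country": "A", "year": 2021, "growth": 2.1},
    {"country": "A", "year": 2022, "growth": 1.8},
    {"country": "B", "year": 2020, "growth": -0.5},
    {"country": "B", "year": 2021, "growth": 3.0},
    {"country": "B", "year": 2022, "growth": 2.4},
    {"country": "C", "year": 2020, "growth": 0.3},
    {"country": "C", "year": 2021, "growth": 1.1},
    {"country": "C", "year": 2022, "growth": 0.9},
]

# Time series: one unit followed over time (variation across time only).
time_series = [row for row in panel if row["country"] == "A"]

# Cross-section: many units at one point in time (variation across units only).
cross_section = [row for row in panel if row["year"] == 2021]

print("panel rows:        ", len(panel))
print("time series (A):   ", [row["growth"] for row in time_series])
print("cross-section 2021:", [row["growth"] for row in cross_section])
```

Fixing the unit gives a time series, fixing the period gives a cross-section, and keeping both dimensions gives the panel.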
What are hierarchical models?
Hierarchical models (also called multilevel models) are statistical models used to analyze data organized into multiple levels or groups, such as students within schools, patients within hospitals, or employees within companies.
They allow us to study how factors at different levels (e.g., individual and group characteristics) influence outcomes, while accounting for relationships within and between groups.
What are nested data models / mixed models / random effects or parameter models? (!)
Nested data models: Models that have nested structures in data (e.g., students within schools or repeated measures within individuals).
Mixed models: Models that include both fixed effects (effects that are constant across all groups/levels) and random effects (effects that vary across groups or levels).
Random effects models: A subset of mixed models where the focus is primarily on random effects. They assume that individual-specific effects are random and uncorrelated with the explanatory variables. This approach is often used in panel data analysis.
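Example (a sketch of a data-generating process, not an estimation routine): the simulation below produces nested data, students within schools, from a random-intercept model. The fixed effects `BETA0` and `BETA1` are the same for every school, while the school-level deviation `u` is a random effect drawn once per school. All names and parameter values are hypothetical.

```python
import random
import statistics
from collections import defaultdict

random.seed(1)
BETA0, BETA1 = 50.0, 3.0     # fixed effects: identical across all schools
SIGMA_U, SIGMA_E = 5.0, 2.0  # SD of school-level and student-level deviations
N_SCHOOLS, N_STUDENTS = 200, 50

rows = []
for school in range(N_SCHOOLS):
    u = random.gauss(0, SIGMA_U)  # random intercept, drawn once per school
    for _ in range(N_STUDENTS):
        # x is drawn independently of u, matching the random effects
        # assumption that group effects are uncorrelated with the regressors.
        x = random.gauss(0, 1)
        y = BETA0 + BETA1 * x + u + random.gauss(0, SIGMA_E)
        rows.append((school, x, y))

# Group outcomes by school to see how much school averages differ.
by_school = defaultdict(list)
for school, x, y in rows:
    by_school[school].append(y)

school_means = [statistics.mean(ys) for ys in by_school.values()]
sd_of_means = statistics.stdev(school_means)
print(f"SD of school means: {sd_of_means:.2f} (SIGMA_U = {SIGMA_U})")
```

The spread of the school averages is driven mainly by the random intercepts, which is the between-group variation a mixed model captures with a random effect instead of a separate parameter per school.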