L10 - (PL) Introduction to Panel Data Models Flashcards
What types of data do we come across most often in regression analysis?
- Time series: a set of observations on the values that a variable takes at different times (daily, weekly, monthly, quarterly, annually, etc.).
- Cross-sections: data on one or more variables collected at the same point in time.
- Pooled data: a combination of time series and cross-sectional data, but with different cross-sectional units over time (e.g. a different set of firms observed each year).
- Panel or longitudinal data: a special type of pooled data in which the same cross-sectional unit (a family or a firm) is surveyed over time.
  - If the time series dimension differs across cross-sectional units, we have an unbalanced panel.
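As a minimal sketch of the balanced versus unbalanced distinction, the check below tests whether every unit is observed in the same set of periods. The firm names, years and values are made up for illustration, and `periods_by_unit` and `is_balanced` are hypothetical helper names, not part of any library.

```python
from collections import defaultdict

# Hypothetical toy panel: (unit, year, value) records; all numbers are made up.
records = [
    ("firm_A", 2019, 1.2), ("firm_A", 2020, 1.4), ("firm_A", 2021, 1.5),
    ("firm_B", 2019, 0.9), ("firm_B", 2020, 1.0), ("firm_B", 2021, 1.1),
    ("firm_C", 2019, 2.0), ("firm_C", 2021, 2.3),  # firm_C is missing 2020
]

def periods_by_unit(rows):
    """Map each cross-sectional unit to the set of periods in which it is observed."""
    seen = defaultdict(set)
    for unit, period, _value in rows:
        seen[unit].add(period)
    return dict(seen)

def is_balanced(rows):
    """A panel is balanced when every unit is observed in the same set of periods."""
    period_sets = list(periods_by_unit(rows).values())
    return all(p == period_sets[0] for p in period_sets)

print(is_balanced(records))      # False: firm_C lacks 2020
print(is_balanced(records[:6]))  # True: firm_A and firm_B cover the same years
```

Dropping firm_C would be one way to force a balanced panel, but it throws away data, so unbalanced panels are often kept as they are.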
What are the different types of Panel Data we gather?
- Micro panels (British Household Panel, German Socio-Economic Panel)
  - British Household Panel –> covers things such as demographics, education, income and marital status
  - Collected for a large number of individuals (100,000s) over a short period of time (min. 2 years, max. 10-20 years)
  - Long cross-sectional dimension but short time series dimension
  - Firms/households are usually randomly sampled, so they are unlikely to be correlated with one another
- Macro panels (IMF International Financial Statistics, World Bank data)
  - IMF International Financial Statistics –> approx. 30,000 time series covering more than 200 countries, starting from 1948; this includes exchange account data and the main global and country-level economic indicators
  - A moderate number of countries (e.g. 20 OECD countries, not exceeding 100-200 countries) over a longer period of time (annual data over 20-60 years)
  - Can also be very high-frequency data, like daily observations of a stock index
  - Econometric problems for macro panels –>
    - dealing with non-stationarity/unit roots
    - cointegration
    - cross-country dependence –> countries are likely to be correlated with one another
What are the benefits of using Panel data?
- Controlling for individual heterogeneity.
- More informative data: more variability (within states and between them, not just aggregated at the country level), less collinearity among the variables, more degrees of freedom and more efficiency.
- Dynamics of adjustment (change, duration of economic states, speed of adjustment to economic policy changes, intertemporal relations).
  - Panel data let us observe changes over time
- Identify and measure effects that are simply not detectable in pure cross-sectional or pure time series data.
  - Example: a cross-section shows that women have a 50% chance of participating in the workforce
    - Does this mean there is huge turnover, or that 50% of women work all the time and the rest not at all? –> only panel data can discriminate between these cases
- Construct and test more complicated behavioural models than is possible with purely cross-sectional or time series data.
- Biases resulting from aggregation over firms or individuals may be reduced or eliminated (micro panels).
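The workforce participation example can be sketched with two made-up panels of four women over four years. Both show a 50% participation rate in every year, so any single cross-section looks identical, yet one panel has no turnover and the other has constant turnover. The data and the helper names `yearly_rates` and `switch_share` are assumptions for illustration only.

```python
# 1 = in the workforce that year, 0 = not. Rows are women, columns are years.
no_turnover = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
high_turnover = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

def yearly_rates(panel):
    """Participation rate in each year: all that a single cross-section reveals."""
    n = len(panel)
    return [sum(column) / n for column in zip(*panel)]

def switch_share(panel):
    """Share of year-to-year transitions in which a woman changes status.

    This needs the same women followed over time, i.e. panel data.
    """
    switches = sum(a != b for row in panel for a, b in zip(row, row[1:]))
    transitions = sum(len(row) - 1 for row in panel)
    return switches / transitions

print(yearly_rates(no_turnover), yearly_rates(high_turnover))  # identical yearly rates
print(switch_share(no_turnover), switch_share(high_turnover))  # 0.0 versus 1.0
```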
Controlling for individual heterogeneity: Benefits of using Panel data?
Panel data recognise that individuals, firms, states and countries are heterogeneous –> not controlling for this heterogeneity risks biased results
- Baltagi and Levin (1992): cigarette demand across 46 American states for the years 1963-1988
  - Demand is modelled as a function of price and income
  - What else could influence demand (and may be hard to measure or observe)?
    - State-invariant variables (the same across cross-sectional units), e.g. advertising on national TV and radio
    - Time-invariant variables, e.g. religion and education
      - It may be hard to get a figure for how many Mormons there are in each state (but this is unlikely to change much over time)
  - Omitting these variables could lead to biased estimators
  - Panel data can control for these variables whether or not they are observed, while pure time series or cross-sectional studies cannot
  - Utah has low smoking rates, not because of income and price but because it is a largely Mormon state and smoking is prohibited by Mormon doctrine
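One common way to control for an unobserved time-invariant variable, not named on this card and shown here only as an illustration, is the within (demeaning) transformation: subtract each state's own mean from its data, which removes anything that is constant within a state. The sketch below uses three hypothetical states with made-up prices and state effects; it is not the Baltagi and Levin data or their estimation method. The true price slope is assumed to be -1.

```python
TRUE_SLOPE = -1.0  # assumed effect of price on consumption in this toy example

# Hypothetical states: (unobserved time-invariant effect, four yearly prices).
states = {
    "U": (10.0, [1.0, 2.0, 3.0, 4.0]),
    "V": (13.0, [2.0, 3.0, 4.0, 5.0]),
    "W": (16.0, [3.0, 4.0, 5.0, 6.0]),
}

# Build (state, price, consumption) rows: consumption = effect + TRUE_SLOPE * price.
rows = [(s, p, effect + TRUE_SLOPE * p)
        for s, (effect, prices) in states.items() for p in prices]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with an intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

def demean_by_state(data):
    """Subtract each state's own mean from its prices and its consumption."""
    xs, ys = [], []
    for state in sorted({s for s, _, _ in data}):
        prices = [p for s, p, _ in data if s == state]
        cons = [c for s, _, c in data if s == state]
        xs += [p - mean(prices) for p in prices]
        ys += [c - mean(cons) for c in cons]
    return xs, ys

# Pooled regression ignores the state effect; within regression removes it.
pooled = ols_slope([p for _, p, _ in rows], [c for _, _, c in rows])
within = ols_slope(*demean_by_state(rows))
print(f"pooled slope: {pooled:.3f}")
print(f"within slope: {within:.3f}")
```

In this constructed example the state effect is correlated with price, so the pooled slope even has the wrong sign, while the demeaned regression recovers the assumed slope of -1 without ever observing the state effect.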
Limitations of Panel Data?
- Design and data collection problems: coverage, nonresponse (no answer or no proper answer), recall (the respondent does not remember things correctly), etc.
- Distortions from measurement error.
  - Errors in the test, memory errors, deliberate distortion of responses (e.g. individuals don't want to admit they take drugs)
  - Also inappropriate informants, recording errors and interviewer effects
- Selectivity problems:
  - Self-selectivity –> people choose not to work because their reservation wage is higher than the actual wage –> we observe the characteristics of these individuals but not their wage
    - As only the wage is missing, we call the sample censored
    - However, if we observe no data at all on these people, we call the sample truncated
  - Non-response (partial and complete)
    - Complete –> refusal to answer, no one at home, etc.
    - Partial –> one or more questions are left unanswered or fail to receive a useful response
    - Both cause efficiency loss and identification problems for the population parameters
  - Attrition
    - Respondents may die, move away or find the cost of responding too high
    - Biasing attrition –> those leaving the sample are found to have lower earnings and lower education levels –> this introduces bias into the parameter estimates
- Short time series dimension for micro panels.
  - Increasing the time span of a panel increases the chances of attrition
  - It also increases the computational difficulty of limited dependent variable panel models
- Cross-section dependence for macro panels.
  - Long time series on countries can lead to cross-country dependence issues
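To make the censored versus truncated distinction from the self-selectivity point concrete, the sketch below starts from four hypothetical people with made-up education, reservation wages and offered wages. It assumes, as a simplification, that a person works exactly when the offered wage is at least the reservation wage.

```python
# Hypothetical people: (id, years_of_education, reservation_wage, offered_wage).
people = [
    (1, 12, 9.0, 11.0),
    (2, 16, 15.0, 14.0),
    (3, 10, 8.0, 7.5),
    (4, 14, 10.0, 13.0),
]

def works(reservation_wage, offered_wage):
    """Assumed decision rule: work only if the offer meets the reservation wage."""
    return offered_wage >= reservation_wage

# Censored sample: everyone appears with their characteristics,
# but the wage is missing (None) for those who choose not to work.
censored = [
    (pid, edu, offered if works(reservation, offered) else None)
    for pid, edu, reservation, offered in people
]

# Truncated sample: those who do not work are dropped entirely,
# so neither their wage nor their characteristics are observed.
truncated = [row for row in censored if row[2] is not None]

print(censored)
print(truncated)
```

The censored sample still has four rows, two of them without a wage, while the truncated sample keeps only the two people who work.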