Lecture 6: Predictive Data Analysis Flashcards
Before doing regression analysis, we can use a ?-graph (plot) to check whether the two variables have a linear relationship.
Before doing regression analysis, we can use a scatter-graph (scatter plot) to check whether the two variables have a linear relationship.
Ordinary Least Squares (OLS):
Minimise the ?? to find the best fit line.
Under its assumptions, OLS provides the ? (?), ?, ? Estimators (BLUE) of the model parameters (e.g. the betas)
Ordinary Least Squares (OLS):
Minimise the error variance (equivalently, the sum of squared residuals) to find the best fit line.
Under its assumptions, OLS provides the Best (efficient, i.e. minimum-variance), Linear, Unbiased Estimators (BLUE) of the model parameters (e.g. the betas)
Assumptions of OLS:
- ?
- E(ε) = ?
- ? / Exogeneity: E(ε|X) = ?
- No measurement errors in X
- Homoskedasticity: Var(ε) = ?
- No ?: cov(ε_i, ε_j) = ? for i ≠ j
Assumptions of OLS:
- Linearity (the model is linear in the parameters)
- E(ε) = 0
- Independence / Exogeneity: E(ε|X) = 0
- No measurement errors in X
- Homoskedasticity: Var(ε) = σ^2 (constant across observations)
- No autocorrelation: cov(ε_i, ε_j) = 0 for i ≠ j
OLS Method:
Minimise ????
Equation?
OLS Method:
Minimise sum of squared deviations:
Equation: https://images.app.goo.gl/ZeCmejWxkk7Jqb8r6
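The linked image is not reproduced here. As a sketch, assuming a simple regression with one regressor x, an intercept β_0 and a slope β_1 (notation chosen for illustration, not taken from the image), the objective is usually written as:

```latex
\min_{\beta_0,\,\beta_1} \; S(\beta_0, \beta_1)
  = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2
```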
OLS Method:
How to minimise sum of squared deviations?
Taking ?? and setting them to ?.
OLS Method:
How to minimise sum of squared deviations: Taking partial derivatives and setting them to 0.
https://images.app.goo.gl/Va4P582A3fDjFB3A9
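For the one-regressor case, setting the two partial derivatives to 0 gives closed-form estimates: slope = S_xy / S_xx and intercept = ȳ − slope · x̄. A minimal Python sketch of that result follows; the function name `ols_fit` and the toy data are assumptions made for illustration, not part of the lecture.

```python
def ols_fit(x, y):
    """Closed-form simple OLS from the first-order conditions.

    Returns (intercept, slope) for the model y = b0 + b1 * x + error.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-deviations and squared deviations around the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data (hypothetical)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_fit(x, y)
print(b0, b1)  # approximately 2.2 and 0.6
```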
To check model fit, use ??? (i.e. ?-?):
proportion of the variation in ? that can be explained by variation in ?
? <= R-squared <= ?
To check model fit, use Coefficient of Determination (i.e. R-squared):
proportion of the variation in Y that can be explained by variation in X
0 <= R-squared <= 1 (100%)
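R-squared can be computed as 1 − SSR / SST, where SSR is the sum of squared residuals and SST is the total sum of squared deviations of Y around its mean. A small sketch, assuming hypothetical toy data and a helper named `r_squared` (both illustrative); the 0 to 1 range holds for an OLS fit that includes an intercept.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSR / SST."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)               # total
    return 1 - ss_res / ss_tot


# Toy data (hypothetical); 2.2 and 0.6 are the OLS intercept and slope
# for exactly this data set.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.2 + 0.6 * xi for xi in x]
r2 = r_squared(y, y_hat)
print(r2)  # approximately 0.6, i.e. about 60% of the variation in Y
```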
To test parameters’ significance, use ? testing & ? statistics:
Significant if |??| > critical value
To test parameters’ significance, use hypothesis testing & t statistics:
Significant if |t-value| > critical value (two-sided test)
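As a sketch of the test for a slope, assuming the null hypothesis β_1 = 0 and a simple regression with one regressor: the t-value is the estimate divided by its standard error, with the error variance estimated as SSR / (n − 2). The function name `slope_t_stat` and the toy data are hypothetical; the critical value is the standard two-sided 5% t-table entry for 3 degrees of freedom.

```python
import math


def slope_t_stat(x, y):
    """t-value for H0: slope = 0 in the model y = b0 + b1 * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Residual variance estimate with n - 2 degrees of freedom
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)
    se_b1 = math.sqrt(sigma2_hat / s_xx)
    return b1 / se_b1


# Toy data (hypothetical)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

T_CRIT = 3.182  # t table: two-sided 5% level, n - 2 = 3 degrees of freedom
t_value = slope_t_stat(x, y)
significant = abs(t_value) > T_CRIT
print(t_value, significant)
```

With only five observations the slope here is not significant at the 5% level, even though the fitted line slopes upward.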
Panel Data: ?? data (e.g. industries, firms,…) + ?-series data
Panel Data: cross-sectional data (e.g. industries, firms,…) + time-series data
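To make the two dimensions concrete, here is a minimal sketch of a panel stored in "long" format, one row per (firm, year). The firms, years, and sales figures are hypothetical, as are the helper names `cross_section` and `time_series`.

```python
# Hypothetical panel: two firms observed over three years.
panel = [
    {"firm": "A", "year": 2020, "sales": 10.0},
    {"firm": "A", "year": 2021, "sales": 12.0},
    {"firm": "A", "year": 2022, "sales": 15.0},
    {"firm": "B", "year": 2020, "sales": 20.0},
    {"firm": "B", "year": 2021, "sales": 21.0},
    {"firm": "B", "year": 2022, "sales": 19.0},
]


def cross_section(rows, year):
    """All firms at one point in time (the cross-sectional dimension)."""
    return {r["firm"]: r["sales"] for r in rows if r["year"] == year}


def time_series(rows, firm):
    """One firm across all years (the time-series dimension)."""
    return {r["year"]: r["sales"] for r in rows if r["firm"] == firm}


cs_2021 = cross_section(panel, 2021)
ts_a = time_series(panel, "A")
print(cs_2021)
print(ts_a)
```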