L3 - Estimation of Regression Parameters Flashcards
How can the bivariate linear regression model be written?
Y{i} = α + βX{i} + u{i}
i = 1, …, N
- X is the independent or explanatory variable
- Y is the dependent or explained variable
- u is a random error or disturbance
- α and β are parameters which characterise the relationship between Y and X. The parameters are not observable directly.
Why is regression analysis useful?
- Regression analysis is the most important tool which economists use to quantify their models.
- Economic theory provides explanations of linkages between variables of interest e.g. the relationship between consumption expenditures and disposable income.
- However, theory rarely gives precise values for the size of the response of one variable to another. For this we must turn to econometrics and, in particular, to regression analysis.
- The regression model provides a mechanism by which the response of one variable to another can be quantified and evaluated from a statistical perspective.
- It therefore acts as one of the key items in the toolkit of the applied social scientist and the objective of this chapter is to discuss how it can be used sensibly in the investigation of economic relationships.
What are two interpretations of the regression model?
1 - The X values are chosen by the investigator e.g. by a process of experimentation.
- In this case the X variable is not random and can be treated as being ‘fixed in repeated samples’
2 - The X and Y variables are jointly distributed random variables with cov(X,Y) ≠ 0 (covariance)
- This is more realistic for economic data but harder to deal with when deriving the distribution of estimators
Who first solved the problem of estimating the regression coefficients?
- Mayer’s (1750) solution. Form linear combination of equations to reduce number of equations to number of unknown coefficients.
- He would write out, for each observed pair of X and Y values, the corresponding algebraic equation
- each equation involves the unknown estimates of the regression line (α(hat) and β(hat))
- He would then average groups of these equations to reduce the number of equations to the number of unknown coefficients and solve (in this example, simultaneously)
- These estimates are unbiased estimates of the population parameters.
- However, there are an infinite number of linear combinations which are consistent with this procedure (the two-group split in the sketch below is just one choice).
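A minimal sketch of this averaging idea in Python, assuming some hypothetical data and the simplest possible grouping (first half versus second half of the sample):

```python
import numpy as np

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

# One equation per observation: Y_i = alpha_hat + beta_hat * X_i.
# Average each half of the sample to reduce N equations to 2.
half = len(X) // 2
x1, y1 = X[:half].mean(), Y[:half].mean()
x2, y2 = X[half:].mean(), Y[half:].mean()

# Solve the resulting 2x2 system:
#   alpha_hat + beta_hat * x1 = y1
#   alpha_hat + beta_hat * x2 = y2
beta_hat = (y2 - y1) / (x2 - x1)
alpha_hat = y1 - beta_hat * x1
print(alpha_hat, beta_hat)
```

A different grouping of the equations would give different (though still unbiased) estimates, which is exactly the non-uniqueness noted above.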
What is the Method of Least Squares?
- An estimator is a rule for calculating an estimate of an unknown value using observable data. Mayer’s method gives us a possible estimator but this is not unique.
- An alternative method is to choose estimates of the parameters which minimise the residual sum of squares (RSS):
min over {α(hat), β(hat)}: RSS = Σ_i=1^N (Y{i} - α(hat) - β(hat)X{i})^2
- This is the least-squares estimator or, as it is sometimes referred to, the ordinary least squares (OLS) estimator.
- OLS provides a simple method for the generation of such estimates which, under certain assumptions, can be shown to have the desirable properties that the estimates are both unbiased and efficient (in the sense that they have the lowest possible variance in the class of unbiased estimators). A sketch of the minimisation follows below.
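A sketch of the minimisation itself, assuming the same kind of hypothetical data and using scipy's general-purpose numerical optimiser rather than the closed-form solution derived below:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

def rss(params):
    """Residual sum of squares for a candidate (alpha_hat, beta_hat)."""
    a, b = params
    return np.sum((Y - a - b * X) ** 2)

# Choose the (alpha_hat, beta_hat) pair that minimises the RSS.
result = minimize(rss, x0=[0.0, 0.0])
alpha_hat, beta_hat = result.x
print(alpha_hat, beta_hat)
```

The numerical minimum agrees with the closed-form least-squares estimates given by the normal equations below.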
Who introduced the Method of Least Squares?
This method was first introduced by Legendre in 1805. It improves on Mayer’s method because the variance of the parameter estimates is the lowest possible.
What are the least-squares normal equations?
- Minimising the residual sum of squares yields the following pair of equations, known as the least-squares normal equations:
α(hat)N + β(hat)ΣX{i} = ΣY{i}
α(hat)ΣX{i} + β(hat)ΣX{i}^2 = ΣX{i}Y{i}
where i = 1,…,N. Solving these equations yields the least-squares estimates:
α(hat)= Y(bar) -β(hat)X(bar)
Substituting this into the second equation above gives:
β(hat) = Σ_i=1^N (X{i} - X(bar))(Y{i} - Y(bar)) / Σ_i=1^N (X{i} - X(bar))^2
OR
β(hat) = cov(X,Y)/var(X)
(the 1/N factors in the sample covariance and variance cancel in the ratio)
How can the OLS estimates be calculated?
- Calculate the slope coefficient as the ratio of the sample covariance of X and Y to the sample variance of X –> solve for β(hat)
- Calculate the intercept using the property that the regression line passes through the sample means of the data (X(bar) and Y(bar)) –> solve for α(hat) (as shown in the sketch below)
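A sketch of these two steps, assuming hypothetical data; note that np.cov uses a 1/(N-1) divisor by default, so ddof=1 is passed to np.var to keep the two factors consistent (they cancel in the ratio):

```python
import numpy as np

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

# Step 1: slope = sample covariance of X and Y / sample variance of X.
beta_hat = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

# Step 2: the fitted line passes through (X_bar, Y_bar).
alpha_hat = Y.mean() - beta_hat * X.mean()
print(alpha_hat, beta_hat)
```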
What is the difference between α/β and α(hat)/β(hat)?
- α and β are population parameters
- α(hat) and β(hat) are estimators of the population parameters based on sample data.
- The estimators are random variables because they are constructed from the random variables Y and (possibly) X.
- The population parameters are not random variables. They are unknown/unobservable parameters which we must estimate using the data available.
What do the different parts of the OLS estimators mean?
- Mean of X{i}: X(bar) = Σ_i=1^N X{i} / N
- Mean of Y{i}: Y(bar) = Σ_i=1^N Y{i} / N
- Deviations of X from mean: (X{i} - X(bar)) ∀i
- Deviations of Y from mean: (Y{i} - Y(bar)) ∀i
- Squared deviations of X from mean: (X{i} - X(bar))^2 ∀i
- Squared deviations of Y from mean: (Y{i} - Y(bar))^2 ∀i
- These building blocks are assembled in the sketch below.
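The same building blocks computed directly, assuming hypothetical data as before, and assembled into β(hat):

```python
import numpy as np

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

x_bar, y_bar = X.mean(), Y.mean()   # sample means
dx = X - x_bar                      # deviations of X from its mean
dy = Y - y_bar                      # deviations of Y from its mean

# beta_hat = sum of cross-products / sum of squared X deviations.
beta_hat = np.sum(dx * dy) / np.sum(dx ** 2)
alpha_hat = y_bar - beta_hat * x_bar
```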
What is Maximum Likelihood?
- The method of maximum likelihood is an alternative way to generate estimates of the unknown parameters. It begins by making an assumption about the distribution of the errors.
Y{i}=α + βX{i} + u{i}
u{i} ~ N(0, σ{u}^2), E(u{i}u{j}) = 0 ∀ i≠j
- The errors are assumed to be independent, identically distributed (iid), normal random variables - if the data collected are iid then they are said to be a random sample
- In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model given observations, by finding the parameter values that maximize the likelihood of making the observations given the parameters
- e.g. if we had a set of data which is normally distributed - which values of μ and σ^2 are most likely to have generated the data points that we observed? (see the sketch below)
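A sketch of that example, assuming a simulated normal sample, maximising the log-likelihood numerically (by minimising its negative):

```python
import numpy as np
from scipy.optimize import minimize

# Simulated data: an iid sample from N(5, 4), for illustration only.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)

def neg_log_likelihood(params):
    """Negative log-likelihood of an iid N(mu, sigma^2) sample."""
    mu, sigma2 = params
    n = len(data)
    return 0.5 * n * np.log(2 * np.pi * sigma2) \
           + np.sum((data - mu) ** 2) / (2 * sigma2)

# Maximising the likelihood = minimising the negative log-likelihood.
result = minimize(neg_log_likelihood, x0=[0.0, 1.0],
                  bounds=[(None, None), (1e-6, None)])
mu_hat, sigma2_hat = result.x
# mu_hat is close to the sample mean; sigma2_hat to the mean squared deviation.
```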
What is the PDF for the errors in the Maximum Likelihood model?
f(u{i}) = (1/sqrt(2πσ{u}^2)) * exp(-(Y{i} - α - βX{i})^2 / (2σ{u}^2))
What is the likelihood function?
L(α,β,σ{u}^2) = Π_i=1^N (1/sqrt(2πσ{u}^2)) * exp(-(Y{i} - α - βX{i})^2 / (2σ{u}^2))
- this shows the joint probability of the errors in PDF form
- Taking logarithms of this gives us the log-likelihood function
What is the log-likelihood function?
LL(α,β,σ{u}^2) = -(N/2)ln(2π) - (N/2)ln(σ{u}^2) - Σ_i=1^N (Y{i} - α - βX{i})^2 / (2σ{u}^2)
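A sketch that codes this log-likelihood for hypothetical data and maximises it numerically; under the normality assumption the resulting α(hat) and β(hat) coincide with the OLS estimates:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
N = len(X)

def neg_log_likelihood(params):
    """Negative of LL(alpha, beta, sigma_u^2) as written above."""
    a, b, s2 = params
    resid = Y - a - b * X
    return (N / 2) * np.log(2 * np.pi) + (N / 2) * np.log(s2) \
           + np.sum(resid ** 2) / (2 * s2)

result = minimize(neg_log_likelihood, x0=[0.0, 1.0, 1.0],
                  bounds=[(None, None), (None, None), (1e-6, None)])
alpha_hat, beta_hat, sigma2_hat = result.x
# alpha_hat, beta_hat match OLS; sigma2_hat equals RSS / N.
```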
What does the method of maximum likelihood involve?
The method of maximum likelihood involves choosing estimates of the population parameters which maximise the log-likelihood function.
- The first order conditions for a maximum are:
- dLL/dα = (1/σ{u}^2) Σ_i=1^N (Y{i} - α - βX{i}) = 0
- dLL/dβ = (1/σ{u}^2) Σ_i=1^N X{i}(Y{i} - α - βX{i}) = 0
- dLL/dσ{u}^2 = -N/(2σ{u}^2) + (1/(2(σ{u}^2)^2)) Σ_i=1^N (Y{i} - α - βX{i})^2 = 0
- The first two conditions reproduce the least-squares normal equations, so the ML estimates of α and β coincide with the OLS estimates; the third gives σ(hat){u}^2 = RSS/N.