Introduction to Linear Regression - Reading 4 Flashcards
What is a dependent variable?
the variable whose variation is explained by the independent variable
What are the six main assumptions underlying a simple linear regression?
- A linear relationship exists between the dependent and the independent variables.
- The independent variable is uncorrelated with the residuals.
- The expected value of the residual term is zero.
- The variance of the residual term is constant for all observations.
- The residual term is independently distributed; that is, the residual for one observation is not correlated with that of another observation.
- The residual term is normally distributed.
What is an independent variable?
the variable used to explain the variation of the dependent variable
Which method is used to estimate a simple linear regression?
Ordinary least squares (OLS): choose ^b0 and ^b1 to minimize the sum of squared errors (SSE)
SSE=sum(Yi-^b0-^b1Xi)^2
How is b0 estimated?
^b0=Ym-^b1Xm, where Ym and Xm are the sample means of Y and X
How is b1 estimated?
^b1=Cov(X,Y)/Var(X)
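The two estimators above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function name `fit_simple_ols` and the five data points are made up for the example.

```python
def fit_simple_ols(x, y):
    """Return (b0_hat, b1_hat) for a simple linear regression of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sample covariance and sample variance both divide by n - 1,
    # so the divisor cancels in the ratio.
    cov_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_mean) ** 2 for xi in x) / (n - 1)
    b1_hat = cov_xy / var_x            # slope: Cov(X,Y)/Var(X)
    b0_hat = y_mean - b1_hat * x_mean  # intercept: Ym - b1_hat * Xm
    return b0_hat, b1_hat


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0_hat, b1_hat = fit_simple_ols(x, y)
print(f"b0_hat = {b0_hat:.2f}, b1_hat = {b1_hat:.2f}")
```

For these points Cov(X,Y)=1.5 and Var(X)=2.5, so the slope is 0.6 and the intercept is 4-0.6x3=2.2.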
What must be verified before drawing any conclusions about the coefficients?
Whether the coefficients are statistically significant
What is the standard error of estimate (SEE) and how is it calculated?
Measures the variability of the actual Y-values relative to the estimated Y-values from a regression equation
SEE=[SSE/(n-2)]^0.5
SEE=(MSE)^0.5
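As a rough sketch of the SEE calculation, assuming the coefficients have already been estimated: the helper name `standard_error_of_estimate` and the data are hypothetical, and the coefficients 2.2 and 0.6 are the OLS estimates for this particular data set.

```python
import math


def standard_error_of_estimate(x, y, b0_hat, b1_hat):
    """SEE = sqrt(SSE / (n - 2)) for a simple linear regression."""
    n = len(x)
    # SSE: sum of squared differences between actual and fitted Y values.
    sse = sum((yi - (b0_hat + b1_hat * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)  # two estimated parameters: intercept and slope
    return math.sqrt(mse)


# Hypothetical data; 2.2 and 0.6 are its OLS intercept and slope.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
see = standard_error_of_estimate(x, y, b0_hat=2.2, b1_hat=0.6)
print(f"SEE = {see:.4f}")
```

Here SSE=2.4 and n-2=3, so MSE=0.8 and SEE is its square root.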
What is the coefficient of determination and how is it calculated?
the percentage of the total variation in the dependent variable explained by the independent variable
R^2=(Total Variation-Unexplained Variation)/Total Variation=Explained Variation/Total Variation
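The R^2 ratio can be checked numerically with a short sketch. The function name `r_squared` is an assumption for illustration, and the fitted values are those of the line Y=2.2+0.6X on made-up data.

```python
def r_squared(y, y_fitted):
    """R^2 = (total variation - unexplained variation) / total variation."""
    y_mean = sum(y) / len(y)
    total = sum((yi - y_mean) ** 2 for yi in y)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, y_fitted))
    return (total - unexplained) / total


# Hypothetical actual values and their fitted values from Y = 2.2 + 0.6X
# evaluated at X = 1, 2, 3, 4, 5.
y = [2, 4, 5, 4, 5]
y_fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, y_fitted)
print(f"R^2 = {r2:.2f}")
```

With total variation 6 and unexplained variation 2.4, the independent variable explains 60% of the variation in Y.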
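What is the regression coefficient confidence interval?
An interval estimate for the coefficient: ^b1±(t_c x s_b1), where s_b1 is the standard error of ^b1. Hypothesis testing for a regression coefficient may use this interval: reject H0 if the hypothesized value lies outside it.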
For regression coefficient confidence intervals, what is the appropriate number of degrees of freedom?
n-k-1, where k is the number of independent variables (n-2 for a simple linear regression)
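Formulate a null and alternative hypothesis about a population value of a regression coefficient
H0: b1=b_hypothesis versus Ha: b1≠b_hypothesis
t=(^b1-b_hypothesis)/s_b1, where s_b1 is the standard error of ^b1 (not its variance)
decision rule: reject H0 if t>t_critical or t<-t_critical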
What does rejection of the null mean?
Rejection of the null means that the slope coefficient is significantly different from the hypothesized value of b
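A minimal sketch of the two-tailed slope test follows. Several things here are assumptions for illustration: the function name `slope_t_statistic`, the data, the standard textbook formula s_b1=SEE/sqrt(sum(Xi-Xm)^2) for the slope's standard error (these cards do not state it), and the critical value 3.182, which is the tabulated two-tailed 5% value for 3 degrees of freedom.

```python
import math


def slope_t_statistic(x, y, b_hypothesis=0.0):
    """t = (b1_hat - b_hypothesis) / s_b1 for a simple linear regression."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_mean - b1_hat * x_mean
    sse = sum((yi - (b0_hat + b1_hat * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))
    s_b1 = see / math.sqrt(sxx)  # standard error of the slope (assumed formula)
    return (b1_hat - b_hypothesis) / s_b1


# Hypothetical data: n = 5, so df = n - 2 = 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t_stat = slope_t_statistic(x, y, b_hypothesis=0.0)

T_CRITICAL = 3.182  # assumed: t-table value, two-tailed 5%, df = 3
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```

In this small sample the slope of 0.6 has a t-statistic of about 2.12, which does not exceed 3.182, so H0: b1=0 is not rejected at the 5% level.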
What is the confidence interval for the predicted value?
^Y±(t_c x s_f), where s_f is the standard error of the forecast
How is the variance of the forecast error calculated?
s_f^2=SEE^2 x {1+(1/n)+[(X-Xm)^2/((n-1) x s_x^2)]}
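The forecast variance and the resulting interval can be sketched as below. The function name `forecast_interval`, the data, the forecast point X=6, and the critical value 3.182 (t-table, two-tailed 5%, 3 degrees of freedom) are all assumptions made for the example.

```python
import math


def forecast_interval(x, y, x_new, t_critical):
    """Return (y_hat, s_f_squared, lower, upper) for a forecast at x_new."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)  # equals (n - 1) * s_x^2
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_mean - b1_hat * x_mean
    sse = sum((yi - (b0_hat + b1_hat * xi)) ** 2 for xi, yi in zip(x, y))
    see_squared = sse / (n - 2)
    y_hat = b0_hat + b1_hat * x_new
    # s_f^2 = SEE^2 * {1 + 1/n + (X - Xm)^2 / [(n - 1) * s_x^2]}
    s_f_squared = see_squared * (1 + 1 / n + (x_new - x_mean) ** 2 / sxx)
    half_width = t_critical * math.sqrt(s_f_squared)
    return y_hat, s_f_squared, y_hat - half_width, y_hat + half_width


# Hypothetical data and forecast point.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat, s_f_squared, lower, upper = forecast_interval(x, y, x_new=6, t_critical=3.182)
print(f"forecast = {y_hat:.2f}, s_f^2 = {s_f_squared:.2f}, "
      f"interval = [{lower:.2f}, {upper:.2f}]")
```

The (X-Xm)^2 term makes the interval wider the further the forecast point lies from the sample mean of X.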
What is ANOVA?
Analysis of variance (ANOVA) is a statistical procedure for dividing the total variability of a variable into components that can be attributed to different sources.
What is the Total Sum of Squares (SST)?
Measures the total variation in the dependent variable
SST=sum(Yi-Ym)^2
What is the Regression Sum of Squares (RSS)?
Measures the variation in the dependent variable that is explained by the independent variable
RSS=sum(^Yi-Ym)^2
What is the decomposition of total variation?
SST=RSS+SSE
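The decomposition can be verified numerically with a short sketch. The function name `variation_components` and the data are hypothetical; the identity holds for a regression fitted by least squares with an intercept.

```python
def variation_components(x, y):
    """Return (SST, RSS, SSE) for a simple linear regression of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_mean - b1_hat * x_mean
    fitted = [b0_hat + b1_hat * xi for xi in x]
    sst = sum((yi - y_mean) ** 2 for yi in y)             # total variation
    rss = sum((fi - y_mean) ** 2 for fi in fitted)        # explained variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    return sst, rss, sse


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, rss, sse = variation_components(x, y)
print(f"SST = {sst:.2f}, RSS = {rss:.2f}, SSE = {sse:.2f}")
```

For this data the total variation of 6 splits into 3.6 explained and 2.4 unexplained.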
What is the F-test? How is the F-statistic calculated? How many df are in the numerator and in the denominator? Is it a two-tailed or a one-tailed test?
An F-test assesses how well a set of independent variables, as a group, explains the variation in the dependent variable.
F=MSR/MSE=(RSS/k)/[SSE/(n-k-1)]
df_numerator=k
df_denominator=n-k-1
This is always a one-tailed test.
Decision rule: reject H0 if F>F_c
What is another way to calculate the F statistic more easily for a simple linear regression?
F=(t_b1)^2
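A quick numerical check of this shortcut, as a sketch only: the function name `f_and_t_squared` and the data are hypothetical, t_b1 is taken to be the t-statistic for H0: b1=0, and the slope's standard error uses the standard formula s_b1=sqrt(MSE/sum(Xi-Xm)^2), which these cards do not state.

```python
import math


def f_and_t_squared(x, y):
    """Return (F, t_b1 squared) for a simple linear regression (k = 1)."""
    n = len(x)
    k = 1  # one independent variable
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx
    b0_hat = y_mean - b1_hat * x_mean
    fitted = [b0_hat + b1_hat * xi for xi in x]
    rss = sum((fi - y_mean) ** 2 for fi in fitted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    msr = rss / k
    mse = sse / (n - k - 1)
    f_stat = msr / mse
    s_b1 = math.sqrt(mse / sxx)  # standard error of the slope (assumed formula)
    t_b1 = b1_hat / s_b1         # t-statistic for H0: b1 = 0
    return f_stat, t_b1 ** 2


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
f_stat, t_squared = f_and_t_squared(x, y)
print(f"F = {f_stat:.4f}, t^2 = {t_squared:.4f}")
```

Both routes give 4.5 here, since MSR=3.6 and MSE=0.8.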
What are the three main limitations of regression analysis?
The main limitations of regression analysis include the following:
- Parameter instability (especially when dealing with economic and financial variables).
- The limited usefulness of regression models in identifying profitable investment strategies based on publicly available information.
- The possibility of violating the assumptions underlying regression analysis (for example, heteroskedasticity and autocorrelation).