Week 7 Regression Flashcards
The Regression Equation
Y^ = bX + a
where Y^ = the predicted or estimated Y
b = the slope
a = the intercept
Remember that the regression equation is derived by fitting a line to the data, so its predictions are not 100% accurate. It also does not establish cause and effect; it describes only a correlation.
Other situations may be better served by more complex equations, for example when the data appear curvilinear.
The question is not whether a straight line can be drawn through the data, but how accurately it fits and how representative it is as a predictor.
b (the slope)
The slope of the regression line tells us how much the predicted score (Y^) differs for a one-unit difference in the predictor variable (X).
b = COVxy / S²x (the covariance of X and Y divided by the variance of X)
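As a rough illustration of the slope formula, the sketch below computes COVxy and S²x for a small data set and divides one by the other. The data values and the helper name covariance are invented for this example and are not part of the course material.

```python
from statistics import mean, variance

# Hypothetical sample data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def covariance(xs, ys):
    """Sample covariance of xs and ys (N - 1 in the denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys)) / (len(xs) - 1)

cov_xy = covariance(x, y)  # COVxy
var_x = variance(x)        # S2x, the sample variance of X
b = cov_xy / var_x         # slope of the regression line

print(cov_xy, var_x, b)
```

For these made-up numbers the slope is about 0.6, so each one-unit difference in X goes with a predicted difference of roughly 0.6 in Y.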
a (the intercept)
The intercept tells us the predicted value of Y when X = 0. In some situations X = 0 is nonsensical, but the intercept must still be calculated in order to obtain the regression equation.
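One common way to obtain the intercept is a = (mean of Y) - b(mean of X), which forces the line through the point of means. The sketch below assumes that formula and reuses invented data; the function name predict is hypothetical.

```python
from statistics import mean

# Hypothetical sample data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)

# Slope: the (N - 1) terms in COVxy and S2x cancel, so sums of deviations suffice.
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)

# Intercept: the line passes through (mean of X, mean of Y).
a = my - b * mx

def predict(x_value):
    """Predicted Y (Y^) for a given X, using Y^ = bX + a."""
    return b * x_value + a

print(a, predict(0), predict(mx))
```

Here predict(0) returns the intercept itself, and predict(mx) returns the mean of Y.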
Centring
The data are artificially centred on the mean by subtracting the mean of X (X-bar) from every X value. The slope remains the same, and the intercept becomes the predicted Y at the mean of X.
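A quick check of the centring claim, as a sketch with invented data: fitting the line before and after subtracting the mean of X should give the same slope, while the intercept moves to the mean of Y. The helper fit_line is a hypothetical name used only here.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    mx, my = mean(xs), mean(ys)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    ss_x = sum((xi - mx) ** 2 for xi in xs)
    slope = sp / ss_x
    return slope, my - slope * mx

# Hypothetical sample data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Centre X by subtracting its mean from every value.
x_centred = [xi - mean(x) for xi in x]

b_raw, a_raw = fit_line(x, y)
b_cen, a_cen = fit_line(x_centred, y)

print("raw:    ", b_raw, a_raw)
print("centred:", b_cen, a_cen)
```

With centred data, X = 0 corresponds to the mean of X, so the intercept has a sensible interpretation even when a raw X of zero would not.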
Smoothing
Smoothing techniques estimate Y at a target value of the predictor by averaging the Y values of observations whose X values lie close to that target.
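There are many smoothing techniques, and the card does not say which one is meant. The sketch below assumes the simplest kind, a running mean: for each target X it averages the Y values of observations within a fixed distance. The data, the function name smooth_at and the half_width parameter are all invented for illustration.

```python
from statistics import mean

# Hypothetical sample data, invented purely for illustration.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2, 4, 5, 4, 5, 7, 8]

def smooth_at(target, xs, ys, half_width=1):
    """Mean of the Y values whose X lies within half_width of target."""
    nearby = [yi for xi, yi in zip(xs, ys) if abs(xi - target) <= half_width]
    if not nearby:
        raise ValueError("no observations near the target value")
    return mean(nearby)

# Smoothed Y at each observed X value.
smoothed = [smooth_at(xi, x, y) for xi in x]
print(smoothed)
```

A wider half_width gives a smoother curve but follows local changes in the data less closely.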
Standard Error of the Estimate
Recall that with two variables we determined how much of the variation in one could be explained by the other (r²), and therefore that some variation is due to other factors. It follows that any Y^ carries some degree of error.
Thus, SE = Sy × √(1 − r²)
where SE = the standard error of the estimate
and Sy = the standard deviation of Y.
The standard error tells us how far off our predictions are, on average.
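The formula can be tried out with a short sketch like the one below, which uses invented data and applies the card's formula exactly as written. Some texts multiply by an additional small-sample correction; that is not applied here.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # sum of cross-products
ss_x = sum((xi - mx) ** 2 for xi in x)                   # sum of squares for X
ss_y = sum((yi - my) ** 2 for yi in y)                   # sum of squares for Y

r = sp / sqrt(ss_x * ss_y)   # Pearson correlation
s_y = stdev(y)               # Sy, the sample standard deviation of Y
se = s_y * sqrt(1 - r ** 2)  # standard error of the estimate, per the card

print(r ** 2, s_y, se)
```

The closer r² is to 1, the smaller SE becomes; with r² = 0 the standard error is simply Sy.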
Confidence limits on the prediction
CIy = Y^ ± t(α, df) × SE
where CIy = the confidence limits around the predicted Y value (confidence limits are usually set at 95%),
t(α, df) = the critical value of t, with α = alpha (usually 0.05) and df = degrees of freedom (N − 2),
and SE = the standard error of the estimate.
Thus, given X, we can predict Y^, and Y^ ± (t critical)(SE)
gives us the 95% confidence range within which we believe the Y score will fall, given X.
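Putting the pieces together, the sketch below works through the confidence limits for one predicted value. The data are invented, the critical value 3.182 is the two-tailed t for alpha = 0.05 with df = 3 as read from a standard t table, and SE is used directly, as on the card.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
ss_x = sum((xi - mx) ** 2 for xi in x)
ss_y = sum((yi - my) ** 2 for yi in y)

b = sp / ss_x                        # slope
a = my - b * mx                      # intercept
r = sp / sqrt(ss_x * ss_y)           # correlation
se = stdev(y) * sqrt(1 - r ** 2)     # standard error of the estimate

df = len(x) - 2                      # N - 2 = 3
t_critical = 3.182                   # assumed: two-tailed t, alpha = 0.05, df = 3 (t table)

x_new = 3                            # the X value we want a prediction for
y_hat = b * x_new + a                # Y^
margin = t_critical * se
lower, upper = y_hat - margin, y_hat + margin

print(f"Y^ = {y_hat:.3f}, 95% limits = {lower:.3f} to {upper:.3f}")
```

For a different sample size the critical value changes with df, so it would need to be looked up again rather than reused.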