7. Regression Flashcards
Least Squares Regression Equation
Y’ = bX + a
Y’ = the predicted value
X = the known value
a and b = numbers calculated from the original correlation analysis.
b = r √(SSy / SSx)
a = Ȳ – bX̄
Give the 5 steps for determining the least squares regression equation
- Determine values of SSx, SSy, and r by referring to the original correlation analysis.
- Substitute numbers into the formula and solve for b.
- Assign values to X̄ and Ȳ by referring to the original correlation analysis.
- Substitute numbers into the formula and solve for a.
- Substitute numbers for b and a in the least squares regression equation.
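The five steps above can be sketched in Python. This is a minimal illustration, not a standard library routine: the function name and the `decimals` parameter are assumptions made for this sketch, and the input values are the ones used in the reading-time example in these flashcards.

```python
import math

def least_squares_coefficients(r, ss_x, ss_y, mean_x, mean_y, decimals=None):
    """Return (b, a) for Y' = bX + a from correlation summary statistics.

    `decimals` is a hypothetical convenience: when given, b is rounded
    before a is computed, which mimics a hand calculation.
    """
    b = r * math.sqrt(ss_y / ss_x)      # slope: b = r * sqrt(SSy / SSx)
    if decimals is not None:
        b = round(b, decimals)
    a = mean_y - b * mean_x             # intercept: a = mean(Y) - b * mean(X)
    return b, a

b, a = least_squares_coefficients(r=0.30, ss_x=25, ss_y=50,
                                  mean_x=13, mean_y=8, decimals=2)
print(f"Y' = {b:.2f}X + {a:.2f}")       # Y' = 0.42X + 2.54
```

If b is not rounded first, a comes out near 2.48 instead of 2.54. The gap is rounding error only, not a different method.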
Least Squares Regression Equation
The equation that minimizes the total of all squared prediction errors for known Y scores in the original correlation analysis.
Assume that an r of .30 describes the relationship between educational level (highest grade completed) and estimated number of hours spent reading each week. More specifically:
Educational Level (X): X̄ = 13, SSx = 25
Weekly Reading Time (Y): Ȳ = 8, SSy = 50
r = .30
Determine the least squares equation for predicting weekly reading time from educational level.
b = r √(SSy / SSx) = .30 √(50 / 25) = .42
a = Ȳ – bX̄ = 8 – (.42)(13) = 2.54
Y’ = bX + a = .42X + 2.54
Assume that an r of .30 describes the relationship between educational level (highest grade completed) and estimated number of hours spent reading each week. More specifically:
Educational Level (X): X̄ = 13, SSx = 25
Weekly Reading Time (Y): Ȳ = 8, SSy = 50
r = .30
Faith’s education level is 15. What is her predicted reading time?
Y’ = .42X + 2.54
Y’ = (.42)(15) + 2.54 = 8.84
Assume that an r of .30 describes the relationship between educational level (highest grade completed) and estimated number of hours spent reading each week. More specifically:
Educational Level (X): X̄ = 13, SSx = 25
Weekly Reading Time (Y): Ȳ = 8, SSy = 50
r = .30
Keegan’s educational level is 11. What is his predicted reading time?
Y’ = .42X + 2.54
Y’ = (.42)(11) + 2.54 = 7.16
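The two predictions above can be reproduced with a short Python sketch. The function name is an assumption made for illustration. The coefficients .42 and 2.54 are the rounded values from the regression equation in these flashcards.

```python
def predict_reading_time(education_level, b=0.42, a=2.54):
    """Predicted weekly reading hours: Y' = bX + a."""
    return b * education_level + a

for name, x in [("Faith", 15), ("Keegan", 11)]:
    print(f"{name}: {predict_reading_time(x):.2f} hours")
# Faith: 8.84 hours
# Keegan: 7.16 hours

# Sanity check: at the mean of X (13), the prediction is the mean of Y (8).
print(f"{predict_reading_time(13):.2f}")   # 8.00
```

The last line reflects a general property: the least squares regression line always passes through the point (X̄, Ȳ).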
Standard Error of Estimate (Definition Formula)
Sy|x = √[SSy|x / (n – 2)]
= √[∑(Y – Y’)² / (n – 2)]
Standard Error of Estimate (Computation Formula)
Sy|x = √[SSy (1 – r²) / (n – 2)]
SSy = ∑Y² – (∑Y)²/n
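The definition and computation formulas give the same answer when applied to the same data. The Python sketch below checks this on five (X, Y) pairs that are invented purely for illustration. They do not come from the reading-time example, and the variable names are assumptions.

```python
import math

# Hypothetical (X, Y) pairs, made up for this illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)
sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
r = sp_xy / math.sqrt(ss_x * ss_y)

# Least squares regression equation: Y' = bX + a
b = r * math.sqrt(ss_y / ss_x)
a = mean_y - b * mean_x

# Definition formula: based on the squared prediction errors (Y - Y')
ss_y_given_x = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
se_definition = math.sqrt(ss_y_given_x / (n - 2))

# Computation formula: based on SSy and r only
se_computation = math.sqrt(ss_y * (1 - r ** 2) / (n - 2))

print(round(se_definition, 4), round(se_computation, 4))
```

The two printed values match because SSy|x equals SSy (1 – r²) for the least squares line.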
Standard Error of Estimate (Sy|x)
A rough measure of the average amount of predictive error.
Give the 2 steps for the calculation of the standard error of estimate, Sy|x
- Assign values to SSy and r by referring to previous work with the least squares regression equation.
- Substitute numbers into the formula and solve for Sy|x
Calculate the standard error of estimate assuming that the correlation of .30 is based on n = 35 pairs of observations and supply a rough interpretation of the standard error of estimate.
Educational Level (X): X̄ = 13, SSx = 25
Weekly Reading Time (Y): Ȳ = 8, SSy = 50
r = .30
Sy|x = √[SSy (1 – r²) / (n – 2)]
Sy|x = √[50 (1 – .30²) / (35 – 2)] = √[50 (.91) / 33] = √(45.5 / 33) = √1.38 = 1.17
On average, predictions of weekly reading time are in error by roughly 1.17 hours.
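The same calculation as a short Python sketch, using the computation formula. The function name is an assumption made for illustration. The inputs (SSy = 50, r = .30, n = 35) are the values given in the card.

```python
import math

def standard_error_of_estimate(ss_y, r, n):
    """Sy|x = sqrt(SSy * (1 - r^2) / (n - 2))."""
    return math.sqrt(ss_y * (1 - r ** 2) / (n - 2))

se = standard_error_of_estimate(ss_y=50, r=0.30, n=35)
print(round(se, 2))   # 1.17
```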
Squared Correlation Coefficient (r²)
The proportion of the total variability in one variable that is predictable from its relationship with the other variable.
r² interpretation
r² = SSy’ / SSy = (SSy – SSy|x) / SSy
r² (computation formula)
r² = [SPxy / √(SSx SSy)]²
Assume that an r of .30 describes the relationship between educational level and estimated hours spent reading each week.
According to r², what percent of the variability in weekly reading time is predictable from its relationship with educational level?
9% predictable (r² = .30² = .09)
Assume that an r of .30 describes the relationship between educational level and estimated hours spent reading each week.
What percent of variability in weekly reading time is not predictable from this relationship?
91% not predictable (1 – r² = 1 – .09 = .91)
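Both percentages follow directly from r², as this short Python sketch shows. The variable names are assumptions. The cross-check uses SSy = 50 from the reading-time example, together with SSy|x = SSy (1 – r²).

```python
r = 0.30
ss_y = 50                                # from the reading-time example

r_squared = r ** 2                       # proportion of predictable variability
percent_predictable = 100 * r_squared
percent_not_predictable = 100 * (1 - r_squared)

# Cross-check: r^2 = (SSy - SSy|x) / SSy
ss_y_given_x = ss_y * (1 - r_squared)    # unpredictable variability
r_squared_check = (ss_y - ss_y_given_x) / ss_y

print(f"{percent_predictable:.0f}% predictable")          # 9% predictable
print(f"{percent_not_predictable:.0f}% not predictable")  # 91% not predictable
```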