Lecture 9 Flashcards
What is the Regression equation used for?
Used to describe the linear relationship between X and Y
What is the one assumption of Regression?
- X and Y are linearly related.
The assumptions of Regression are NOT the same as ANOVA.
Give the equation for Y’ (Y prime, i.e. predicted Y)
Y’ = a + bX
Which of X or Y is the dependent variable? Which is the independent variable?
Y is always the dependent variable.
X is always the independent variable.
What if you have two dependent variables?
The score that you are trying to predict goes on the Y axis.
Explain the “least squares line”
The method for determining a and b makes Σe² (i.e. the sum of the squared deviations of Y from Y') as small as possible. The regression line is therefore often called the “least squares line”.
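A minimal sketch of the least squares idea, using a small data set invented purely for illustration. It assumes a = 2.2 and b = 0.6, which are the least squares values for this particular data set, and checks that nudging either value makes Σe² larger. The function name `sum_squared_errors` is hypothetical.

```python
# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_squared_errors(a, b):
    """Return Σe², where e = Y - Y' and Y' = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Σe² at the least squares values of a and b for this data set.
best = sum_squared_errors(2.2, 0.6)

# Σe² after nudging the intercept or the slope a little in each direction.
nudged = [
    sum_squared_errors(2.2 + da, 0.6 + db)
    for da, db in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]
]

print("Σe² at the least squares line:", round(best, 4))
print("Σe² after nudging a or b:", [round(v, 4) for v in nudged])
```

Every nudged line should give a larger Σe² than the least squares line, which is what "as small as possible" means here.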
Give the equation for the regression line, and explain each component part.
Y' = a + bX, where:
- Y' = predicted Y
- a = intercept, i.e. the value of Y' when X = 0; a = Ybar - b(Xbar)
- b = regression coefficient, i.e. the slope of the least squares line
b = σxy/σ²x = covariance of X and Y / variance of X
b = [Σ(X - Xbar)(Y - Ybar)/(n - 1)] / [Σ(X - Xbar)²/(n - 1)]
The (n - 1) terms cancel each other out, so we end up with the slope of the line: b = Σ(X - Xbar)(Y - Ybar) / Σ(X - Xbar)²
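The formulas for a and b can be sketched in a few lines of Python. The data set is invented for illustration, and the names `cov_xy`, `var_x`, and `predict` are hypothetical helpers, not part of the lecture.

```python
# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Covariance of X and Y, and variance of X (both divide by n - 1).
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)

b = cov_xy / var_x       # slope: covariance / variance of X
a = y_bar - b * x_bar    # intercept: Ybar - b(Xbar)


def predict(x):
    """Return Y' = a + bX for a given X."""
    return a + b * x


print(f"b = {b:.2f}, a = {a:.2f}")
print(f"Y' when X = 6: {predict(6):.2f}")
```

Because the two (n - 1) divisors cancel, dividing the raw sums directly would give the same slope.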
Give more detail for b (the slope).
The numerator tells us how much X and Y covary, i.e. the degree of statistical association.
The denominator adjusts the covariance (i.e. the rate of change information) so that it corresponds to a unit increase in X.
i.e. the slope indicates how many units Y' changes for every one-unit increase in X.
How are ANOVA and Regression similar?
They are mathematically identical. Compare and contrast in notes.
How do we calculate SS for a regression?
Conceptually, as in ANOVA, we sum the squared deviations.
- In practice we use computational formulas that do not actually involve calculating the deviations. Calculating SS in regression is easier to follow if we exploit our knowledge of the Pearson r.
How do we find SSreg?
r²xy = variability in Y attributable to X / total variability in Y
r²xy = Σ(Y' - Ybar)² / Σ(Y - Ybar)²
r²xy = SSreg / SSy
with algebra, we get:
SSreg = (r²xy)(SSy)
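As a rough check of this shortcut, the sketch below computes SSreg two ways on a small invented data set: directly as Σ(Y' - Ybar)², and as r² times SSy. The variable names are hypothetical.

```python
import math

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sp_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # sum of cross products
ss_x = sum((x - x_bar) ** 2 for x in xs)
ss_y = sum((y - y_bar) ** 2 for y in ys)

# Pearson r from the sums of squares and cross products.
r = sp_xy / math.sqrt(ss_x * ss_y)

# Shortcut: SSreg = r² * SSy.
ss_reg_shortcut = r ** 2 * ss_y

# Direct: SSreg = Σ(Y' - Ybar)², using the least squares a and b.
b = sp_xy / ss_x
a = y_bar - b * x_bar
ss_reg_direct = sum((a + b * x - y_bar) ** 2 for x in xs)

print(f"r² = {r ** 2:.3f}")
print(f"SSreg (shortcut) = {ss_reg_shortcut:.3f}")
print(f"SSreg (direct)   = {ss_reg_direct:.3f}")
```

The two values should agree apart from floating-point rounding.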
Explain SSresidual i.e. SSres
If r²xy = the proportion of variability in Y attributable to X, then 1 - r²xy must = the proportion of variability in Y NOT attributable to X, i.e. residual variability.
Therefore SSres = (1 - r²xy)(SSy)
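A short sketch, again on invented data, of how SSy splits into SSreg and SSres. It compares the shortcut SSres = (1 - r²)(SSy) with the direct sum of squared residuals Σ(Y - Y')². All names are hypothetical.

```python
# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sp_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
ss_x = sum((x - x_bar) ** 2 for x in xs)
ss_y = sum((y - y_bar) ** 2 for y in ys)

# r² without taking a square root: (sum of cross products)² / (SSx * SSy).
r2 = sp_xy ** 2 / (ss_x * ss_y)

ss_reg = r2 * ss_y         # variability attributable to X
ss_res = (1 - r2) * ss_y   # variability NOT attributable to X

# Direct check: residuals are Y - Y' with Y' = a + bX.
b = sp_xy / ss_x
a = y_bar - b * x_bar
ss_res_direct = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

print(f"SSreg = {ss_reg:.3f}, SSres = {ss_res:.3f}, SSy = {ss_y:.3f}")
print(f"SSres (direct) = {ss_res_direct:.3f}")
```

SSreg and SSres should add up to SSy, since r² and 1 - r² add up to 1.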
This is the computational table.
Explain the F table as it relates to Regression.
Once we calculate the degrees of freedom, we can use the F ratio to test whether the statistical association could simply be due to chance.
dfreg = 1
dfres = n - 2
Describe the equation for the F ratio.
F = (SSreg/dfreg) / (SSres/dfres)
F = [(r²xy)(SSy)/dfreg] / [(1 - r²xy)(SSy)/dfres]
SSy appears in both the numerator and the denominator, so it cancels out.
Important:
F = (r²xy/dfreg) / [(1 - r²xy)/dfres]
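A minimal sketch of the F ratio computed from r² and n alone, with dfreg = 1 and dfres = n - 2. The function name `f_ratio_from_r2` and the example values (r² = 0.6, n = 5, SSy = 6) are assumptions made for illustration.

```python
def f_ratio_from_r2(r2, n):
    """Return F = (r²/dfreg) / ((1 - r²)/dfres) for simple regression."""
    df_reg = 1
    df_res = n - 2
    return (r2 / df_reg) / ((1 - r2) / df_res)


# Hypothetical example values.
r2, n, ss_y = 0.6, 5, 6.0

f_short = f_ratio_from_r2(r2, n)

# Long form with SSy left in; SSy cancels, so the result should match.
f_long = ((r2 * ss_y) / 1) / (((1 - r2) * ss_y) / (n - 2))

print(f"F (from r² only) = {f_short:.3f}")
print(f"F (from SS)      = {f_long:.3f}")
```

To judge significance, the resulting F would be compared with the critical value from an F table at (1, n - 2) degrees of freedom. For these example values that is df = (1, 3), where the critical value at α = .05 is about 10.13.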
Discuss “Test Significance of Slope”
- The slope (b) we estimate from our data could simply be due to chance.
- The significance test for slope is the F ratio, or the F test.