3b Scatterplots and Regression Flashcards
LSRL
How do you find r²?
Usually: it’s on the computer printout.
Sometimes: the problem gives you r and you just square it.
Sometimes: the problem gives you the coefficient of determination directly (that’s just the fancy name for r²).
Coefficient of Determination
r²
written as a decimal/percentage
Interpretation: “(r² as a percent) of the variation in (y-context) is explained by the LSRL relating (y-context) to (x-context).”
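A minimal sketch of where r² comes from, assuming made-up truck data (the numbers and variable names below are hypothetical, not from any problem). It computes r² two ways: by squaring r, and as 1 minus the fraction of the variation in y left unexplained by the LSRL. The two should agree.

```python
import math

# Hypothetical data: miles driven (thousands) and sales price (thousands of dollars).
miles = [10, 20, 30, 40, 50]
price = [38, 35, 30, 28, 22]

n = len(miles)
mean_x = sum(miles) / n
mean_y = sum(price) / n
sxx = sum((x - mean_x) ** 2 for x in miles)
syy = sum((y - mean_y) ** 2 for y in price)  # total variation in y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, price))

# Way 1: square the correlation r.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Way 2: 1 minus (variation left over after fitting the LSRL) / (total variation).
slope = sxy / sxx
intercept = mean_y - slope * mean_x
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(miles, price))
r_squared_alt = 1 - sse / syy

print(f"r = {r:.3f}")
print(f"r^2 = {r_squared:.3f} (squaring r), {r_squared_alt:.3f} (1 - SSE/total)")
```

For this data, about 98% of the variation in sales price is explained by the LSRL relating sales price to miles driven.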
Why is it called a least-squares regression line?
Of all possible lines of fit, it minimizes the sum of the squared residuals. Remember: a residual is the actual y-value minus the predicted y-value (a vertical distance, not a horizontal one).
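The "least squares" idea can be checked numerically. The sketch below uses hypothetical data and the helper names `lsrl` and `sum_sq_residuals` (both invented for illustration). It fits the LSRL, then nudges the intercept and slope to show that every nearby line has a larger sum of squared residuals.

```python
def lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sum_sq_residuals(xs, ys, intercept, slope):
    """Sum of (actual y - predicted y)^2 for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: miles driven (thousands) and sales price (thousands of dollars).
miles = [10, 20, 30, 40, 50]
price = [38, 35, 30, 28, 22]

a, b = lsrl(miles, price)
best = sum_sq_residuals(miles, price, a, b)

# Try lines with a slightly different intercept and/or slope.
others = [
    sum_sq_residuals(miles, price, a + da, b + db)
    for da in (-1, 0, 1)
    for db in (-0.1, 0, 0.1)
    if (da, db) != (0, 0)
]

print(f"LSRL: predicted price = {a:.2f} + ({b:.2f})(miles)")
print(f"sum of squared residuals for the LSRL: {best:.2f}")
print(f"smallest value among the nudged lines: {min(others):.2f}")
```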
How do you find and interpret s (the standard deviation of the residuals)?
It has to be given in the problem (usually in the lower left of a computer printout).
Interpretation:
“The actual (y-context) is typically about (s units) away from the number predicted by the LSRL.”
OR “When using the LSRL to predict (y-context) we will typically be off by (s units).”
Example: “The actual sales price of a truck is typically about $1,000 away from the price predicted by the LSRL relating sales price and miles driven.” OR “When predicting sales price from miles driven, we will typically be off by about $1,000.”
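If you ever want to see where the number on the printout comes from, here is a rough sketch using hypothetical truck data. It assumes the usual formula s = sqrt(sum of squared residuals / (n − 2)), which is not stated on these cards.

```python
import math

# Hypothetical data: miles driven (thousands) and sales price (thousands of dollars).
miles = [10, 20, 30, 40, 50]
price = [38, 35, 30, 28, 22]

n = len(miles)
mean_x = sum(miles) / n
mean_y = sum(price) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, price))
    / sum((x - mean_x) ** 2 for x in miles)
)
intercept = mean_y - slope * mean_x

# Residual = actual y - predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(miles, price)]

# Standard deviation of the residuals (divide by n - 2 for a line with two estimated values).
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"s = {s:.3f} thousand dollars")
```

Here s is about 1.02, so when predicting sales price from miles driven with this made-up data, we would typically be off by about $1,000.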
How do you read a computer printout for LSRL problems?
How do you answer a question that asks about how confident you are about a prediction?
First, if the x-value you are predicting from is far outside the range of the x-values in the data, this is extrapolation and we shouldn’t be confident.
Otherwise, if the fit is strong (r close to 1 or −1, so r² close to 1), then you can be more confident in your predictions.
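A small sketch of the extrapolation check. It simplifies "far from the data" to "outside the range of observed x-values"; the function name `is_extrapolation` and the data are hypothetical.

```python
def is_extrapolation(x_new, xs):
    """Return True when x_new lies outside the range of the observed x-values."""
    return x_new < min(xs) or x_new > max(xs)

# Hypothetical observed x-values: miles driven (thousands).
miles = [10, 20, 30, 40, 50]

inside = is_extrapolation(35, miles)    # 35 is within 10 to 50
outside = is_extrapolation(120, miles)  # 120 is well beyond 50

print(f"Predicting at 35: extrapolation? {inside}")
print(f"Predicting at 120: extrapolation? {outside}")
```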
If you switch the x and y (explanatory and response) what would change?
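The correlation r and r² stay the same, but the LSRL will be completely different. If the slope was positive it will still be positive, but it won’t be the same number. The y-intercept will change too, and so will s (the standard deviation of the residuals), because the residuals are now measured in the units of the new response variable.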
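To see this with numbers, the sketch below fits the LSRL both ways on hypothetical data (the helper `fit` is invented for illustration). The correlation matches in both directions, while the slope and y-intercept do not. As a side fact, the two slopes multiply to r².

```python
import math

def fit(xs, ys):
    """Return (intercept, slope, r) for the LSRL that predicts ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

# Hypothetical data: miles driven (thousands) and sales price (thousands of dollars).
miles = [10, 20, 30, 40, 50]
price = [38, 35, 30, 28, 22]

a1, b1, r1 = fit(miles, price)  # explanatory = miles, response = price
a2, b2, r2 = fit(price, miles)  # switched: explanatory = price, response = miles

print(f"price from miles: intercept {a1:.2f}, slope {b1:.3f}, r {r1:.3f}")
print(f"miles from price: intercept {a2:.2f}, slope {b2:.3f}, r {r2:.3f}")
print(f"product of slopes {b1 * b2:.4f} vs r^2 {r1 ** 2:.4f}")
```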
In regression, what is the difference between an outlier and a high-leverage point?
An outlier has a large residual (it sits far above or below the line). A high-leverage point has an x-value far from the rest of the data (far to the right or left).
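A rough sketch that separates the two ideas on hypothetical data. Residual size flags the outlier. For leverage it uses the standard simple-regression formula h = 1/n + (x − x̄)² / Σ(x − x̄)², which goes beyond these cards and only measures how unusual a point's x-value is.

```python
# Hypothetical data: (4, 12.0) sits well above the pattern; (20, 40.0) is far to the right.
xs = [1, 2, 3, 4, 5, 6, 20]
ys = [2.1, 3.9, 6.2, 12.0, 9.8, 12.1, 40.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x

# Residual = actual y - predicted y (vertical distance from the line).
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Leverage depends only on how far each x is from the mean of the x-values.
leverages = [1 / n + (x - mean_x) ** 2 / sxx for x in xs]

outlier_idx = max(range(n), key=lambda i: abs(residuals[i]))
leverage_idx = max(range(n), key=lambda i: leverages[i])

print(f"largest residual: point ({xs[outlier_idx]}, {ys[outlier_idx]})")
print(f"highest leverage: point ({xs[leverage_idx]}, {ys[leverage_idx]})")
```

In this made-up data the point with the largest residual and the point with the highest leverage are different points, which is the distinction the card is making.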