3 Scatterplots and Regression Flashcards
Interpreting Correlation
Correlation is just r. It gives the direction and strength of a linear relationship. Ex. r = 0.7 means there is a strong, positive linear relationship between the variables. NOTE: r should only be used when the data are roughly linear, but knowing r alone (even if it is high) doesn’t guarantee linearity. Always check the scatterplot.
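To illustrate the NOTE on this card, here is a minimal Python sketch. It uses made-up points that lie exactly on the curve y = x² (so the form is clearly not linear) and shows that r still comes out high. The helper name `correlation` and the data are assumptions for illustration, not part of the original cards.

```python
import math

def correlation(xs, ys):
    """Correlation r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: every point sits on the curve y = x^2 (curved, not linear)
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(round(r, 2))  # r is about 0.97 even though the form is curved
```

A high r here says nothing about the curve in the plot, which is why the scatterplot has to be checked first.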
Explanatory Variable
x, input, independent variable. Usually, this is what is changed. In an experiment, the treatments are the explanatory variable. Ex. The number of rubber bands explains the distance traveled.
Response Variable
y, output, dependent variable. This is usually what is measured as a result of changes in the explanatory variable. Ex. We added rubber bands and measured the distance traveled (response variable).
Describe a relationship or scatterplot
DUFS (direction, unusual points, form, strength). Usually in one sentence: There is a strong, positive, linear relationship…
Scatterplot
Each dot represents 2 variables for one individual. In this graph, the “individuals” are married couples.
LSRL
Least-squares regression line. The line of best fit, ŷ=a+bx, used to predict y from x.
Positive Association
As x increases, y increases
As x decreases, y decreases
Positive correlation
Negative Association
As x increases, y decreases
As x decreases, y increases
Negative correlation
What is Correlation?
“r”
Always between -1 and 1.
As the r-value gets closer to 1 (or -1), the correlation becomes stronger.
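The facts on this card (r is always between -1 and 1, and its sign matches the direction of the association) can be checked with a short Python sketch. Both data sets below are invented for illustration, and `correlation` is a hypothetical helper name, not something from the cards.

```python
import math

def correlation(xs, ys):
    """Correlation r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up positive association: more rubber bands, more distance (inches)
bands = [1, 2, 3, 4, 5]
distance = [12.0, 19.5, 27.0, 36.5, 43.0]

# Made-up negative association: more miles (thousands), lower price ($ thousands)
miles = [10, 20, 30, 40, 50]
price = [44, 40, 37, 31, 30]

r_pos = correlation(bands, distance)
r_neg = correlation(miles, price)
print(round(r_pos, 3), round(r_neg, 3))
```

The sign only tells direction. Strength is judged by how close r is to 1 or -1, so a value near -1 is just as strong as a value near 1.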
How do you find r²?
Usually: It’s on the computer printout.
Sometimes: The problem gives you r and you just square it.
Or: The problem gives you the coefficient of determination (that’s just the fancy name for r²).
Coefficient of Determination
r²
Written as a decimal or a percentage.
Interpretation: (r² as a %) of the variation in (y-context) is explained/accounted for by the LSRL.
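As a small worked example of "square r, then interpret," here is a Python sketch. The value r = 0.7 is borrowed from the correlation card, and "distance traveled" is used as the y-context purely as an assumption for illustration.

```python
r = 0.7                      # correlation, as in the earlier card's example
r_squared = r ** 2           # coefficient of determination
percent = round(r_squared * 100)

# Fill the interpretation template with a y-context (assumed here)
sentence = (
    f"About {percent}% of the variation in distance traveled "
    "is explained by the LSRL."
)
print(sentence)
```

Squaring removes the sign, so r = -0.7 gives the same r² as r = 0.7.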
Why is it called a least-squares regression line?
Of all possible lines of fit, it minimizes the sum of the squared residuals. Remember, a residual is the difference between an actual y-value and the predicted y-value (y-values, not x-values).
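The "minimizes the sum of squared residuals" idea can be sketched in Python. This is a minimal illustration under stated assumptions: the data are made up, the function names are hypothetical, and the slope and intercept come from the standard formulas b = Sxy/Sxx and a = ȳ - b·x̄. The sketch then nudges the line in a few directions and confirms each nudged line has a larger sum of squared residuals.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Add up (actual - predicted)^2 over every point."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data: rubber bands vs. distance traveled (inches)
bands = [1, 2, 3, 4, 5]
distance = [12.0, 19.5, 27.0, 36.5, 43.0]

a, b = fit_lsrl(bands, distance)
best = sum_squared_residuals(a, b, bands, distance)

# Shift the intercept or slope a little; every nudged line should fit worse
nudges = [(1.0, 0.0), (-1.0, 0.0), (0.0, 0.5), (0.0, -0.5)]
nudged = [sum_squared_residuals(a + da, b + db, bands, distance)
          for da, db in nudges]
print(best, nudged)
```

Four nudges are not a proof, only a spot check that agrees with the definition on the card.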
How do you find and interpret a residual?
Actual minus predicted OR observed minus expected (y - ŷ).
To get the predicted value, plug the x-value into the LSRL. The actual value has to be given in the problem.
Interpretation: The actual (y-context) for this (specific x-value) was (residual) more/less than predicted.
Ex. The actual distance traveled for Barbie with 5 rubber bands was 1.47 in. more than predicted.
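The steps on this card (plug x into the LSRL, subtract the prediction from the actual value, then interpret) can be sketched in Python. The LSRL coefficients and the actual distance below are invented, chosen only so that the residual matches the card's 1.47 in. example. The function names are hypothetical as well.

```python
def predict(a, b, x):
    """Predicted y from the LSRL y-hat = a + b*x."""
    return a + b * x

def residual(actual, predicted):
    """Residual = actual minus predicted."""
    return actual - predicted

def interpret_residual(res, y_context, x_context, units):
    """Fill in the interpretation template from the card."""
    direction = "more" if res >= 0 else "less"
    return (f"The actual {y_context} for {x_context} was "
            f"{abs(res):.2f} {units} {direction} than predicted.")

# Hypothetical LSRL: y-hat = 3.9 + 7.9x (x = rubber bands, y = inches)
a, b = 3.9, 7.9
actual = 44.87                      # assumed observed distance for 5 bands

predicted = predict(a, b, 5)
res = residual(actual, predicted)
sentence = interpret_residual(res, "distance traveled",
                              "Barbie with 5 rubber bands", "in.")
print(sentence)
```

A positive residual means the point sits above the line (the actual value was more than predicted). A negative residual means it sits below.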
How do you find and interpret the slope of the LSRL?
“b” value in ŷ=a+bx.
Interpret the slope:
For each additional (x-context), the predicted (y-context) (increases/decreases) by (slope).
“For each additional mile driven, the predicted sales price of a truck decreases by $15.” OR “We predict that the sales price of a truck will lose $15 for each additional mile driven.”
How do you find and interpret the y-intercept of the LSRL?
“a” value in ŷ=a+bx
Interpret the y-intercept:
When (x=0 context), the predicted (y-context) is (y-int).
“A truck with 0 miles on it is predicted to sell for $45,000.”
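The slope and y-intercept cards can be tied together with a short Python sketch. It assumes the two truck examples describe the same LSRL, ŷ = 45000 - 15x (the cards do not say this outright), with x in miles driven and ŷ in dollars. The function name `predicted_price` is hypothetical.

```python
# Assumed LSRL built from the two truck examples: y-hat = a + b*x
a = 45000.0   # y-intercept: predicted price at 0 miles ($)
b = -15.0     # slope: change in predicted price per additional mile ($)

def predicted_price(miles):
    """Predicted sales price from the assumed LSRL."""
    return a + b * miles

# y-intercept: plug in x = 0
at_zero = predicted_price(0)

# slope: compare predictions one mile apart
change_per_mile = predicted_price(1001) - predicted_price(1000)

print(at_zero, change_per_mile)
```

Both interpretations are about predicted values, which is why the sentence templates always include the word "predicted."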