3 Scatterplots and Regression Flashcards

Question 1

Q

Interpreting Correlation

Answer

A

Correlation is just r. It gives the direction and strength of a linear relationship. Ex. r = 0.7 means there is a fairly strong, positive linear relationship between the variables. NOTE: r should only be used when the data are roughly linear, but knowing r alone (even if it is high) doesn’t guarantee linearity.

Question 2

Q

Explanatory Variable

Answer

A

x, input, independent variable. Usually, this is what is changed. In an experiment, the treatments are the explanatory variable. Ex. The number of rubber bands explains the distance traveled.

Question 3

Q

Response Variable

Answer

A

y, output, dependent variable. This is usually what is measured as a result of changes in the explanatory variable. Ex. We added rubber bands and measured the distance traveled (the response variable).

Question 4

Q

Describe a relationship or scatterplot

Answer

A

DUFS (direction, unusual points, form, strength). Usually in one sentence: There is a strong, positive, linear relationship…

Question 5

Q

Scatterplot

Answer

A

Each dot represents 2 variables for one individual. In this graph, the “individuals” are married couples.

Question 6

Q

LSRL

Question 7

Q

Positive Association

Answer

A

As x increases, y increases

As x decreases, y decreases

Positive correlation

Question 8

Q

Negative Association

Answer

A

As x increases, y decreases

As x decreases, y increases

Negative correlation

Question 9

Q

What is Correlation?

Answer

A

“r”
Always between -1 and 1.
As the r-value gets closer to 1 (or -1), the correlation becomes stronger.
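As a rough illustration of where r comes from, the sketch below computes it as the average product of z-scores. The data values and the function name `correlation` are made up for this example.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r, computed as the average product of z-scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]        # tends to rise with x
y_down = [-v for v in y_up]   # same pattern, flipped

r_up = correlation(x, y_up)
r_down = correlation(x, y_down)
print(round(r_up, 3), round(r_down, 3))  # same strength, opposite direction
```

Flipping the direction of the relationship changes only the sign of r, not its size.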

Question 10

Q

How do you find r²?

Answer

A

Usually: it’s on the computer printout.
Sometimes: the problem gives you r and you just square it.
Or: the problem gives you the coefficient of determination (that’s just the fancy name for r²).

Question 11

Q

Coefficient of Determination

Answer

A

r²
written as a decimal/percentage
Interpretation: (r² as a %) of the variation in (y-context) is explained/accounted for by the LSRL relating (y-context) to (x-context).
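A minimal sketch of what r² measures, using made-up data: it compares the squared residuals left over after fitting the LSRL with the total squared variation of y around its mean. The helper name `fit_lsrl` is an assumption for this example.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_lsrl(x, y)
mean_y = sum(y) / len(y)

# Variation left over after using the line vs. total variation in y.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - sse / sst

print(f"{r_squared:.0%} of the variation in y is accounted for by the LSRL")
```

For this data r is about 0.775, and squaring it gives the same value of about 0.6.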

Question 12

Q

Why is it called a least-squares regression line?

Answer

A

Of all possible lines of fit, it minimizes the sum of the squared residuals. Remember, a residual is the difference between an actual y-value and the predicted y-value (not the x-values).
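To see the "least squares" idea in action, the sketch below compares the sum of squared residuals for the LSRL against a few other candidate lines. The data and the candidate lines are made up for illustration.

```python
def sum_squared_residuals(a, b, xs, ys):
    """Sum of (actual - predicted)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# LSRL slope and intercept from the standard formulas.
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

best = sum_squared_residuals(a, b, x, y)

# A few other lines: shifted up/down, tilted, or just eyeballed.
candidates = [(a + 0.5, b), (a - 0.5, b), (a, b + 0.2), (a, b - 0.2), (2.0, 0.7)]
others = [sum_squared_residuals(ca, cb, x, y) for ca, cb in candidates]

print("LSRL:", round(best, 2))
print("Other lines:", [round(v, 2) for v in others])
```

Every other line gives a larger total than the LSRL does.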

Question 13

Q

How do you find and interpret a residual?

Answer

A

Actual minus predicted OR observed - expected (y-ŷ)
To get the predicted, you plug the x-value into the LSRL. They have to give you the actual value.
Interpretation: The actual (y-context) for this (specific x-value) was (residual) more/less than predicted.
Ex. The actual distance traveled for Barbie with 5 rubber bands was 1.47 in. more than predicted.
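A small sketch of the residual calculation. The LSRL coefficients and the actual distance below are hypothetical, chosen only so the result matches the 1.47 in. in the example sentence; the card does not give the real line.

```python
def predict(x, a, b):
    """Predicted y from the line y-hat = a + b*x."""
    return a + b * x

# Hypothetical LSRL: predicted distance (in.) = 10 + 8 * (number of rubber bands)
a, b = 10.0, 8.0

rubber_bands = 5
actual_distance = 51.47  # hypothetical observed value; must be given in the problem

predicted = predict(rubber_bands, a, b)
residual = actual_distance - predicted  # actual minus predicted

direction = "more" if residual > 0 else "less"
print(f"The actual distance traveled with {rubber_bands} rubber bands was "
      f"{abs(residual):.2f} in. {direction} than predicted.")
```

A positive residual means the point sits above the line; a negative one means it sits below.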

Question 14

Q

How do you find and interpret the slope of the LSRL?

Answer

A

“b” value in ŷ=a+bx.
Interpret the slope:
For each additional (x-context) the predicted (y-context) (increases/decreases) by (slope).
“For each additional mile driven, the predicted sales price of a truck decreases by $15.” OR “We predict that the sales price of a truck will lose $15 for each additional mile driven.”

Question 15

Q

How do you find and interpret the y-intercept of the LSRL?

Answer

A

“a” value in ŷ=a+bx
Interpret the y-intercept:
When (x=0 context) the predicted (y-context) is (y-int).
“A truck with 0 miles on it is predicted to sell for $45,000”
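The truck examples can be sketched as one line, assuming the slope of -$15 per mile and the y-intercept of $45,000 from these cards belong to the same LSRL. The function name `predicted_price` is made up for this example.

```python
A = 45_000  # y-intercept "a": predicted price at 0 miles (from the example)
B = -15     # slope "b": change in predicted price per additional mile

def predicted_price(miles):
    """Predicted sales price from y-hat = a + b*x."""
    return A + B * miles

# y-intercept: the prediction when x = 0.
print(predicted_price(0))

# Slope: the change in the prediction for one additional mile.
print(predicted_price(101) - predicted_price(100))
```

The slope is the same no matter which two x-values one mile apart are compared.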

Question 16

Q

What is a residual plot?

Answer

A

A plot representing the x values and residual values (y-ŷ).
It’s like you just take the LSRL and make it horizontal and zoom in a little on all of the differences.

Question 17

Q

How do you tell if a linear model is appropriate?

Answer

A

If given a residual plot: if there’s no pattern in the residuals, a linear model is appropriate. If there is a clear pattern in the residual plot (a curve, or points that are mostly positive on one side and mostly negative on the other), a linear model is not appropriate.
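A sketch of what a patterned residual plot looks like in numbers: a line is fitted to made-up data that actually follow a curve (y = x²), and the residuals are listed in order of x.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

# Made-up curved data: y = x squared.
x = [0, 1, 2, 3, 4, 5, 6]
y = [v ** 2 for v in x]

a, b = fit_lsrl(x, y)

# Residual = actual - predicted, one per x-value (these are the plotted heights).
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print([round(res, 2) for res in residuals])
```

The residuals start positive, dip negative in the middle, and end positive again. That U shape is a clear pattern, so a linear model would not be appropriate for this data.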

Question 18

Q

How do you find and interpret S (the standard deviation of the residuals)?

Answer

A

It has to be given in the problem (usually in the lower left of a computer printout).
Interpretation:
“The actual (y-context) is typically about (s units) away from the number predicted by the LSRL.”
OR “When using the LSRL to predict (y-context) we will typically be off by (s units).”
“The actual sales price of a truck is typically about $1,000 away from the price predicted by the LSRL relating sales price and miles driven” OR “When predicting sales price from miles driven, we will typically be off by $1,000”
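For reference, the s on a printout is computed from the residuals: the square root of the sum of squared residuals divided by n - 2. A minimal sketch on made-up data:

```python
import math

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_lsrl(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standard deviation of the residuals: divide by n - 2, then take the square root.
s = math.sqrt(sum(res ** 2 for res in residuals) / (len(x) - 2))
print(f"Predictions from this LSRL are typically off by about {s:.2f} units of y.")
```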

Question 19

Q

How do you read a computer printout for LSRL problems?

Answer

A

Question 20

Q

How do you answer a question that asks about how confident you are about a prediction?

Answer

A

First, if the point you are predicting is far from the data, this is extrapolation and we shouldn’t be confident.
Otherwise, if the fit is strong (r close to 1 or -1, so r² close to 1), you can be reasonably confident in your predictions.

Question 21

Q

If you switch the x and y (explanatory and response) what would change?

Answer

A

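The correlation and r² would stay the same, and so would the standard deviations of x and of y themselves, but the LSRL will be completely different. If the slope was positive it will still be positive, but it won’t be the same number. The y-int will change too, and so will s (the standard deviation of the residuals), because the residuals are now measured in the other variable.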
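A quick check on made-up data, fitting the line both ways. The helper name `regress` is an assumption for this sketch. Note that s here is the standard deviation of the residuals, which does differ between the two fits; the standard deviations of x and y themselves do not change.

```python
import math

def regress(xs, ys):
    """Fit y-hat = a + b*x; return (a, b, r, s)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    return a, b, r, s

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a1, b1, r1, s1 = regress(x, y)  # x explanatory, y response
a2, b2, r2, s2 = regress(y, x)  # roles switched

print("r:        ", round(r1, 3), round(r2, 3))
print("slope:    ", round(b1, 3), round(b2, 3))
print("intercept:", round(a1, 3), round(a2, 3))
print("s:        ", round(s1, 3), round(s2, 3))
```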

Question 22

Q

In regression, what is the difference between an outlier and a high-leverage point?

Answer

A

An outlier has a large residual (it sits far above or below the line). A high-leverage point has an x-value far from the rest of the data, to the right or left.
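A sketch contrasting the two, using made-up points. Five points lie exactly on y = 2x; one extra point is added each time. The high-leverage point here was chosen to fall on the same line, so its residual is essentially zero, which will not always be the case.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

def describe_last_point(points):
    """Return (residual, x-distance from the other points' mean x) for the last point."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    a, b = fit_lsrl(xs, ys)
    x_last, y_last = points[-1]
    residual = y_last - (a + b * x_last)
    mean_other_x = sum(xs[:-1]) / (len(xs) - 1)
    return residual, abs(x_last - mean_other_x)

base = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]  # made-up data on y = 2x

outlier_res, outlier_dist = describe_last_point(base + [(3, 12)])     # far above the line
leverage_res, leverage_dist = describe_last_point(base + [(20, 40)])  # far to the right

print("outlier:      residual", round(outlier_res, 2), "x-distance", outlier_dist)
print("high leverage: residual", round(leverage_res, 2), "x-distance", leverage_dist)
```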

(22 cards)