Chapter 3 - Scatterplots and Correlation Flashcards

0
Q

-may help explain or predict changes in a response variable;
-goes on the x axis of a graph
ex- car weight and number of cigarettes smoked

A

explanatory variable

1
Q

-measures an outcome of a study;
-goes on the y axis of a graph
ex- accident death rates and life expectancies

A

response variable

2
Q

-shows relationship between two quantitative variables measured on the same individuals;
ex- percent of students taking the SAT & the mean SAT math score

A

scatter plot

3
Q

how to describe a scatter plot

A

Identify DCFS

direction
correlation
form
strength

4
Q

when above average values of one variable tend to accompany above average values of the other and when below average values occur together

A

positive association

5
Q

when above average values of one tend to accompany below average values of the other

A

negative association

6
Q

measures the direction and strength of the linear relationship between two quantitative variables

A

correlation

7
Q

value between -1 and 1

A

correlation (r)
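
As a rough illustration of where r comes from, the sketch below computes it as the average product of standardized x and y values. The cards do not give a formula, so the formula shown is the standard textbook one added here as an assumption, and the data are made up.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of products of standardized values over n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divisor n - 1)
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(round(r, 4))
```

For this data r is positive and lies between -1 and 1, matching a positive association.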

8
Q

r>0

A

positive association

9
Q

r<0

A

negative association

10
Q

interpreting correlation

A
  • correlation makes no distinction between explanatory and response variables
  • r does not change when we change the unit of measurement of x, y, or both
  • correlation does not equal causation
  • both variables need to be quantitative
  • correlation is not resistant: r is affected by outliers
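
Three of the points above can be checked numerically. This is a minimal sketch with made-up data: the helper `correlation` and the extra point (6, -10) are assumptions chosen only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r between two lists of numbers."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_xy = correlation(xs, ys)
r_yx = correlation(ys, xs)                             # swap explanatory and response
r_rescaled = correlation([x * 2.54 for x in xs], ys)   # change the units of x
r_outlier = correlation(xs + [6], ys + [-10])          # add one extreme point

print(r_xy, r_yx, r_rescaled, r_outlier)
```

Swapping the variables or rescaling x leaves r unchanged (up to rounding), while the single added outlier is enough to turn r negative here.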
11
Q

determined by how closely the points in the scatter plot follow a clear form, such as a line

A

strength

12
Q
  • points show a straight line pattern
  • watch out for curved relationships and clusters

A

form

13
Q
  • summarizes the relationship between two variables when one of the variables helps explain or predict the other
  • requires explanatory and response variable
  • describes how variable y changes as variable x changes
  • used to predict value of y for a given value of x
A

regression line

14
Q

regression line

A

y hat = a+bx

15
Q

in y hat=a+bx

y hat is the ____

A

predicted value of the response variable y for a given value of the explanatory variable x

16
Q

in y hat=a+bx

b represents ____

A

slope;

predicted change in y when x increases by 1 unit

17
Q

in y hat=a+bx

a is the ____

A

y intercept;

predicted value of y when x=0
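
The roles of a and b in y hat = a + bx can be sketched in a few lines. The values a = 2.2 and b = 0.6 are hypothetical, picked only to have numbers to plug in.

```python
def predict(a, b, x):
    """Predicted response y hat = a + b*x for intercept a and slope b."""
    return a + b * x

a, b = 2.2, 0.6  # hypothetical intercept and slope

y_hat_at_0 = predict(a, b, 0)                           # equals the intercept a
change_per_unit = predict(a, b, 4) - predict(a, b, 3)   # equals the slope b

print(y_hat_at_0, change_per_unit)
```

The prediction at x = 0 returns a, and raising x by one unit changes the prediction by b.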

18
Q

coefficient of x is always the ____, no matter what symbol is used

A

slope

19
Q

use of a regression line for prediction far outside the interval of values of the explanatory variable used to obtain the line

A

extrapolation
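
One way to guard against extrapolation in practice is to flag any prediction whose x falls outside the range of x values used to fit the line. This is a simplified sketch: the function name, the line (a = 2.2, b = 0.6), and the fitted range 1 to 5 are all hypothetical, and it flags any x outside the range rather than only x values far outside it.

```python
def predict_with_range_check(a, b, x, x_min, x_max):
    """Return (y hat, flag), where flag is True if x lies outside [x_min, x_max]."""
    y_hat = a + b * x
    is_extrapolation = x < x_min or x > x_max
    return y_hat, is_extrapolation

# Hypothetical line fitted on x values from 1 to 5
inside = predict_with_range_check(2.2, 0.6, 3, 1, 5)
outside = predict_with_range_check(2.2, 0.6, 50, 1, 5)

print(inside, outside)
```

The second prediction is still a number, but the flag signals that it should not be trusted.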

20
Q

difference between observed value of the response variable and the value predicted by the regression line;
residual = observed y - predicted y
=y - y hat

A

residual

21
Q

if, after adding up the prediction errors, the positive and negative residuals cancel out, you should ____

A

square the residuals

22
Q
  • y on x
  • line that makes the sum of the squared residuals as small as possible

A

least squares regression line
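
To see what "as small as possible" means, the sketch below compares the sum of squared residuals for three candidate lines on made-up data. The line with a = 2.2 and b = 0.6 is the least squares line for this particular data set; the other two lines are arbitrary alternatives.

```python
def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared residuals for the line y hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ssr_least_squares = sum_squared_residuals(2.2, 0.6, xs, ys)  # least squares line
ssr_steeper = sum_squared_residuals(1.0, 1.0, xs, ys)        # arbitrary alternative
ssr_flat = sum_squared_residuals(4.0, 0.0, xs, ys)           # horizontal line at the mean of y

print(ssr_least_squares, ssr_steeper, ssr_flat)
```

Both alternatives give a larger sum of squared residuals than the least squares line.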

23
Q

the mean of the least squares residuals is always ____

A

zero

24
Q

- scatter plot of the residuals against the explanatory variable
- helps us assess whether a linear model is appropriate

A

residual plot
25
Q

turns the regression line horizontal

A

residual plot
26
Q

how to find the form of a residual plot

A

form of residual plot = form of association - form of model
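
A small numeric sketch of this idea: fit a straight line to data with a curved association and look at what is left over. The data (y = x^2 + x) and the helper `least_squares_line` are assumptions made up for illustration.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

# Made-up data with a curved association
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 + x for x in xs]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print(residuals)
```

The residuals are positive at both ends and negative in the middle. That U-shaped leftover pattern is the curve minus the line, and it suggests a linear model is not appropriate for this data.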
27
Q

- gives the approximate size of a typical prediction error (residual)
- applies when the least squares regression line is used to predict values of a response variable y from an explanatory variable x

A

standard deviation of the residuals (s)
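
A minimal sketch of computing s, assuming the usual textbook formula s = sqrt(sum of squared residuals / (n - 2)). The card does not state the formula, so the n - 2 divisor is an assumption, and the data and line (a = 2.2, b = 0.6, the least squares line for this data) are made up.

```python
from math import sqrt

# Made-up data and its least squares line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
n = len(xs)

# Assumed convention: divide the sum of squared residuals by n - 2
s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(round(s, 3))
```

Here s comes out a little under 0.9, so a typical prediction from this line misses the observed y by about that much.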
28
Q

- coefficient of determination
- measures how well the least squares regression line predicts values of the response variable y
- fraction of the variation in the values of y that is accounted for by the least squares regression line of y on x

A

r^2
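
The "fraction of the variation" reading can be sketched as 1 minus the ratio of the sum of squared residuals to the total squared variation of y about its mean. The data and line (a = 2.2, b = 0.6, the least squares line for this data) are made up, and the variable names are hypothetical.

```python
from statistics import mean

# Made-up data and its least squares line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

y_bar = mean(ys)
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # variation left unexplained
ss_total = sum((y - y_bar) ** 2 for y in ys)                       # total variation in y

r_squared = 1 - ss_residual / ss_total

print(round(r_squared, 3))
```

For this data the line accounts for about 60% of the variation in y.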
29
Q

relationship between the standard deviation of the residuals (s) and the coefficient of determination r^2

A

- both are calculated from the sum of squared residuals
- both assess how well the line fits the data
30
Q

how to calculate least squares regression line

A

- calculate the means x bar and y bar, the standard deviations sx and sy, and the correlation r
- slope: b = r(sy/sx)
- intercept: a = y bar - b(x bar)
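
The calculation can be sketched directly from the summary statistics. The slope formula b = r(sy/sx) is from the card; the intercept formula a = y bar - b(x bar) is its standard companion. The helper names and the data are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r between two lists of numbers."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

def least_squares_from_summary(xs, ys):
    """Return (a, b) using b = r * sy / sx and a = y_bar - b * x_bar."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    r = correlation(xs, ys)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_from_summary(xs, ys)
print(round(a, 3), round(b, 3))
```

For this data the line is approximately y hat = 2.2 + 0.6x.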
31
Q

when the correlation isn't r=1 or -1, the predicted value of y is closer to its mean y bar than the value of x is to its mean x bar (measured in standard deviations)

A

regression to the mean;

values of y "regress" to their mean
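
Regression to the mean can be checked numerically: on the least squares line, an x value that sits z standard deviations from x bar gives a predicted y that sits only r times z standard deviations from y bar. The data below are made up, and the choice of an x two standard deviations above the mean is arbitrary.

```python
from statistics import mean, stdev

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x          # slope of the least squares line
a = y_bar - b * x_bar      # intercept of the least squares line

x = x_bar + 2 * s_x        # an x value two standard deviations above its mean
y_hat = a + b * x

z_x = (x - x_bar) / s_x            # distance of x from its mean, in standard deviations
z_y_hat = (y_hat - y_bar) / s_y    # distance of the prediction from y bar, in standard deviations

print(round(z_x, 3), round(z_y_hat, 3))
```

Because r is strictly between -1 and 1 here, the prediction lands closer to y bar than x is to x bar.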
32
Q

least squares regression lines are not ____

A

resistant
33
Q

points that are outliers in the __ direction but not the __ direction of a scatter plot have large residuals

A

y, x
34
Q

an observation is ____ for a statistical calculation if removing it would markedly change the result of the calculation

A

influential
35
Q

how to verify that a point is influential

A

find the regression line both with and without the questionable point and compare the results
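
The with-and-without check might look like the sketch below. The data, the questionable point (20, 0), and the helper `least_squares_line` are made up for illustration.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

# Made-up data plus one questionable point far out in the x direction
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
suspect_x, suspect_y = 20, 0

a_without, b_without = least_squares_line(xs, ys)
a_with, b_with = least_squares_line(xs + [suspect_x], ys + [suspect_y])

print(round(b_without, 3), round(b_with, 3))
```

Including the point flips the slope from positive to negative, so in this made-up example the point is influential.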