Reading Quiz 3 Flashcards

1
Q

explanatory variable

A

independent
x
the term still applies even if the explanatory variable doesn’t cause the response

2
Q

response variable

A

dependent

y

3
Q

scatterplots analyzed according to

A

direction
form
strength of relationship
outliers

4
Q

direction

A

positive association, negative association, neither

5
Q

form

A

clusters of points, linear pattern, etc

need to say if linear or not

6
Q

strength of the relationship

A

how close to a straight line do the points appear to be

7
Q

outliers

A

points that don’t follow the general pattern of the data

8
Q

correlation coefficient

A

measures direction and strength of linear relationship between two quantitative variables
r

9
Q

notes about correlation coefficient number

A

always between -1 and 1 (-1 ≤ r ≤ 1)

10
Q

what does correlation coefficient measure

A

existence and strength of linear relationships

if r = 0 there is no linear relationship, but another kind of relationship could still exist
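
A small numeric check can make this concrete. The sketch below is in Python; the helper name `pearson_r` and the five data points are made up for illustration and are not part of the original cards. The points lie exactly on the curve y = x^2, yet the correlation coefficient comes out as zero because the relationship is not linear.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for paired quantitative data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: a perfect curved (quadratic) relationship, symmetric about x = 0
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)  # r is 0 here even though y is completely determined by x
```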

11
Q

is formula for correlation coefficient sensitive to outliers

A

extremely sensitive
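
To see how sensitive r can be, the Python sketch below compares r for five perfectly linear points with r after one extra point is added far from the pattern. The helper `pearson_r` and all data values are assumptions chosen for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for paired quantitative data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Five points exactly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_clean = pearson_r(xs, ys)

# Same data plus a single made-up outlier at (6, 0)
r_with_outlier = pearson_r(xs + [6], ys + [0])

print(r_clean, r_with_outlier)  # one point drags r from 1 down to about 0.14
```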

12
Q

does correlation coefficient have units

A

no

13
Q

does the correlation coefficient change based on which variable is explanatory and which is response

A

no

it is the same regardless of which variable you consider to be the explanatory and which you consider to be the response
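
A quick way to check this is to compute r both ways. In the Python sketch below, `pearson_r` and the height and weight values are hypothetical. It also converts the heights from centimeters to inches to illustrate that r has no units.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for paired quantitative data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up measurements on the same five individuals
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 75, 88]

r_xy = pearson_r(heights_cm, weights_kg)   # height treated as explanatory
r_yx = pearson_r(weights_kg, heights_cm)   # weight treated as explanatory

# Changing units rescales the data but leaves r unchanged
heights_in = [h / 2.54 for h in heights_cm]
r_inches = pearson_r(heights_in, weights_kg)
```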

14
Q

extrapolation

A

use of a regression line for prediction outside the range of values of the explanatory variable x used to obtain the line
often not accurate
bad

15
Q

least squares regression line

A

line that makes the sum of the squares of the residuals as small as possible

16
Q

sum of the squares of the residuals

A

error sum of squares (SSE)

17
Q

formula for least squares regression line

A

yhat = a + bx
b = slope = r(sy/sx)
a = y-intercept = (mean of y) - b(mean of x)

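
The formulas above can be sketched in Python as follows. The function name `lsrl` and the five data points are assumptions made for illustration. The last lines also check that the fitted line passes through (mean of x, mean of y).

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b) for yhat = a + b*x using b = r(sy/sx) (hypothetical helper)."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)          # sample standard deviations
    # Correlation coefficient from the standardized products
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x                        # slope
    a = mean_y - b * mean_x                  # y-intercept
    return a, b

# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = lsrl(xs, ys)
print(a, b)  # roughly a = 2.2, b = 0.6

# The point (mean of x, mean of y) lies on the line
predicted_at_mean_x = a + b * mean(xs)
```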
18
Q

what point is on every regression line

A

(mean of x, mean of y)

19
Q

residual

A

y - yhat
observed value of y minus predicted value of y
positive = above regression line
negative = below regression line
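
As a rough illustration, the Python sketch below fits a line to five made-up points and lists each residual. The data and variable names are assumptions. The slope is computed as the sum of cross-deviations divided by the sum of squared x-deviations, which is algebraically the same as r(sy/sx).

```python
# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

# residual = observed y minus predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, y, res in zip(xs, ys, residuals):
    side = "above" if res > 0 else "below"
    print(f"({x}, {y}): residual {res:+.2f}, point is {side} the line")

# The residuals of a least squares line add up to zero (up to rounding)
residual_total = sum(residuals)
```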

20
Q

coefficient of determination

A

r^2
measures the proportion of the variation in y that is explained by y’s linear association with x
higher means the LSRL fits better

21
Q

sentence for coefficient of determination

A

this means that X% of the variation in Y (y) is explained by the linear relationship with X (x)
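
One way to check the interpretation is to compute r^2 two ways: by squaring r, and as 1 minus SSE divided by the total variation in y. The Python sketch below does both on made-up data; all names and values are assumptions for illustration.

```python
from math import sqrt

# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y

r = sxy / sqrt(sxx * syy)

# Least squares line and its error sum of squares
b = sxy / sxx
a = mean_y - b * mean_x
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / syy                  # share of variation explained by the line

sentence = (f"{100 * r_squared:.0f}% of the variation in y is explained "
            "by the linear relationship with x")
print(sentence)
```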

22
Q

residual plot

A

graphs the residuals on the vertical axis and either explanatory, response, or predicted response values on the horizontal axis

23
Q

residuals from a LSRL always have a mean of

A

0

24
Q

how data fits residual plots

A

good if points are scattered evenly and close to the horizontal axis, with no clear pattern
bad if plot is curved (relationship is not linear)
bad if values fan out (possible outliers; predictions are less accurate on the fanned side)

25
outlier
observation that lies outside the overall pattern of the other observations in a scatterplot
can be an outlier in the x direction, the y direction, or both
26
influential
an observation is influential if removing it would markedly change the position of the regression line
points that are outliers in the x direction are often influential
27
are the correlation coefficient (r) and the LSRL resistant
no
28
scatterplot
displays the relationship between two quantitative variables measured on the same individuals
29
least squares regression line
line of best fit
only used when one variable helps explain or predict the other
can be used to predict a y value given an x
minimizes the sum of the squared residuals
30
clusters
in a scatterplot, values that fall in two or more groups separated by gaps
31
if the y values vary more widely for any given x value, the relationship is
less strong
32
three other scatterplot guidelines
1. make the intervals uniform
2. label both axes
3. choose a scale that makes the graph big enough
33
how to add categorical variable into graph
use different colors or symbols
34
which measures of center and spread do you use with the correlation coefficient
mean and standard deviation
35
true or false: in a regression line you get the same numbers (slopes and intercepts) no matter which variable is considered explanatory and which is considered response
false
the change in y per unit of x is different from the change in x per unit of y
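
A short Python sketch can illustrate why the answer is false. It fits the least squares line twice on the same made-up data, once predicting y from x and once predicting x from y. The helper `fit_line` and the data are assumptions for illustration.

```python
def fit_line(explanatory, response):
    """Least squares (intercept, slope) for predicting response from explanatory."""
    n = len(explanatory)
    mean_e = sum(explanatory) / n
    mean_r = sum(response) / n
    slope = (sum((e - mean_e) * (r - mean_r) for e, r in zip(explanatory, response))
             / sum((e - mean_e) ** 2 for e in explanatory))
    intercept = mean_r - slope * mean_e
    return intercept, slope

# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a_y_on_x, b_y_on_x = fit_line(xs, ys)   # x explanatory, y response
a_x_on_y, b_x_on_y = fit_line(ys, xs)   # roles swapped

print(a_y_on_x, b_y_on_x)
print(a_x_on_y, b_x_on_y)

# If swapping roles gave the same line, the second slope would be 1 / b_y_on_x
slope_if_same_line = 1 / b_y_on_x
```

The two slopes match the "same line" value only when the points fall exactly on a line (r = 1 or r = -1).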
36
for the LSRL, the sum of which squares is being minimized?
squares of errors for each data point (residuals)
37
SSE
sum of the squares of the deviations of the actual y values from the predicted y values (residuals)
38
true or false: the slope of the regression line tells how many unstandardized units the predicted value of y changes for each unstandardized unit change in x
true
39
true or false: the correlation coefficient tells how many standard deviations the predicted y changes for each standard deviation change in x
true
40
true or false: if both of two variables x and y are standardized so that the sd of both is 1 then the slope of the regression line and the correlation coefficient are equal
true
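
This can be checked numerically. The Python sketch below standardizes made-up x and y values into z-scores, fits the least squares slope on the standardized data, and compares it with r. The helper names and data are assumptions for illustration.

```python
from statistics import mean, stdev

def ls_slope(xs, ys):
    """Least squares slope for predicting ys from xs (hypothetical helper)."""
    mean_x, mean_y = mean(xs), mean(ys)
    return (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))

def standardize(values):
    """Convert values to z-scores: mean 0 and standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Illustrative data only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope_raw = ls_slope(xs, ys)                     # in original units
r = slope_raw * stdev(xs) / stdev(ys)            # rearranged from b = r(sy/sx)

slope_standardized = ls_slope(standardize(xs), standardize(ys))

print(slope_raw, r, slope_standardized)  # the last two agree
```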
41
true or false: the LSRL is the line that minimizes the sum of the squares of the residuals
true
42
why can't all values on residual plot be positive
the mean of the least squares residuals is zero, so if some residuals are positive, others must be negative
43
does an influential point necessarily have a large residual
no