Reading Quiz 3 Flashcards
explanatory variable
independent
x
the terms still apply even if the explanatory variable doesn't cause the response
response variable
dependent
y
scatterplots analyzed according to
direction
form
strength of relationship
outliers
direction
positive association, negative association, neither
form
clusters of points, linear pattern, etc
need to say if linear or not
strength of the relationship
how close to a straight line do the points appear to be
outliers
points that don’t follow the general pattern of the data
correlation coefficient
measures direction and strength of linear relationship between two quantitative variables
r
notes about correlation coefficient number
always between -1 and 1 (-1 ≤ r ≤ 1)
what does correlation coefficient measure
existence and strength of linear relationships
if r = 0 there is no linear relationship, but another (nonlinear) relationship could still exist
is formula for correlation coefficient sensitive to outliers
extremely sensitive
does correlation coefficient have units
no
does correlation coefficient change based on which variable is explanatory and which is response
no
it is the same regardless of which variable you consider to be the explanatory and which you consider to be the response
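A minimal Python sketch of the correlation coefficient facts above, assuming the usual formula r = (1/(n-1)) * sum of (z-score of x)(z-score of y). The function name `correlation` and the sample data are made up for illustration.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # sample standard deviations (n - 1 divisor)
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return sum(((x - mean_x) / sx) * ((y - mean_y) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

# hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)                              # about 0.77
r_swapped = correlation(ys, xs)                      # same r with x and y swapped
r_rescaled = correlation([100 * x for x in xs], ys)  # same r after changing units of x
r_outlier = correlation(xs + [20], ys + [0])         # one outlier flips the sign here

print(r, r_swapped, r_rescaled, r_outlier)
```

Swapping the variables or changing units leaves r unchanged, while a single added point far from the pattern changes it a lot.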
extrapolation
use of a regression line for prediction outside the range of values of the explanatory variable x used to obtain the line
often not accurate
bad
least squares regression line
line that makes the sum of the squares of the residuals as small as possible
sum of the squares of the residuals
error sum of squares
formula for least squares regression line
yhat = a + bx
b = slope = r(sy/sx)
a = y-intercept = (mean of y) - (slope)(mean of x)
what point is on every regression line
(mean of x, mean of y)
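A short Python sketch of the slope and intercept formulas, with a check that (mean of x, mean of y) lies on the line. The function name `lsrl` and the data are hypothetical, chosen only to illustrate the calculation.

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b) for yhat = a + b*x, using b = r(sy/sx) and a = ybar - b*xbar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    b = r * sy / sx          # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

# hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = lsrl(xs, ys)               # roughly a = 2.2, b = 0.6
y_hat_at_mean = a + b * mean(xs)  # prediction at the mean of x

print(a, b, y_hat_at_mean, mean(ys))
```

The prediction at the mean of x matches the mean of y (up to rounding), which is why that point is on every least squares regression line.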
residual
y - yhat
observed value of y minus predicted value of y
positive = above regression line
negative = below regression line
coefficient of determination
r^2
measures the proportion of the variation in y that is explained by y's linear association with x
higher means the LSRL fits the data better
sentence for coefficient of determination
this means that ___% of the variation in [response variable] (y) is explained by the linear relationship with [explanatory variable] (x)
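One way to see the coefficient of determination in numbers, sketched in Python with made-up data. It assumes the standard identity r^2 = 1 - (error sum of squares)/(total sum of squares of y) and checks that this matches squaring r directly. All variable names are illustrative.

```python
# hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxx = sum((x - x_bar) ** 2 for x in xs)                       # variation in x
syy = sum((y - y_bar) ** 2 for y in ys)                       # total variation in y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b = sxy / sxx          # slope, equal to r(sy/sx)
a = y_bar - b * x_bar  # y-intercept

# error sum of squares: sum of squared residuals from the LSRL
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r = sxy / (sxx * syy) ** 0.5
r_squared = 1 - sse / syy  # share of the variation in y explained by the line

print(f"about {100 * r_squared:.0f}% of the variation in y is explained "
      "by the linear relationship with x")
```

For this data r^2 comes out near 0.6, so the sentence would read "about 60% of the variation in y is explained by the linear relationship with x."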
residual plot
graphs the residuals on the vertical axis and either explanatory, response, or predicted response values on the horizontal axis
residuals from a LSRL always have a mean of
0
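A quick Python check of the residual facts, using made-up data: each residual is observed y minus predicted y, its sign tells whether the point is above or below the line, and the residuals from a least squares line average to 0. The names below are illustrative only.

```python
# hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# least squares slope and intercept
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / n

for x, y, e in zip(xs, ys, residuals):
    side = "above" if e > 0 else "below"
    print(f"x={x}, y={y}, residual={e:+.2f} ({side} the line)")
print("mean of residuals (0 up to rounding):", mean_residual)
```

Plotting these residuals against x gives the residual plot: points scattered around the horizontal axis with no pattern suggest the line is a reasonable fit.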
how data fits residual plots
good if points are scattered evenly and close to the horizontal axis, with no clear pattern
bad if plot is curved (not linear)
bad if values fan out (spread of the residuals changes across the plot, so predictions are less accurate on the fanned-out side)