DECK 4 UNIT 2 REGRESSION STUFF Flashcards
association or correlation?
association is the general idea of a relationship between two variables. correlation is an actual calculated number (r) that measures the strength and direction of a LINEAR association.
How do you describe the association in a scatterplot?
DIRECTION, FORM, STRENGTH (and strange stuff, like outliers or clusters)
direction?
positive or negative
form?
straight, curved or zig zag
strength?
give the r value (if straight), or say “tightly packed” or “loosely packed”
does correlation mean causation?
NO WAY DUUUUUDE
does high r squared mean a good model?
not alone. you should check your plot and residuals to make sure the model is appropriate and no outliers are present. then it means something
does high r value mean anything?
NOT IF IT ISN’T LINEAR or if there are outliers. LOOK AT THE DATA. If it is straight with no outliers, THEN IT MEANS AN AWFUL LOT
How is r calculated?
r = sum(Zx * Zy) / (n - 1). kind of like the average sized rectangle on the standardized axes
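A quick sketch of that formula in Python, assuming sample standard deviations (the n - 1 kind). The function name `correlation` and the data are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    # Standardize each variable (z-scores), multiply the pairs,
    # add them up, then divide by n - 1.
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample st devs (n - 1 divisor)
    products = [((x - x_bar) / sx) * ((y - y_bar) / sy)
                for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))  # 0.7746
```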
how can you check for “straight enough”?
residuals plot, fool! you want random scatter with no pattern.
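A small sketch of what a residuals check can catch, assuming a least-squares line fit to deliberately curved, made-up data (y = x squared). The helper `fit_line` is a hypothetical name.

```python
from statistics import mean

def fit_line(xs, ys):
    # Least-squares slope and intercept.
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # made-up, deliberately curved data
slope, intercept = fit_line(xs, ys)

# residual = actual y minus predicted y
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals go positive, negative, positive: a U shape, so a line is not “straight enough” here, even though r for this data is about 0.98.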
how do you interpret slope?
ON AVERAGE, for an increase of 1 [unit of x] there is an (increase/decrease) of [SLOPE] [units of y]
how do you interpret y intercept?
if there were no [x stuff] the model predicts you’d have this much [y stuff]. USE UNITS FROM CONTEXT.
how do you interpret the slope EQUATION?
slope = r sy / 1 sx means that for each increase of 1 st dev in the x direction, the prediction goes r st devs in the y direction. SO, think “r sy for each 1 sx”
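A sketch of “r sy for each 1 sx” with made-up data. The intercept line uses the fact that the least-squares line passes through (x bar, y bar); variable names are just for illustration.

```python
from statistics import mean, stdev

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)
n = len(xs)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

slope = r * sy / sx                # "r sy for each 1 sx"
intercept = y_bar - slope * x_bar  # line goes through (x bar, y bar)

# Step 1 st dev to the right in x: the prediction moves r st devs in y.
step_in_y = slope * sx
print(round(slope, 4), round(intercept, 4))  # 0.6 2.2
```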
if you add to or mult/divide the x’s or y’s (shift/scale) does r change?
no. the strength remains the same. if you mult or div one variable by a negative the sign will change but it will still have the same strength. (If you log or square it, it will change, but just adding or multiplying won’t change it)
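A quick check of this with made-up data, as a sketch. `correlation` is a hypothetical helper using the z-score formula.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_shift_scale = correlation([2 * x + 10 for x in xs], ys)  # shift and scale x
r_negated = correlation([-3 * x for x in xs], ys)          # scale x by a negative
r_squared_x = correlation([x ** 2 for x in xs], ys)        # square x (not linear)

print(round(r, 4), round(r_shift_scale, 4), round(r_negated, 4), round(r_squared_x, 4))
```

The first two values match, the third is the same size with the sign flipped, and the last one is different because squaring is not a shift or a scale.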
if you switch x and y does r change?
NO. r stays exactly the same (strength and sign).
if you switch x and y will slope change?
YES. slope is r sy/sx, and after the switch it becomes r sx/sy. to get the new slope you do: (r squared)/old slope
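A sketch that checks the (r squared)/old slope shortcut against a direct refit with x and y switched. Data and names are made up for illustration.

```python
from statistics import mean, stdev

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)
n = len(xs)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

old_slope = r * sy / sx       # predicting y from x
new_slope = r * sx / sy       # predicting x from y (roles switched)
shortcut = r ** 2 / old_slope # the flashcard shortcut

print(round(old_slope, 4), round(new_slope, 4), round(shortcut, 4))  # 0.6 1.0 1.0
```

Note the new slope is not 1/old slope unless r is exactly 1 or -1.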
interpret r squared
[r squared, as a %] of the variability in y can be explained by the model (with x stuff). The rest is in the residuals.
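One way to see “the rest is in the residuals”, sketched with made-up data: r squared equals 1 minus (leftover variation in the residuals) / (total variation in y) for a least-squares line.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * x for x in xs]
sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # variation left in residuals
sst = sum((y - y_bar) ** 2 for y in ys)                 # total variation in y

r_squared = 1 - sse / sst
print(round(r_squared, 4))  # 0.6
```

Here you would say 60% of the variability in y is explained by the model, and 40% is left in the residuals.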
is r sensitive to outliers?
yes. A single outlier (especially one far out in the x direction) can make it seem like there is a relationship, or hide one that is really there.
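A sketch of one point faking a relationship. The data is made up: five points with no linear association, then one point added far out in the x direction.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Made-up cloud with no linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 1, 3]
r_without = correlation(xs, ys)

# Same cloud plus a single far-out point at (20, 20).
r_with = correlation(xs + [20], ys + [20])

print(round(r_without, 2), round(r_with, 2))
```

r goes from about 0 to above 0.9 because of one point, which is why you LOOK AT THE DATA first.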
Look for lurking variables?
think hot chocolate sales at a ski mountain and ski accidents: strong positive relationship, but why are they related??? (lurking variable: weather)
outliers in regression?
a point that doesn’t follow the “flow” of the rest of the data
what about using your calculator to fit curves to curved data?
QuadReg, CubicReg, LnReg, etc. just be careful when substituting the values while writing the model’s equation.
what does “regression to the mean” mean?
predictions for y are closer to the mean y (y bar) than the actual x is to the mean x (measured in st devs), because the predicted z for y is r times the z for x.
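A sketch of regression to the mean in z-score terms, using made-up data and a hypothetical new x value `x_new`: the predicted y sits r times as many st devs from y bar as x sits from x bar.

```python
from statistics import mean, stdev

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)
n = len(xs)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
slope = r * sy / sx
intercept = y_bar - slope * x_bar

x_new = 5                          # hypothetical x to predict from
z_x = (x_new - x_bar) / sx         # how many st devs x is from x bar
y_hat = intercept + slope * x_new  # the model's prediction
z_y_hat = (y_hat - y_bar) / sy     # how many st devs the prediction is from y bar

print(round(z_x, 3), round(z_y_hat, 3), round(r * z_x, 3))
```

The last two numbers match, and both are smaller than the first whenever r is strictly between -1 and 1.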
what does influential mean?
It means that the point, when added to or removed from the data, will change the SLOPE. Generally these are outliers in the x direction: far left or far right.
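A sketch comparing two made-up extra points: one far out in the x direction and one with an unusual y sitting right at x bar. `fit_slope` is a hypothetical helper for the least-squares slope.

```python
from statistics import mean

def fit_slope(xs, ys):
    # Least-squares slope.
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

slope_original = fit_slope(xs, ys)
slope_far_x = fit_slope(xs + [20], ys + [0])  # extra point far right in x
slope_mid_x = fit_slope(xs + [3], ys + [12])  # extra point: odd y, but at x bar

print(round(slope_original, 3), round(slope_far_x, 3), round(slope_mid_x, 3))
```

With this data the far-right point flips the slope from positive to negative (influential), while the odd point at x bar leaves the slope where it was, even though its residual is large.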
what is a linear model?
it is an equation you can use, or a line on a graph, but it is just a model that says what kind of happens, and can be used to ESTIMATE WHAT MIGHT HAPPEN