UNIT 2 2017 Flashcards
is r sensitive to outliers?
Yes. A single outlier can make it seem like there is a relationship (if it is way out in the x direction), or make it seem like there is no relationship.
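A quick sketch of how much one point can do. The data here are made up for illustration, and `corr` is just a helper name chosen for this example: five points with no linear relationship, then the same five plus one point far out in the x direction.

```python
from statistics import mean

def corr(xs, ys):
    """Pearson r from sums of deviation products."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up cloud with no linear relationship
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 2, 1, 3]
r_before = corr(xs, ys)

# Same cloud plus ONE point far out in the x direction
r_after = corr(xs + [20], ys + [20])

print(round(r_before, 2), round(r_after, 2))
```

The first r is zero, and the second is close to 1, even though only one point changed.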
association or correlation?
association is talking about a relationship. correlation is an actual calculated number
how do you interpret y intercept?
The model predicts that if there were no [x stuff] this is how much [y stuff] you’d have
How do you undo an ln (natural log) when solving?
e^stuff
Look for lurking variables?
Think hot chocolate sales in the caf at Wachusett Mountain and ski accidents at Wachusett Mountain. Did the chocolate cause the accidents? A lurking variable, like how many skiers are on the mountain that day, could be driving both.
how can you check for “straight enough?”
residuals plot, fool!
what’s up with extrapolation?
not a good idea. sometimes it’s all you can do, but still, NOT GOOD
if you switch x and y will slope change?
YES. Slope is r·sy/sx. To get the new slope you do: (r squared)/old slope
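A small check of the (r squared)/old slope shortcut, assuming made-up data; the variable names are illustrative only. Regressing x on y gives slope r·sx/sy, which should match r² divided by the original slope.

```python
from statistics import mean, stdev

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mx, my = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)
r = sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (n - 1)

slope_y_on_x = r * sy / sx         # original slope: predict y from x
slope_x_on_y = r * sx / sy         # new slope after switching x and y
shortcut = r ** 2 / slope_y_on_x   # the card's shortcut

print(slope_y_on_x, slope_x_on_y, shortcut)
```

The slope changes when the roles switch, but r itself is the same either way because it treats x and y symmetrically.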
how do you describe form of a scatterplot?
straight, curved?
what is a linear model?
It is an equation you can use, or a line on a graph, but it is just a model that says roughly what happens, and it can be used to ESTIMATE WHAT MIGHT HAPPEN
what is the line that you plot?
IT IS A MODEL! It is the LSRL, and it is the model we are talking about
what is the LSRL?
the “least squares regression line”: that line you plot OR that equation
Can you predict an X by using a Y?
NOT WITH THE SAME EQUATION! BE CAREFUL!! You have to start from scratch and fit a new regression with the x and y roles switched
outliers in regression?
A point that doesn’t follow the “flow” (pinky trick: cover it with your pinky, then uncover. Does it follow the flow?)
How do you undo sqrt when solving?
^2 (square both sides)
What values can r be?
from -1 to +1
which is response?
y variable, the Vertical axis.. It “responds” to the x
How do you undo squares or cubes?
^ 1/2 or ^ 1/3 (raise to these powers)
does correlation mean causation?
NO WAY DUDE
How is r calculated?
r = sum(Zx·Zy) / (n - 1). Each Zx·Zy is a signed rectangle area on standardized axes; add up the areas and divide by n - 1
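The formula on this card can be sketched step by step. The data are made up, and `z_scores` is a helper name chosen for this example: standardize x and y, multiply the pairs to get the rectangle areas, add them, and divide by n - 1.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize: (value - mean) / sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

zx, zy = z_scores(xs), z_scores(ys)
areas = [a * b for a, b in zip(zx, zy)]   # signed rectangle areas
r = sum(areas) / (len(xs) - 1)

print(round(r, 4))
```

Points in the upper-right and lower-left of the standardized plot give positive areas, and points in the other two corners give negative ones, which is why the sign of r matches the direction.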
How do you undo a log when solving?
10^ stuff
what are b1 and b0?
b1 is the SLOPE, and b0 is the intercept.
does high r squared mean a good model?
CHECK STRAIGHTNESS FIRST. You should check your plot and residuals to make sure the model is appropriate and no outliers are present. Then it means something
If something is correlated is it associated?
Yes
if something is associated is it correlated?
Not necessarily. It can be associated and have a zero correlation (thin parabolic scatterplot)
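A sketch of the parabola case, assuming made-up, perfectly symmetric data: y is completely determined by x, yet r comes out as zero because the relationship is not linear.

```python
from statistics import mean

# Made-up symmetric parabola: y = x^2
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

mx, my = mean(xs), mean(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
r = sxy / (sxx * syy) ** 0.5

print(r)
```

The left half of the parabola pulls r negative exactly as much as the right half pulls it positive, so the two cancel.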
what should we look for in resid plot?
curve or pattern. Also, it should have equalish scatter from left to right
if you switch x and y does r change?
NO. The strength stays the same.
What are some strong r values and some weak r values
Strong r values are close to 1 or -1, like -0.83 or 0.94. Weak r values are close to zero like 0.10 or -0.06
direction?
positive or negative
will residual plots always show outliers? (will outliers always have large residuals?)
Not necessarily. Some points have so much leverage that they pull the line toward themselves, so their residuals end up small.
How do you get equation from computer output?
y= b0 + b1 x
under column called COEFFICIENT
y is the dependent variable
b0 is the coefficient of constant (or it says intercept)
b1 is the coefficient of the variable given
x is the indep variable
generally arranged: Y= this (down) plus this times (left) this
describe a scatterplot’s strength?
give the r value (if straight), or say “tightly packed” or “loosely packed”
how to interpret slope EQUATION?
for each increase of 1 st dev in x direction, you go r st dev in y direction.
how do you interpret slope?
for an increase of 1 [unit of x] there is an (increase/decrease) of [SLOPE] [units of y], on average, according to the model
What does r tell us?
How STRAIGHT a positive or negative relationship is between two QUANTITATIVE variables (when linear). An r value might be near zero even though there is a strong relationship, like if you try to fit a line to a curve. BUT if you fit a curve to a curve, then the r value tells you how well the scatter fits the curve.
what is leverage?
Leverage just means the point is far away from x-bar, far right or left of the middle. Some high-leverage points are not influential if they go along with the flow of the scatter.
What if a scatterplot goes straight across horizontally?
NO ASSOC. That would be like height and IQ, they are independent so each height has about the same IQ.
Why is it called “least squares regression line?”
Because, after you find the mean-mean point, you fix the line so that it minimizes the sum of the squared vertical distances from the points to that line (minimizes the sum of squared residuals). Could be called the Least Squared Residuals Line
What if the scatterplot is curved?
Either straighten it by doing stuff to y (and then to x, if needed) and fitting a line, or keep it curved and fit a curve (quadreg, cubicreg, lnreg, logreg, pwrreg)
if you add to or multiply the x’s or y’s (shift/scale) does r change?
No. The strength remains the same. (If you log or square it, it will change, but just adding or multiplying won’t change it. The one catch: multiplying by a negative number flips the sign of r, not the strength.)
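A sketch of the shift/scale rule with made-up data. `corr` is a helper name chosen for this example, and the particular shifts and scales are arbitrary.

```python
from statistics import mean

def corr(xs, ys):
    """Pearson r from sums of deviation products."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = corr(xs, ys)
# Shift and scale both variables (arbitrary positive multipliers)
r_shifted_scaled = corr([2 * x + 10 for x in xs], [y / 3 - 1 for y in ys])
# Multiply x by a negative: same strength, opposite sign
r_negated = corr([-x for x in xs], ys)

print(round(r, 4), round(r_shifted_scaled, 4), round(r_negated, 4))
```

This works because r is built from z-scores, and z-scores do not care about units.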
What is homoscedasticity?
equal scatter along the regression line
what does “regression to the mean” mean?
Predictions for y are closer to the mean y (y-bar) than the actual x is to the mean x (measured in standard deviations). Sons were closer to average height than the dads. Super tall dads had tall sons, but not super tall sons, on average.
what does influential mean?
The point influences the SLOPE. It means that the point, when added to or removed from the data, will noticeably change the SLOPE. Generally these are outliers in the x direction, far left or right.
Give example of incorrectly using the word “correlation”
“there is a correlation between gender and video game playing” This person should say “association.” You can’t say correlation because gender is categorical.
does high r value mean anything?
it can, and usually does, however an r value alone tells little, CHECK THE SCATTER. IS IT LINEAR? make sure it’s linear first
If r= 0.8.. An x value that is 2 standard deviations above the mean will have a predicted y value that is _______
1.6 standard deviations above the mean in the Y direction
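The arithmetic behind this card, written out. In standardized units the regression line is predicted z for y equals r times z for x; the symbols below are notation chosen for this note.

```latex
\hat{z}_y = r \cdot z_x = 0.8 \times 2 = 1.6
```

Because |r| is at most 1, the predicted y is never more standard deviations from its mean than x is from its own, which is the "regression to the mean" idea.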
How to describe association? In scatterplot
DIRECTION, FORM, STRENGTH, and STRANGE (anything unusual, like outliers)
what about your calculator for using curves to fit curved data?
sure. Quadreg, cubicreg, lnreg, etc. just be careful when substituting while writing the equation given. The explanatory variable goes into all of the x spots
which is explanatory variable?
the x, the horizontal axis. it “explains” what happens to y
What point is on every regression line?
the mean-mean point. (x bar, y bar). This point is generally not one of the points on the scatterplot
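A sketch that checks the mean-mean point on made-up data. The slope and intercept come from the usual least-squares sum formulas, and the variable names are illustrative.

```python
# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_xx = sum(x * x for x in xs)

# Least-squares slope and intercept from the sums
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
b0 = (sum_y - b1 * sum_x) / n

x_bar, y_bar = sum_x / n, sum_y / n
predicted_at_x_bar = b0 + b1 * x_bar

print(x_bar, y_bar, predicted_at_x_bar)
```

Plugging x-bar into the line gives back y-bar, and in this data set (3, 4) is not one of the plotted points.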
Does the regression line (lsrl) go through a lot of points?
No, usually it goes through NONE! It just goes through the center of the cloud of points.
How can you straighten data?
Do stuff to the y (square it, root it, log it, etc) and recheck the plot. Remember to put the transformation into your equation. Example Sqrt y = 4.33 - 2.03 x
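A sketch of using the example model from this card to predict an actual y. The function name `predict_y` and the input x = 1 are assumptions made for illustration. The model predicts sqrt(y), so the last step is to undo the square root by squaring.

```python
def predict_y(x):
    """Predict y from the card's example model: sqrt(y) = 4.33 - 2.03 x."""
    root_y_hat = 4.33 - 2.03 * x   # the model gives sqrt(y), not y
    return root_y_hat ** 2         # undo the square root by squaring

y_hat = predict_y(1.0)             # x = 1 is a hypothetical input
print(round(y_hat, 2))
```

Squaring only makes sense while the predicted sqrt(y) is not negative, so this model should not be pushed past about x = 2.13. If the transformation were ln or log instead, the undo step would be e^stuff or 10^stuff.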
What do we look for in a residuals plot?
To proceed, it should look random. If there is a pattern, then find a new model or proceed with caution.
How do you make a residuals plot? (find RESID?)
STAT PLOT: make a scatterplot, but instead of L1 vs L2, change L2 by putting the cursor on it and going to 2nd > LIST and scrolling down to RESID. (Run the regression first so that RESID is filled in.)
interpret r squared
r squared (as a percent) of the variability in y can be explained by the model. The rest is left in the residuals
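A sketch of where that percent comes from, on made-up data with illustrative variable names: compare the leftover variation (squared residuals) to the total variation in y.

```python
# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)

b1 = sxy / sxx               # least-squares slope
b0 = y_bar - b1 * x_bar      # least-squares intercept

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # left in residuals
sst = sum((y - y_bar) ** 2 for y in ys)                      # total variation in y
r_squared = 1 - sse / sst

print(round(r_squared, 2))
```

Here 60% of the variability in y is explained by the line and the other 40% is left in the residuals.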
what is a residual?
ACTUAL - PREDICTED, A-P, like this class AP (get it?). Take the actual y from the data and, from that, subtract the y you get from plugging the x into the model.
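A sketch of A-P in action. The model y-hat = 2.2 + 0.6x and the data are made up for illustration (the line is the least-squares fit for these five points).

```python
# Hypothetical fitted model: y_hat = 2.2 + 0.6 x
b0, b1 = 2.2, 0.6

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# residual = ACTUAL - PREDICTED for each point
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print([round(e, 1) for e in residuals])
```

A positive residual means the point sits above the line (the model underpredicted), and a negative one means it sits below. For a least-squares line the residuals add up to zero.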