UNIT 2 2017 Flashcards

1
Q

is r sensitive to outliers?

A

yes. A single outlier can make it seem like there is a relationship (if it is way out in the x direction), or even make it seem like there is no relationship.
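A small sketch to see this happen. The data are made up, and `corr` is just a helper written for this card (r as the sum of z-score products over n - 1):

```python
from statistics import mean, stdev

def corr(xs, ys):
    """Pearson r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

# Made-up 3-by-3 grid of points: no linear relationship at all.
xs = [1, 2, 3, 1, 2, 3, 1, 2, 3]
ys = [1, 1, 1, 2, 2, 2, 3, 3, 3]
r_cloud = corr(xs, ys)

# Add ONE point far out in the x direction (and high in y).
r_with_outlier = corr(xs + [20], ys + [20])

print(round(r_cloud, 3), round(r_with_outlier, 3))
```

The second r comes out close to 1 even though nine of the ten points show no relationship.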

2
Q

association or correlation?

A

association is talking about a relationship. correlation is an actual calculated number

3
Q

how do you interpret y intercept?

A

The model predicts that if there were no [x stuff] this is how much [y stuff] you’d have

4
Q

How do you undo an ln (natural log) when solving?

A

e^stuff
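A quick check of the undo step. The equation ln(y) = 0.5 + 0.2x is invented for illustration, not taken from any data set:

```python
import math

# Hypothetical model on the log scale: ln(y) = 0.5 + 0.2 * x
x = 3
ln_y = 0.5 + 0.2 * x   # this is ln(y), NOT y
y = math.exp(ln_y)     # e^stuff undoes the ln

# Taking ln of the answer brings back the left-hand side.
round_trip = math.log(y)
print(round(y, 3))
```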

5
Q

Look for lurking variables?

A

think hot chocolate sales in caf at wachusett mountain and ski accidents at wachusett mountain. Did the chocolate cause the accident??????

6
Q

how can you check for “straight enough?”

A

residuals plot, fool!

7
Q

what’s up with extrapolation?

A

not a good idea. sometimes it’s all you can do, but still, NOT GOOD

8
Q

if you switch x and y will slope change?

A

YES. Slope is r*sy/sx. To get the new slope you do: (r squared)/old slope, which works out to r*sx/sy
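A sketch that checks the shortcut on made-up data. The helpers `corr` and `slope` are written just for this card:

```python
from statistics import mean, stdev

def corr(xs, ys):
    """Pearson r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

def slope(xs, ys):
    """LSRL slope for predicting ys from xs: r * sy / sx."""
    return corr(xs, ys) * stdev(ys) / stdev(xs)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = corr(xs, ys)
old_slope = slope(xs, ys)      # predicting y from x
new_slope = slope(ys, xs)      # roles switched: predicting x from y
shortcut = r ** 2 / old_slope  # the flashcard shortcut

print(round(old_slope, 3), round(new_slope, 3), round(shortcut, 3))
```

Note that the new slope is not simply 1/old slope unless r is exactly 1 or -1.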

9
Q

how do you describe form of a scatterplot?

A

straight, curved?

10
Q

what is a linear model?

A

it is an equation you can use, or a line of a graph, but it is just a model that says what kind of happens, and can be used to ESTIMATE WHAT MIGHT HAPPEN

11
Q

what is the line that you plot?

A

IT IS A MODEL! It is the LSRL and it is the model we are talking about

12
Q

what is the LSRL

A

the “least squares regression line” that line you plot OR .. That equation

13
Q

Can you predict an X by using a Y?

A

NOT WITH THE SAME EQUATION! BE CAREFUL!! You have to change the entire equation and start from scratch

14
Q

outliers in regression?

A

doesn’t follow the “flow” (pinky trick: cover it with your pinky, then uncover. Does it follow the flow?)

15
Q

How do you undo sqrt when solving?

A

^2 (square both sides)

16
Q

What values can r be?

A

from -1 to +1

17
Q

which is response?

A

y variable, the Vertical axis.. It “responds” to the x

18
Q

How do you undo squares or cubes?

A

^ 1/2 or ^ 1/3 (raise to these powers)

19
Q

does correlation mean causation?

A

NO WAY DUDE

20
Q

How is r calculated?

A

r = sum(Zx*Zy) / (n-1). Each Zx*Zy is a signed rectangle area on standardized axes, and r is (roughly) the average of those areas
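The formula, step by step, on a small made-up data set (a sketch, not calculator output):

```python
from statistics import mean, stdev

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Standardize each list: z = (value - mean) / standard deviation.
zx = [(x - mean(xs)) / stdev(xs) for x in xs]
zy = [(y - mean(ys)) / stdev(ys) for y in ys]

# Each product is a signed "rectangle area" on the standardized axes.
areas = [a * b for a, b in zip(zx, zy)]

# Add the areas and divide by n - 1.
r = sum(areas) / (len(xs) - 1)
print(round(r, 4))
```

Points in the upper-right and lower-left quadrants give positive areas and push r up; the other two quadrants give negative areas and pull it down.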

21
Q

How do you undo a log when solving?

A

10^ stuff

22
Q

what are b1 and b0?

A

b1 is the SLOPE, and b0 is the intercept.

23
Q

does high r squared mean a good model?

A

CHECK STRAIGHTNESS FIRST. You should check your scatterplot and residuals plot to make sure the model is appropriate and there are no outliers. Then a high r squared means something.

24
Q

If something is correlated is it associated?

A

Yes

25
Q

if something is associated is it correlated?

A

Not necessarily. It can be associated and have a zero correlation (thin parabolic scatterplot)
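A sketch of the parabola case with made-up points. Here y is completely determined by x, yet r comes out as zero (`corr` is a helper written for this card):

```python
from statistics import mean, stdev

def corr(xs, ys):
    """Pearson r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

# A perfect parabola, symmetric about x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = corr(xs, ys)
print(round(r, 3))
```

The left half slopes down and the right half slopes up, so the positive and negative rectangle areas cancel.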

26
Q

what should we look for in resid plot?

A

curve or pattern. Also, it should have equalish scatter from left to right

27
Q

if you switch x and y does r change?

A

NO. The strength stays the same.

28
Q

What are some strong r values and some weak r values

A

Strong r values are close to 1 or -1, like -0.83 or 0.94. Weak r values are close to zero like 0.10 or -0.06

29
Q

direction?

A

positive or negative

30
Q

will residual plots always show outliers? (will outliers always have large residuals?)

A

Not necessarily. Some points have so much leverage that they pull the line toward themselves, so their residuals end up small.
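A sketch with made-up data: five points sitting exactly on y = x, plus one far-right point that does not follow the flow. The helper `fit` is written for this card:

```python
from statistics import mean

def fit(xs, ys):
    """Least squares intercept b0 and slope b1."""
    mx, my = mean(xs), mean(ys)
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    return my - b1 * mx, b1

xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]
b0_without, b1_without = fit(xs, ys)   # slope is 1 without the odd point

# Add a high-leverage point at (20, 5); the old trend would predict y = 20.
all_x, all_y = xs + [20], ys + [5]
b0, b1 = fit(all_x, all_y)

residuals = [y - (b0 + b1 * x) for x, y in zip(all_x, all_y)]
leverage_resid = residuals[-1]
largest_other = max(abs(e) for e in residuals[:-1])

print(round(b1, 3), round(leverage_resid, 3), round(largest_other, 3))
```

The odd point drags the slope way down, and its own residual ends up smaller than the residuals of some points that were on the original trend.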

31
Q

How do you get equation from computer output?

A

y= b0 + b1 x
under column called COEFFICIENT
y is the dependent variable
b0 is the coefficient of constant (or it says intercept)
b1 is the coefficient of the variable given
x is the indep variable
generally arranged: Y= this (down) plus this times (left) this

32
Q

describe a scatterplot’s strength?

A

give the r value (if straight), or say “tightly packed” or “loosely packed”

33
Q

how to interpret slope EQUATION?

A

for each increase of 1 st dev in x direction, you go r st dev in y direction.

34
Q

how do you interpret slope?

A

for an increase of 1 [unit of x] there is an (increase/decrease) of [SLOPE] [units of y]

35
Q

What does r tell us?

A

How STRAIGHT a positive or negative relationship is between two QUANTITATIVE variables (when linear). An r value might be near zero even though there is a strong relationship, like if you try to fit a line to a curve. BUT if you fit a curve to a curve, then the r value tells you how well the scatter fits the curve.

36
Q

what is leverage?

A

leverage just means it is far away from x-bar, far right or left from the middle.. Some leverage points are not influential if they go along with the flow of the scatter.

37
Q

What if a scatterplot goes straight across horizontally?

A

NO ASSOC. That would be like height and IQ, they are independent so each height has about the same IQ.

38
Q

Why is it called “least squares regression line?”

A

Because, after you find the mean-mean point, you fix the line so that it minimizes the sum of the squared vertical distances from the points to the line (minimizes the sum of squared residuals). Could be called the Least Squared Residuals Line
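A sketch of the "least squares" idea on made-up data: pivot a line around the mean-mean point and compare the sum of squared residuals for the LSRL slope against a few other trial slopes (the trial slopes are arbitrary choices):

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mx, my = mean(xs), mean(ys)

def sse(b0, b1):
    """Sum of squared residuals for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Least squares slope and intercept.
b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      / sum((x - mx) ** 2 for x in xs))
b0 = my - b1 * mx
best = sse(b0, b1)

# Other lines through the mean-mean point, with different slopes.
others = [sse(my - s * mx, s) for s in (0.4, 0.5, 0.7, 0.8)]

print(round(best, 3), [round(o, 3) for o in others])
```

Every other slope gives a bigger total, which is what "least squares" is claiming.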

39
Q

What if the scatterplot is curved?

A

either straighten it by doing stuff to y, and then x and fitting a line, or keep it curved and fit a curve (quadreg, cubicreg, lnreg, logreg, pwrreg)

40
Q

if you mult or divide the x’s or y’s (shift/scale) does r change?

A

no. the strength remains the same. (If you log or square it, it will change, but just adding or multiplying won’t change it. Multiplying by a negative flips the sign of r, but not the strength.)
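A sketch on made-up data. The unit changes (2.54 and 1.8y + 32) are just example shifts and scales, and `corr` is a helper written for this card:

```python
import math
from statistics import mean, stdev

def corr(xs, ys):
    """Pearson r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = corr(xs, ys)

# Scale the x's and shift-and-scale the y's: r does not move.
r_rescaled = corr([2.54 * x for x in xs], [1.8 * y + 32 for y in ys])

# Multiply the y's by a negative: same strength, opposite sign.
r_flipped = corr(xs, [-y for y in ys])

# Log the y's: this is NOT a shift or scale, so r changes.
r_logged = corr(xs, [math.log(y) for y in ys])

print(round(r, 4), round(r_rescaled, 4), round(r_flipped, 4), round(r_logged, 4))
```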

41
Q

What is homoscedasticity?

A

equal scatter along the regression line

42
Q

what does “regression to the mean” mean?

A

predictions for y are closer to the mean y (y bar) than the actual x is to the mean x (in s.d.). Sons were closer to average height than the dads. Super tall dads had tall sons, but not super tall sons, on average.

43
Q

what does influential mean?

A

Point influences the SLOPE. It means that the point, when added to or removed from the data, will change the SLOPE. Generally these are outliers in the x direction, far left or right.

44
Q

Give example of incorrectly using the word “correlation”

A

“there is a correlation between gender and video game playing” This person should say “association.” You can’t say correlation because gender is categorical.

45
Q

does high r value mean anything?

A

it can, and usually does, however an r value alone tells little, CHECK THE SCATTER. IS IT LINEAR? make sure it’s linear first

46
Q

If r= 0.8.. An x value that is 2 standard deviations above the mean will have a predicted y value that is _______

A

1.6 standard deviations above the mean in the Y direction
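The arithmetic behind this card, as a sketch. The mean and standard deviation used to convert back to real units (70 and 3) are hypothetical numbers, not from the card:

```python
r = 0.8
z_x = 2.0                # x is 2 standard deviations above x-bar

# On standardized axes the LSRL is: predicted z_y = r * z_x
z_y_pred = r * z_x       # 1.6 standard deviations above y-bar

# Hypothetical y summary statistics, only to turn z back into units.
y_bar, s_y = 70.0, 3.0
y_pred = y_bar + z_y_pred * s_y

print(z_y_pred, round(y_pred, 1))
```

Since r is less than 1, the prediction is always fewer standard deviations from the mean than x was. That is regression to the mean.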

47
Q

How to describe association? In scatterplot

A

DIRECTION, FORM, STRENGTH, and STRANGE (anything unusual, like outliers)

48
Q

what about your calculator for using curves to fit curved data?

A

sure. Quadreg, cubicreg, lnreg, etc. just be careful when substituting while writing the equation given. The explanatory variable goes into all of the x spots

49
Q

which is explanatory variable?

A

the x, the horizontal axis. it “explains” what happens to y

50
Q

What point is on every regression line?

A

the mean-mean point. (x bar, y bar). This point is generally not one of the points on the scatterplot
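A quick check on made-up data that plugging x bar into the LSRL gives back y bar:

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
x_bar, y_bar = mean(xs), mean(ys)

# Least squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

# Plug x-bar into the line.
predicted_at_x_bar = b0 + b1 * x_bar

# Is the mean-mean point one of the actual data points?
is_data_point = (x_bar, y_bar) in zip(xs, ys)

print(x_bar, y_bar, round(predicted_at_x_bar, 3), is_data_point)
```

In this data set the mean-mean point is (3, 4), which is on the line but is not one of the five plotted points.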

51
Q

Does the regression line (lsrl) go through a lot of points?

A

No, usually it goes through NONE! It just goes through the center of the cloud of points.

52
Q

How can you straighten data?

A

Do stuff to the y (square it, root it, log it, etc) and recheck the plot. Remember to put the transformation into your equation. Example Sqrt y = 4.33 - 2.03 x
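A sketch of using the example equation Sqrt y = 4.33 - 2.03 x to predict an actual y. The function name `predict_y` and the choice x = 1 are just for illustration:

```python
def predict_y(x):
    """Predict y from the transformed model sqrt(y) = 4.33 - 2.03 * x."""
    sqrt_y = 4.33 - 2.03 * x   # the model gives sqrt(y), NOT y
    if sqrt_y < 0:
        # A square root cannot be negative, so the model breaks down here.
        raise ValueError("model predicts a negative square root")
    return sqrt_y ** 2         # square it to undo the sqrt

y_at_1 = predict_y(1)
print(round(y_at_1, 2))
```

Forgetting the last step (squaring) is the usual mistake: 2.30 is the predicted square root, 5.29 is the predicted y.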

53
Q

What do we look for in a residuals plot?

A

To proceed, it should look random, if there is a pattern, then find a new model or proceed with caution.

54
Q

How do you make a residuals plot? (find RESID?)

A

stat>plot make a scatterplot, but instead of L1 vs L2, change L2 by putting the cursor on it and going to 2nd>lists and scrolling down to RESID

55
Q

interpret r squared

A

r squared % of variability in y can be explained by the model. The rest is in residuals
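A sketch of where that percentage comes from, on made-up data: compare the leftover variability (squared residuals) to the total variability in y.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mx, my = mean(xs), mean(ys)

# Least squares line.
b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      / sum((x - mx) ** 2 for x in xs))
b0 = my - b1 * mx

# Total variability in y, and the part left over in the residuals.
total = sum((y - my) ** 2 for y in ys)
leftover = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - leftover / total
print(round(r_squared, 3))
```

Here 60% of the variability in y is explained by the model and the other 40% is still sitting in the residuals.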

56
Q

what is a residual?

A

ACTUAL-PREDICTED, A-P, like this class AP (get it?) Take y data found and from that, subtract the y you get from plugging the x into the model.
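A sketch with a hypothetical model y-hat = 2.2 + 0.6x and one made-up data point (3, 5):

```python
# Hypothetical model: y-hat = 2.2 + 0.6 * x
b0, b1 = 2.2, 0.6

x, actual = 3, 5           # the observed data point
predicted = b0 + b1 * x    # plug the x into the model
residual = actual - predicted   # ACTUAL minus PREDICTED

print(round(predicted, 2), round(residual, 2))
```

A positive residual means the point sits above the line (the model underpredicted); a negative one means it sits below.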