DECK 4 UNIT 2 REGRESSION STUFF Flashcards

1
Q

association or correlation?

A

association is talking about a relationship. correlation is an actual calculated number.

2
Q

How to describe association? scatterplot

A

DIRECTION, FORM, STRENGTH (and strange stuff)

3
Q

direction?

A

positive or negative

4
Q

form?

A

straight, curved or zig zag

5
Q

strength?

A

give the r value (if straight), or say “tightly packed” or “loosely packed”

6
Q

does correlation mean causation?

A

NO WAY DUUUUUDE

7
Q

does high r squared mean a good model?

A

Not alone. You should check your plot and residuals to make sure the model is appropriate and no outliers are present. Then it means something.

8
Q

does high r value mean anything?

A

NOT UNLESS IT IS LINEAR and there aren’t outliers. LOOK AT THE DATA. THEN IT MEANS AN AWFUL LOT.

9
Q

How is r calculated?

A

r = sum(Zx * Zy) / (n - 1). Kind of like the average-sized rectangle on the standardized axes.
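
A minimal Python sketch of that formula, assuming made-up data and a helper name (`correlation`) chosen just for illustration. It standardizes each variable with the sample standard deviation (the n - 1 kind), multiplies the z-scores pairwise, and divides the total by n - 1.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r = sum(zx * zy) / (n - 1) using sample standard deviations."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # stdev divides by n - 1
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 3))
```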

10
Q

how can you check for “straight enough?”

A

residuals plot fool!

11
Q

how do you interpret slope?

A

ON AVERAGE, for an increase of 1 [unit of x] there is an (increase/decrease) of [SLOPE] [units of y]

12
Q

how do you interpret y intercept?

A

if there were no [x stuff] the model predicts you’d have this much [y stuff]. USE UNITS FROM CONTEXT.

13
Q

how to interpret slope EQUATION?

A

slope = r(Sy)/(Sx) means that for each increase of 1 st dev in the x direction, the prediction moves r st dev in the y direction. SO, think “r Sy for each 1 Sx”
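
A rough Python check of the “r Sy for each 1 Sx” idea, assuming made-up data. The helper name is illustrative, and the intercept line uses the standard fact that the LSRL passes through (x bar, y bar), which this card does not state.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation, equivalent to sum(zx * zy) / (n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return s_xy / ((len(xs) - 1) * stdev(xs) * stdev(ys))

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
b1 = r * stdev(y) / stdev(x)   # slope = r * Sy / Sx
b0 = mean(y) - b1 * mean(x)    # assumption: line goes through (x bar, y bar)

# Move exactly 1 Sx above the mean of x and see how far the prediction moves.
x_new = mean(x) + stdev(x)
y_hat = b0 + b1 * x_new
steps_in_sy = (y_hat - mean(y)) / stdev(y)  # should match r
print(b1, b0, steps_in_sy, r)
```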

14
Q

if you mult or divide the x’s or y’s (shift/scale) does r change?

A

No. The strength remains the same. If you mult or div by a negative, the sign will change but it will still have the same strength. (If you log or square it, r will change, but just adding or multiplying won’t change the strength.)
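
A small Python sketch to see this for yourself, assuming made-up data and an illustrative helper name. It recomputes r after scaling, shifting, negating, and squaring the x values.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation, equivalent to sum(zx * zy) / (n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return s_xy / ((len(xs) - 1) * stdev(xs) * stdev(ys))

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
r_scaled = correlation([2.54 * v for v in x], y)          # change of units
r_shifted = correlation([v + 100 for v in x], y)          # add a constant
r_negated = correlation([-v for v in x], y)               # multiply by -1
r_after_squaring_x = correlation([v ** 2 for v in x], y)  # not a shift/scale
print(r, r_scaled, r_shifted, r_negated, r_after_squaring_x)
```

Scaling and shifting give back the same r (up to floating-point rounding), negating flips only the sign, and squaring gives a different value.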

15
Q

if you switch x and y does r change?

A

NO. The strength stays the same.

16
Q

if you switch x and y will slope change?

A

YES. Slope is r(Sy)/(Sx). To get the new slope you do: (r squared)/(old slope)
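
A quick Python check of the (r squared)/(old slope) shortcut, assuming made-up data and illustrative helper names. It fits the slope both ways and compares the swapped slope with the shortcut.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation, equivalent to sum(zx * zy) / (n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return s_xy / ((len(xs) - 1) * stdev(xs) * stdev(ys))

def slope(xs, ys):
    """Least-squares slope for predicting ys from xs (same as r * Sy / Sx)."""
    return correlation(xs, ys) * stdev(ys) / stdev(xs)

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
old_slope = slope(x, y)       # predicting y from x
new_slope = slope(y, x)       # predicting x from y
shortcut = r ** 2 / old_slope
print(old_slope, new_slope, shortcut)
```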

17
Q

interpret r squared

A

r squared % of variability in y can be explained by the model (with x stuff). The rest is in residuals.
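
One way to see “the rest is in residuals” in Python, assuming made-up data. The variable names (`sse`, `sst`) are illustrative. The sketch fits the LSRL, then compares r squared with 1 - (leftover residual variation)/(total variation in y).

```python
from statistics import mean

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)

b1 = s_xy / s_xx            # least-squares slope
b0 = y_bar - b1 * x_bar     # least-squares intercept
residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]

sse = sum(e ** 2 for e in residuals)  # variation left in the residuals
sst = s_yy                            # total variation in y
r_squared = s_xy ** 2 / (s_xx * s_yy)
explained_fraction = 1 - sse / sst
print(r_squared, explained_fraction)
```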

18
Q

is r sensitive to outliers?

A

Yes. A single outlier can make it seem like there is a relationship (one that is way out in the x direction).
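
A hedged Python illustration, assuming made-up data: five points with a weak negative relationship, then the same five plus one point far out in the x direction. The helper name is illustrative.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation, equivalent to sum(zx * zy) / (n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return s_xy / ((len(xs) - 1) * stdev(xs) * stdev(ys))

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 3, 1]

r_without = correlation(x, y)
r_with = correlation(x + [20], y + [12])  # add one far-right point
print(round(r_without, 2), round(r_with, 2))
```

With these numbers r goes from weak and negative to strong and positive, and the only change is that one point.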

19
Q

Look for lurking variables?

A

Think hot chocolate sales at a ski mountain and ski accidents. Strong positive relationship, but why are they related??? (lurking variable: weather)

20
Q

outliers in regression?

A

doesn’t follow the “flow”

21
Q

what about your calculator for using curves to fit curved data?

A

QuadReg, CubicReg, LnReg, etc. Just be careful when substituting while writing the model’s equation.

22
Q

what does “regression to the mean” mean?

A

Predictions for y are closer to the mean y (y bar) than the actual x is to the mean x (measured in standard deviations).

23
Q

what does influential mean?

A

It means that the point, when added to or removed from the data, will influence the SLOPE. Generally these are outliers in the x direction: far left or right.

24
Q

what is a linear model?

A

it is an equation you can use, or a line of a graph, but it is just a model that says what kind of happens, and can be used to ESTIMATE WHAT MIGHT HAPPEN

25
Q

what is a residual?

A

ACTUAL-PREDICTED. A-P. like this class.. AP (get it?)

26
Q

what is b1 and bo ?

A

b1 is the SLOPE, and bo is the intercept. Remember that bo can be thought of as “b old” it is the old b. With y=mx+b, b was intercept…. b old.

27
Q

what is leverage?

A

leverage just means it is far away from x-bar. far right or left. Some leverage points are not influential if they go along with the flow of the scatter.

28
Q

what is the line that you plot?

A

IT IS A MODEL!

29
Q

Will the LSRL go through a lot of points?

A

Usually it hits no points.

30
Q

what is the LSRL

A

the “least squares regression line”

31
Q

what should we look for in resid plot?

A

We want random-looking residuals, so we look for a curve or pattern. Also, the plot should have equal-ish scatter from left to right. IF curved or unequal scatter, then the linear model is not good.
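
A small Python sketch of what a “bad” residual pattern looks like, assuming made-up data that is secretly curved (y = x squared). The r value is high, but the residuals are clearly not random.

```python
from statistics import mean

# Made-up curved data: y = x squared.
x = [0, 1, 2, 3, 4, 5, 6]
y = [v ** 2 for v in x]

x_bar, y_bar = mean(x), mean(y)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)

b1 = s_xy / s_xx                      # least-squares slope
b0 = y_bar - b1 * x_bar               # least-squares intercept
r = s_xy / (s_xx * s_yy) ** 0.5       # correlation

# Residual = actual - predicted, listed left to right.
residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
print(round(r, 3), residuals)
```

The residuals run positive, then negative, then positive again from left to right. That U shape is the curve to look for, even though r alone looks great.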

32
Q

what’s up with extrapolation?

A

Not a good idea. Sometimes it’s all you can do, but still, NOT GOOD.

33
Q

what is extrapolation?

A

when you use a linear model to make predictions outside the range of gathered data. (predictions for x values smaller or larger than the x values you collected. way right or way left beyond scatterplot)

34
Q

which is explanatory variable?

A

x. horizontal axis. it “explains” what happens to y. X is EXplanatory.

35
Q

which is response?

A

y. Vertical axis. It “responds” to the x.

36
Q

will residual plots always show outliers? (will outliers always have large residuals?)

A

No. Sometimes the outlier has so much leverage and is so influential that it pulls the LSRL right up to it, so it looks like a small residual.
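
A Python sketch of that situation, assuming made-up data and an illustrative helper name (`fit_line`). One point sits far to the right, and the fitted line gets dragged toward it.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

# Made-up data: five ordinary points plus one high-leverage point at x = 20.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 3, 1]
x_all, y_all = x + [20], y + [12]

_, slope_without = fit_line(x, y)
b0, slope_with = fit_line(x_all, y_all)

residuals = [b - (b0 + slope_with * a) for a, b in zip(x_all, y_all)]
outlier_residual = residuals[-1]
biggest_other = max(abs(e) for e in residuals[:-1])
print(slope_without, slope_with, outlier_residual, biggest_other)
```

The slope changes sign when the point is added (so it is influential), yet its own residual is smaller than the largest residual among the ordinary points.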

37
Q

How to handle outliers?

A

Sometimes you can talk about it and run the regression with and without the outlier present. Just have a valid reason why you are doing this. Sometimes an outlier is simply a typo, or so strange it sways what “typically” happens. Since your model is supposed to represent what sort of happens, tossing it is OK sometimes, as long as you discuss it.

38
Q

“there is a correlation between SAT scores and gender” why is this wrong?

A

Correlation is a number calculated from 2 quantitative variables… Gender is categorical. It should say “association”

39
Q

If you add stuff to x or y, or switch the units from cm to inches, will that change r?

A

No, shifting and scaling won’t change r. It is a standardized measurement. If you measure height in inches and your European buddy used cm.. you’ll still have the same r value.

40
Q

If r = 0.8, The prediction for an x value 2 standard deviations above the x mean will be ___ standard deviations above the y mean.

A

1.6. Think: every 1 sd in the x direction gives a prediction that is r sd in the y direction. In this case, 1 sd in x means 0.8 sd in the y direction, so 2 sd above the mean x will give a predicted y that is 0.8 + 0.8 = 1.6 sd above the mean y.
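
The same arithmetic as a tiny Python sketch (variable names are illustrative). In standardized units the prediction is just r times the z-score of x.

```python
r = 0.8
z_x = 2.0            # x is 2 standard deviations above the mean of x

z_y_hat = r * z_x    # predicted y, in standard deviations above the mean of y
print(z_y_hat)

# Regression to the mean: the prediction is closer to its mean than x is.
closer_to_mean = abs(z_y_hat) < abs(z_x)
print(closer_to_mean)
```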

41
Q

What is homoscedasticity?

A

Equal scatter about a regression line. A good thing.