17. Regression Flashcards

1
Q

Differences between correlation and regression

A

Regression is ASYMMETRIC: it predicts one variable from another

Correlation is symmetric

Regression uses the relationship to predict Y from X
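
The asymmetry can be checked numerically. The sketch below uses made-up data and hypothetical helper names (`correlation`, `slope`); swapping X and Y leaves the correlation unchanged but gives a different regression slope.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(x, y):
    """Pearson correlation coefficient r."""
    mx, my = mean(x), mean(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    s_yy = sum((b - my) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

def slope(x, y):
    """Least-squares slope for predicting y from x."""
    mx, my = mean(x), mean(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    return s_xy / s_xx

# Made-up illustrative data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = correlation(x, y)   # correlation of X with Y
r_yx = correlation(y, x)   # same value: correlation is symmetric
b_y_on_x = slope(x, y)     # slope when predicting Y from X
b_x_on_y = slope(y, x)     # different slope when predicting X from Y
```

The two slopes are linked by the identity b(Y on X) × b(X on Y) = r², so they are equal only in special cases.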

2
Q

X/Y in regression

A

X = explanatory variable

Y = response variable

3
Q

alpha and beta

A

alpha = Y-intercept of the population regression line

beta = slope of the population regression line

Their estimates from the sample are a and b
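
Written out, the notation on this card corresponds to the two lines below. This is the standard textbook form, given here as a reminder and not quoted from the deck.

```latex
Y = \alpha + \beta X \quad \text{(population regression line)}
\qquad
\hat{Y} = a + bX \quad \text{(line estimated from the sample)}
```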

4
Q

Y hat

A

Value of Y predicted by the regression line for a given value of X

5
Q

Residuals

A

Observed Y minus predicted Y: Yi - Yi(hat)

6
Q

How do you decide which line is the BEST fit?

A

Least-squares regression line!

Find the line that minimizes the sum of the squared residuals
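
A minimal sketch of the least-squares idea, using the standard closed-form estimates b = Sxy / Sxx and a = mean(Y) - b × mean(X). The data are made up, and the function names `fit_line` and `ss_residuals` are hypothetical. Nudging either the slope or the intercept away from the fitted values makes the sum of squared residuals larger.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b

def ss_residuals(x, y, a, b):
    """Sum of squared residuals (Yi - Yi_hat)^2 for the line a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up illustrative data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(x, y)
best = ss_residuals(x, y, a, b)

# Any other line does worse on the same data.
nudged_slope = ss_residuals(x, y, a, b + 0.1)
nudged_intercept = ss_residuals(x, y, a + 0.1, b)
```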

7
Q

SSresiduals

A

sum from i = 1 to n of [Yi - Yi(hat)]^2

8
Q

How large a difference in variance is worrying?

A

More than a 10-fold difference in variance

9
Q

Why shouldn't you fit a polynomial with too many terms?

A

It wouldn't predict new data points well!

Rule of thumb: sample size should be at least 7 times the number of terms
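
An extreme case shows the problem. In the sketch below (made-up data, hypothetical function names), a polynomial with as many terms as data points passes through every training point exactly, yet its prediction for a new point is far worse than that of a straight line. The polynomial is evaluated with Lagrange interpolation, chosen here only because it needs no external library.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    return mean_y - b * mean_x, b

def polynomial_through_points(xs, ys, x_new):
    """Evaluate the degree n-1 polynomial through all n points at x_new."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x_new - xj) / (xi - xj)
        total += yi * weight
    return total

# Made-up training data: roughly Y = 2X plus small noise
x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [0.3, 1.6, 4.4, 5.7, 8.5, 9.6]

# Made-up new observation that was not used for fitting
x_new, y_new = 6.0, 12.2

a, b = fit_line(x_train, y_train)
line_prediction = a + b * x_new
poly_prediction = polynomial_through_points(x_train, y_train, x_new)

line_error = abs(line_prediction - y_new)
poly_error = abs(poly_prediction - y_new)
```

The six-term polynomial has zero residuals on the training data, which is exactly why it chases the noise instead of the trend.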

10
Q

Confidence bands

A

Show the uncertainty in predicting the mean value of Y at a given value of X

11
Q

Which is broader: the interval for where the line is, or the interval for an individual?

A

individual - more uncertainty

12
Q

Prediction interval

A

Shows the uncertainty in predicting an individual value of Y at a given value of X
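
The difference between the two kinds of interval shows up in their standard errors. The sketch below uses the usual textbook formulas for simple linear regression, with made-up data and hypothetical function names. Each interval would be the prediction plus or minus a t critical value times the standard error; the critical value is left out to keep the example dependency-free.

```python
from math import sqrt

# Made-up illustrative data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = s_xy / s_xx
a = mean_y - b * mean_x

# Residual mean square: SS_residuals / (n - 2)
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
ms_res = ss_res / (n - 2)

def se_mean_response(x0):
    """Standard error behind the confidence band (mean Y at x0)."""
    return sqrt(ms_res * (1 / n + (x0 - mean_x) ** 2 / s_xx))

def se_individual(x0):
    """Standard error behind the prediction interval (one new Y at x0)."""
    return sqrt(ms_res * (1 + 1 / n + (x0 - mean_x) ** 2 / s_xx))
```

The extra `1 +` term is the scatter of individuals around the line, so the prediction interval is always the wider of the two. Both widen as x0 moves away from the mean of X.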

13
Q

3 methods of fitting nonlinear relationships

A
  1. Transformations of the data (e.g. taking the log of both sides)
  2. Quadratic regression
  3. Splines
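
As a sketch of the first method, the made-up data below follow the curve Y = 3X² exactly. Taking logs of both sides gives log Y = log 3 + 2 log X, which is a straight line, so ordinary least squares on the logged values recovers the exponent as the slope and the multiplier from the intercept. Variable names are illustrative only.

```python
from math import exp, log

# Made-up data following Y = 3 * X**2 exactly
x = [1.0, 2.0, 4.0, 8.0]
y = [3.0, 12.0, 48.0, 192.0]

# Transform both sides, then fit a straight line to the logged values
log_x = [log(v) for v in x]
log_y = [log(v) for v in y]

n = len(x)
mean_lx = sum(log_x) / n
mean_ly = sum(log_y) / n
s_xy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y))
s_xx = sum((u - mean_lx) ** 2 for u in log_x)

slope = s_xy / s_xx
intercept = mean_ly - slope * mean_lx

exponent = slope              # power of X in the original curve
multiplier = exp(intercept)   # back-transformed intercept
```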
14
Q

Assumptions of linear regression + how to check each on a residual plot

A
  1. Random sample: can't be seen on a residual plot :(
  2. Y normally distributed at every value of X: points centered around the line, thinning out farther from it
  3. Y has approximately equal variance at every value of X: points spread a similar distance from the line across X
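
The equal-variance check can also be roughed out in numbers instead of by eye. The sketch below is one possible approach, not a standard named test: it fits a line to made-up data whose scatter grows with X, then compares the variance of the residuals in the lower and upper halves of X. The half-and-half split is an assumption made for illustration.

```python
from statistics import variance

# Made-up data: small scatter at low X, large scatter at high X
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.1, 7.9, 12.0, 10.0, 16.0, 14.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
b = s_xy / s_xx
a = mean_y - b * mean_x

# Residuals in order of increasing X (x is already sorted)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Compare residual spread between the lower and upper halves of X
half = n // 2
var_low = variance(residuals[:half])
var_high = variance(residuals[half:])
variance_ratio = max(var_low, var_high) / min(var_low, var_high)
```

A ratio far above 10, as here, matches a residual plot that fans out with X.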
15
Q

Problem with overfitting data

A

Very poor predictive power for any new data points!
