Linear Regression Flashcards

1
Q

Potential Problems with Linear Regression

A
  1. Non-linearity of the response-predictor relationship
  2. Correlation of error terms
  3. Non-constant variance of error terms (heteroscedasticity)
  4. Outliers
  5. High-leverage points
  6. Collinearity
2
Q

Residual Plots in Linear Regression

A

(1) A non-linear pattern in the residuals suggests a non-linear relationship in the data. (2) Residual plots can also detect heteroscedasticity in the dataset, which shows up as a funnel shape. (3) Look for tracking in the residuals, which can occur when error terms are correlated, as in time series data.

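Below is a minimal sketch of such a residual plot in Python (statsmodels + matplotlib); the synthetic data, and the noise scale that grows with x to produce the funnel, are invented for illustration.

```python
# Illustrative sketch: residual plot showing a funnel (heteroscedasticity).
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2 + 3 * x + rng.normal(scale=1 + 0.3 * x, size=200)  # noise grows with x

fit = sm.OLS(y, sm.add_constant(x)).fit()

plt.scatter(fit.fittedvalues, fit.resid, s=10)
plt.axhline(0, color="red", linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```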
3
Q

Two assumptions of the linear model

A

(1) Additive: the effect of a change in one predictor on Y does not depend on the values of the other predictors. (2) Linear: the change in Y for a one-unit change in a predictor is constant, whatever the value of that predictor.

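A small illustrative sketch of relaxing each assumption with statsmodels formulas: an interaction term relaxes additivity and a squared term relaxes linearity. The data frame and coefficients are invented.

```python
# Illustrative sketch: an interaction term (x1:x2) relaxes additivity;
# I(x1**2) relaxes linearity.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=300), "x2": rng.normal(size=300)})
df["y"] = 1 + 2 * df.x1 + 3 * df.x2 + 4 * df.x1 * df.x2 + rng.normal(size=300)

additive = smf.ols("y ~ x1 + x2", data=df).fit()            # assumes both
flexible = smf.ols("y ~ x1 * x2 + I(x1**2)", data=df).fit() # relaxes both
print(additive.rsquared, flexible.rsquared)  # flexible model fits far better
```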
4
Q

Outliers in Linear Regression

A

Outliers can skew the model, so you may want to remove them. There is also a statistic called the studentized residual; an absolute value greater than 3 suggests a possible outlier.

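A minimal sketch of the studentized-residual check using statsmodels; the planted outlier and the threshold of 3 follow the card, everything else is made up.

```python
# Illustrative sketch: flag possible outliers via studentized residuals.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1 + 2 * x + rng.normal(size=100)
y[0] += 8                                    # plant an outlier in y

fit = sm.OLS(y, sm.add_constant(x)).fit()
stud = fit.get_influence().resid_studentized_external
print(np.where(np.abs(stud) > 3)[0])         # indices of suspected outliers
```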
5
Q

Leverage in Linear Regression

A

An observation with an extreme X value has high leverage, meaning it can strongly influence the fitted line; the leverage statistic quantifies this.

6
Q

Will a Correlation Matrix Detect Collinearity?

A

Yes, but not in all cases. It is possible for combinations of three or more variables to be collinear (multicollinearity) even when no single pair is highly correlated. The Variance Inflation Factor (VIF) detects this: a VIF above 5 (or 10, by some rules of thumb) indicates collinearity.

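A minimal sketch using statsmodels' variance_inflation_factor; the three synthetic predictors are constructed so that pairwise correlations stay moderate while the VIFs explode.

```python
# Illustrative sketch: VIF catches collinearity a correlation matrix understates.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.1, size=200)   # collinear combination
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

print(X.drop(columns="const").corr().round(2))   # pairwise corrs only moderate
for i, col in enumerate(X.columns):
    if col != "const":
        print(col, variance_inflation_factor(X.values, i))  # all very large
```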
7
Q

Variance Inflation Factor

A

VIF > 5 (or 10, by some rules of thumb) indicates problematic collinearity. The VIF for a predictor equals 1 / (1 − R²), where R² comes from regressing that predictor on all the others; its minimum value is 1 (no collinearity).

8
Q

Studentized Residual

A

Helps detect outliers in Y: divide each residual by its estimated standard error; an absolute value greater than 3 suggests a possible outlier.

9
Q

Leverage Statistic

A

If an observation's leverage statistic greatly exceeds the average value (p + 1) / n, the corresponding point may have high leverage. Leverage values always lie between 1/n and 1.

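A minimal sketch comparing hat values to the (p + 1)/n average with statsmodels; the extreme X value and the "3x the average" cutoff are illustrative choices, not a fixed rule.

```python
# Illustrative sketch: compare each hat value to the average (p + 1) / n.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.normal(size=100)
x[0] = 10                                    # one extreme X value
y = 1 + 2 * x + rng.normal(size=100)

fit = sm.OLS(y, sm.add_constant(x)).fit()
h = fit.get_influence().hat_matrix_diag      # leverage statistics
avg = (1 + 1) / len(x)                       # (p + 1) / n with p = 1
print(np.where(h > 3 * avg)[0])              # points far above the average
```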
10
Q

Possible Solutions for Collinearity

A

(1) Drop one of the collinear variables from the regression. (2) Combine them into a single predictor, e.g., average the (standardized) variables.

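A toy sketch of both fixes in pandas; the column names (limit, rating) are invented stand-ins for a collinear pair.

```python
# Illustrative sketch of both fixes for a collinear pair of columns.
import pandas as pd

df = pd.DataFrame({"limit": [3000.0, 5000.0, 7000.0, 4000.0],
                   "rating": [310.0, 480.0, 690.0, 400.0]})

dropped = df.drop(columns=["rating"])        # fix 1: drop one of the pair
z = (df - df.mean()) / df.std()              # fix 2: average standardized vars
df["credit"] = (z["limit"] + z["rating"]) / 2
print(df)
```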
11
Q

Bias vs Variance Tradeoff

A

Variance: how much the fit would change with different training data; high variance means a tendency to overfit. Bias: error introduced by the model's simplifying assumptions; high bias means a tendency to underfit. More flexible models trade higher variance for lower bias, so aim to minimize the sum of the two.

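A small sketch of the tradeoff using scikit-learn: an underfit degree-1 polynomial (high bias), a reasonable degree-3 fit, and an overfit degree-15 fit (high variance) on invented sine data.

```python
# Illustrative sketch: test error vs. polynomial degree.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(5)
x = rng.uniform(-3, 3, 200).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(scale=0.3, size=200)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

for degree in (1, 3, 15):   # high bias, balanced, high variance
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_tr, y_tr)
    print(degree, mean_squared_error(y_te, model.predict(x_te)))
```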
12
Q

Parametric vs. Non-parametric Model Accuracy

A

Parametric methods tend to outperform non-parametric ones when n is small relative to p, because of high dimensionality. Think about what high dimensionality does to KNN: with few observations spread over many dimensions, the "nearest" neighbors are not actually near.

13
Q

When order doesn’t matter, permutations or combinations? Formula for each

A
Order doesn't matter = combination.
Combination ("P choose K"): P! / (K! (P − K)!)
Permutation (order matters): P! / (P − K)!
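Python's math module (3.8+) implements both formulas directly:

```python
# math.comb and math.perm match the formulas above.
import math

P, K = 5, 2
print(math.comb(P, K))   # 10 = 5! / (2! * 3!)   -- order doesn't matter
print(math.perm(P, K))   # 20 = 5! / 3!          -- order matters
```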
14
Q

Confusion Matrix

A
Rows (left): Predicted status
Columns (top): Actual or true status
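For reference, a quick sketch with scikit-learn. Note that sklearn.metrics.confusion_matrix puts the actual status on the rows and the predicted status on the columns, i.e., the transpose of the layout on this card.

```python
# scikit-learn convention: rows = ACTUAL, columns = PREDICTED.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)                 # [[TN FP], [FN TP]]
print(tn, fp, fn, tp)     # 3 1 1 3
```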
15
Q

Sensitivity vs Specificity

A

Sensitivity: % of actual positives correctly identified (true positive rate).
Specificity: % of actual negatives correctly identified (true negative rate).

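A minimal sketch computing both from confusion-matrix counts; the tiny label vectors are made up.

```python
# Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP).
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("sensitivity:", tp / (tp + fn))   # 0.75 -- 3 of 4 positives caught
print("specificity:", tn / (tn + fp))   # 0.75 -- 3 of 4 negatives caught
```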
16
Q

ROC Curve

A

Y axis: True positive rate (sensitivity)
X axis: False positive rate (1 − specificity)

You want to catch as many true positives as possible without false positives: catch lots of fish but no dolphins.
The diagonal line represents classification by chance: if you randomly label 20% of the population as guilty, you expect to catch 20% of the true positives by chance, and 20% of the remaining (innocent) population would be flagged as guilty as well.
Maximize the area under the ROC curve (AUC).

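A minimal sketch of an ROC curve with scikit-learn; the synthetic dataset and logistic-regression scorer are illustrative choices.

```python
# Illustrative sketch: ROC curve, chance diagonal, and AUC.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scores = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, _ = roc_curve(y_te, scores)
plt.plot(fpr, tpr, label=f"AUC = {roc_auc_score(y_te, scores):.2f}")
plt.plot([0, 1], [0, 1], "k--", label="chance")  # classify-by-chance reference
plt.xlabel("False positive rate (1 - specificity)")
plt.ylabel("True positive rate (sensitivity)")
plt.legend()
plt.show()
```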
17
Q

Compare the following classification methods:

  1. Logistic Regression
  2. LDA
  3. QDA
  4. KNN
A

LR and LDA both produce linear decision boundaries. LDA additionally assumes the observations in each class are drawn from a normal distribution with a common covariance matrix, so it will outperform LR when that assumption roughly holds; otherwise LR will win.

QDA produces a quadratic decision boundary but is still parametric, so a more flexible boundary is possible while still performing well when n is small. It makes the opposite assumption from LDA: each class has its own covariance matrix, i.e., different correlations between predictors in each class. It is a tradeoff between the linear methods and KNN: "moderately non-linear".

KNN is completely non-parametric and can find extremely non-linear boundaries.

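A minimal sketch fitting all four methods on the same synthetic data with scikit-learn; the moon-shaped dataset is chosen to make the boundary non-linear, so the ranking noted in the comment is only what one would typically expect here.

```python
# Illustrative sketch: all four classifiers on one synthetic dataset.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=0)  # curved boundary
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "LR": LogisticRegression(),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "KNN": KNeighborsClassifier(n_neighbors=10),
}
for name, model in models.items():
    print(name, model.fit(X_tr, y_tr).score(X_te, y_te))
# On this curved boundary, KNN and QDA typically beat the linear methods.
```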
18
Q

Curse of Dimensionality

A

When p is large, there tends to be a deterioration in the performance of KNN and other local approaches that predict using only observations near the test observation: spreading the data over many dimensions means even the nearest observations are far away, so each prediction effectively uses only a tiny fraction of the available data. This is why non-parametric approaches perform poorly when p is large.
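A small sketch that makes the effect concrete: as p grows, the distance to the nearest of 1,000 uniform points approaches the distance to a typical point, so "local" neighborhoods stop being local.

```python
# Illustrative sketch: nearest-neighbor distances grow with dimension p.
import numpy as np

rng = np.random.default_rng(6)
n = 1000
for p in (1, 10, 100):
    X = rng.uniform(size=(n, p))             # n points in the unit hypercube
    test = rng.uniform(size=p)
    d = np.linalg.norm(X - test, axis=1)
    print(f"p={p:3d}  nearest={d.min():.3f}  median={np.median(d):.3f}")
# As p grows, even the nearest point is nearly as far as the typical point.
```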