Lecture 4 Flashcards

Question 1

Q

The methods that help improve the performance of our model are divided into three main types:

Answer

A

1- Subset selection (select the attributes that we want to keep in the model)
2- Shrinkage (shrink the coefficient estimates of the predictors in a way that reduces variance)
3- Dimension reduction (reduce the dimension of our data through some kind of transformation)

Question 2

Q

Subset selection

Answer

A

The main goal is to reduce the RSS (residual sum of squares).

We will always have competing aspects between bias and variance (the bias-variance trade-off).
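
For reference, the RSS named above can be written as follows. The symbols are assumptions that do not appear in the cards: y_i is the observed response for observation i, \hat{y}_i is the model's fitted value, and n is the number of observations.

```latex
\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```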

Question 3

Q

Validation set approach

Answer

A

Randomly divide the data set into a training set and a validation (test) set.
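
A minimal sketch of such a random split, using only the Python standard library. The function name `validation_split`, the 50/50 default, and the toy data are assumptions made for illustration, not part of the lecture.

```python
import random

def validation_split(data, train_fraction=0.5, seed=0):
    """Randomly split data into a training part and a validation part."""
    rng = random.Random(seed)      # seeded so the split is reproducible
    shuffled = list(data)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

observations = list(range(10))     # toy data: ten observation ids
train, validation = validation_split(observations)
print(len(train), len(validation))
```

Changing `seed` changes which observations land in each part, which is why this approach can give different error estimates from run to run.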

Question 4

Q

LOOCV: Leave-one-out cross-validation

Answer

A

Leaves a single observation out for validation and uses the remaining n-1 observations as the training set
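
The procedure can be sketched as below for a one-predictor linear model. This is an illustration under assumptions: the helper names `fit_line` and `loocv_mse` and the toy data are hypothetical, and the least-squares fit is written out by hand to keep the example self-contained.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x for a single predictor."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1

def loocv_mse(xs, ys):
    """Average squared prediction error over the n leave-one-out fits."""
    n = len(xs)
    errors = []
    for i in range(n):
        # Hold out observation i and train on the remaining n-1
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        b0, b1 = fit_line(train_x, train_y)
        prediction = b0 + b1 * xs[i]
        errors.append((ys[i] - prediction) ** 2)
    return sum(errors) / n

xs = [1.0, 2.0, 3.0, 4.0, 5.0]     # toy predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]    # toy responses
print(loocv_mse(xs, ys))
```

The loop refits the model n times, which is where the computational cost comes from as n grows.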

Question 5

Q

LOOCV has less bias because:

Answer

A

1- It is fit with n-1 observations (training set), whereas the validation set approach uses only about half of the observations as the training set
2- The validation set approach can yield different results due to randomness in the split
3- LOOCV tends not to overestimate the test error as much as the validation set approach
4- However, it may become computationally expensive as n increases

Question 6

Q

K-fold cross-validation:

Instead of holding out a single observation for validation, we hold out a fold (a group of observations) for validation
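
A minimal sketch of how the folds might be formed, using only the standard library. The function name `k_fold_indices` and the choice of n = 10, k = 5 are assumptions for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    # Fold j takes every k-th shuffled index starting at position j
    return [indices[j::k] for j in range(k)]

folds = k_fold_indices(n=10, k=5)
for j, validation_idx in enumerate(folds):
    # Fold j is held out; the other k-1 folds form the training set
    train_idx = [i for f in folds if f is not validation_idx for i in f]
    print(j, sorted(validation_idx), len(train_idx))
```

Calling `k_fold_indices(n, n)` puts exactly one observation in each fold, which is the leave-one-out case.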

Question 7

Q

LOOCV is a special case of k-fold cross-validation where k = n

Question 8

Q

Bootstrap

Answer

A

Represents a way to resample your data over and over, with replacement
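
A minimal sketch of resampling with replacement, here used to gauge the variability of a sample mean. The function name `bootstrap_means`, the number of resamples, and the toy sample are assumptions for illustration.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample of the given sample."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n observations from the sample WITH replacement
        resample = [rng.choice(sample) for _ in range(n)]
        means.append(statistics.mean(resample))
    return means

sample = [2.3, 4.1, 3.8, 5.0, 4.4, 3.1, 4.9, 3.6]   # toy data
means = bootstrap_means(sample)
# The spread of the resampled means estimates the standard error of the mean
print(statistics.stdev(means))
```

Because each draw is made with replacement, a resample can contain the same observation more than once and omit others.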

Question 9

Q

In the bootstrap, it is as if we treat the sample as the population and sample from it over and over

Question 10

Q

Comments on the linear model

Answer

A

1- We are assuming that the relationship between the response and the predictors is linear

Question 11

Q

Many of the variables used in a multiple regression model may not be associated with the response; this will add unnecessary complexity

Question 12

Q

Our target: we want the minimum number of predictors that gives the maximum explanation of the response

Question 13

Q

I want to do feature selection in order to have maximum interpretability of the model

Answer

A

True; this is why we do subset selection

Question 14

Q

Subset selection

Answer

A

Keep only a subset of the variables in the model
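
To make "a subset of the variables" concrete, the sketch below enumerates every candidate subset of a small predictor list. It only lists the candidates; a selection procedure would still need to fit and compare a model for each one. The function name `all_subsets` and the predictor names are hypothetical.

```python
from itertools import combinations

def all_subsets(predictors):
    """List every subset of the predictors, from empty to full."""
    subsets = []
    for size in range(len(predictors) + 1):
        subsets.extend(combinations(predictors, size))
    return subsets

predictors = ["x1", "x2", "x3"]    # hypothetical predictor names
candidates = all_subsets(predictors)
print(len(candidates))             # 2 ** 3 = 8 candidate models
```

With p predictors there are 2^p candidate subsets, so exhaustive enumeration is only practical when p is small.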

Question 15

Q

Methods that allow us to improve the performance of our model:
There are three main types:

Answer

A

1- Subset Selection
2- Shrinkage
3- Dimension Reduction

Question 16

Q

Subset Selection: Main goal is to reduce RSS

Answer

A

True

(16 cards)