Lecture 4 Flashcards

1
Q

Methods that help improve the performance of our model are divided into three main types. What are they?

A

1- Subset selection: select the attributes (predictors) that we want to keep in the model
2- Shrinkage: shrink the predictors' coefficient estimates toward zero in order to reduce variance
3- Dimension reduction: transform the data into a smaller set of derived predictors
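
To make the second type (shrinkage) concrete, here is a minimal sketch using ridge regression with one centered predictor and no intercept, where the estimate has the closed form sum(x*y) / (sum(x^2) + lambda). Ridge is chosen here as an assumption, since the card does not name a specific shrinkage method; the data and the lambda values are made up for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate for y ~ beta * x (no intercept): sum(xy) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Made-up, already centered toy data with a slope close to 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.0, 2.1, 3.9]

# lam = 0 is ordinary least squares; larger lam pulls the slope toward zero.
for lam in [0.0, 1.0, 10.0, 100.0]:
    print(lam, round(ridge_slope(xs, ys, lam), 3))
```

The slope shrinks as lambda grows, which lowers variance at the cost of some bias.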

2
Q

Subset selection

A

Main goal: find a subset of predictors that reduces RSS (residual sum of squares)

We will always have a trade-off between bias and variance

3
Q

Validation set approach

A

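Randomly divide the available observations into a training set and a validation (hold-out) set; fit the model on the training set and estimate the test error on the validation set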
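
A minimal sketch of the validation set approach, assuming a simple linear regression on made-up data. The helper names (`fit_line`, `mse`, `validation_set_mse`) and the 50/50 split are illustrative choices, not part of the lecture.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    b0, b1 = model
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def validation_set_mse(xs, ys, seed, train_fraction=0.5):
    """Randomly split the data, fit on the training part, score on the rest."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, valid = idx[:cut], idx[cut:]
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return mse(model, [xs[i] for i in valid], [ys[i] for i in valid])

# Made-up toy data: y = 2 + 3x + noise.
data_rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0, 1) for x in xs]

# Different random splits give different error estimates.
estimates = [validation_set_mse(xs, ys, seed) for seed in range(5)]
print(estimates)
```

Running it with several seeds shows the main weakness of this approach: the estimate depends on which observations happen to land in each half.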

4
Q

LOOCV: Leave-one-out cross-validation

A

Leaves a single observation out for validation and uses the remaining n-1 observations as the training set; this is repeated for each of the n observations
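
A minimal sketch of LOOCV, assuming a simple linear regression on made-up data. The function names are illustrative, not from the lecture.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def loocv_mse(xs, ys):
    """Hold out each observation once, fit on the other n-1, average the squared errors."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        b0, b1 = fit_line(train_x, train_y)
        total += (ys[i] - (b0 + b1 * xs[i])) ** 2
    return total / n

# Made-up toy data: y = 1 + 0.5x + noise.
rng = random.Random(1)
xs = [i / 5 for i in range(30)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 0.3) for x in xs]

print(loocv_mse(xs, ys))
```

There is no random split here, so repeated runs on the same data always return the same estimate, but the model is refit n times.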

5
Q

LOOCV has less bias than the validation set approach because:

A

1- Each model is fit on n-1 observations, whereas the validation set approach trains on only about half of the observations
2- The validation set approach can yield different results because of the randomness in the split; LOOCV always gives the same result
3- LOOCV tends not to overestimate the test error as much as the validation set approach does
4- However, it may become computationally expensive as n increases

6
Q

K-fold cross-validation:

Instead of holding out a single observation for validation, we hold out a fold (a group of observations)

A

TRUE
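
A minimal sketch of k-fold cross-validation, assuming a simple linear regression on made-up data. The function names, the shuffling seed, and k = 5 are illustrative choices.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def k_fold_mse(xs, ys, k, seed=0):
    """Average validation MSE over k folds, each held out exactly once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]  # k folds of near-equal size
    fold_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        err = sum((ys[i] - (b0 + b1 * xs[i])) ** 2 for i in fold) / len(fold)
        fold_errors.append(err)
    return sum(fold_errors) / k

# Made-up toy data: y = 1 + 0.5x + noise.
rng = random.Random(2)
xs = [i / 5 for i in range(50)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 0.3) for x in xs]

print(k_fold_mse(xs, ys, k=5))
```

With k = 5 the model is fit only five times, which is much cheaper than LOOCV when n is large.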

7
Q

LOOCV is a special case of k-fold cross-validation where k = n

A

TRUE
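
A small sketch of why this holds, looking only at how the folds are built. `make_folds` is a hypothetical helper written for this card, and it does no shuffling.

```python
def make_folds(n, k):
    """Split indices 0..n-1 into k folds of near-equal size (no shuffling)."""
    indices = list(range(n))
    return [indices[f::k] for f in range(k)]

n = 6
print(make_folds(n, 3))  # [[0, 3], [1, 4], [2, 5]]
print(make_folds(n, n))  # [[0], [1], [2], [3], [4], [5]]
```

When k = n every fold holds exactly one observation, so each model is trained on the other n-1, which is exactly the LOOCV procedure.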

8
Q

Bootstrap

A

Represents a way to resample your data over and over, with replacement
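
A minimal sketch of the bootstrap, used here to estimate the standard error of a sample mean. The data, the number of resamples, and the function name `bootstrap_se` are made up for illustration.

```python
import random
import statistics

def bootstrap_se(sample, statistic, n_resamples=1000, seed=0):
    """Standard deviation of a statistic across resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations from the sample itself, with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)

# Made-up sample of 10 observations.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]

boot_se = bootstrap_se(sample, statistics.mean)
formula_se = statistics.stdev(sample) / len(sample) ** 0.5  # textbook SE of the mean
print(boot_se, formula_se)
```

For the mean a formula already exists, so the two numbers can be compared; the bootstrap is most useful for statistics that have no simple standard-error formula.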

9
Q

In the bootstrap, it is as if we are treating the sample as the population and sampling from it over and over

A

TRUE

10
Q

Comments on the linear model

A

1- We are assuming that the relationship between the response and the predictors is linear

11
Q

Many of the variables used in a multiple regression model may not be associated with the response; this will add unnecessary complexity

A

True

12
Q

Our target: we want the minimum number of predictors that gives the maximum explanation of the response

A

TRUE

13
Q

I want to do feature selection in order to have maximum interpretability of the model

A

TRUE, thus we do subset selection

14
Q

Subset selection

A

Keep only a subset of the variables in the model
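
A minimal sketch of one way to do this, best subset selection: for each model size, try every combination of predictors and keep the one with the lowest RSS. Best subset is an assumption here, since the card does not name a specific procedure, and the data and helper names (`solve`, `rss`) are made up for illustration.

```python
import random
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(columns, ys):
    """RSS of the least-squares fit of ys on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(ys))]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * y for row, y in zip(design, ys)) for j in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    return sum((y - sum(b * v for b, v in zip(beta, row))) ** 2 for row, y in zip(design, ys))

# Made-up data: y depends on x1 and x2 only; x3 is pure noise.
rng = random.Random(0)
n = 60
predictors = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3")}
ys = [3.0 * a - 1.5 * b + rng.gauss(0, 0.5)
      for a, b in zip(predictors["x1"], predictors["x2"])]

# For each model size, keep the combination of predictors with the lowest RSS.
best_by_size = {}
for size in range(1, len(predictors) + 1):
    best_by_size[size] = min(
        combinations(predictors, size),
        key=lambda names: rss([predictors[name] for name in names], ys),
    )
    print(size, best_by_size[size])
```

Training RSS never increases when a predictor is added, so RSS alone would always favor the largest model; choosing between sizes is usually done with a validation or cross-validation error estimate instead.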

15
Q

Methods that allow us to improve the performance of our model:
There are 3 main types:

A

1- Subset Selection
2- Shrinkage
3- Dimension Reduction

16
Q

Subset Selection: Main goal is to reduce RSS

A

True