4 Flashcards
Which Linear Regression training methods can you use for training sets with millions of features
Stochastic GD or Mini-batch GD (and Batch GD if the set fits in memory); the Normal Equation and SVD approaches scale badly with the number of features
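A minimal sketch of both options using scikit-learn's SGDRegressor; the synthetic data, batch size, and hyperparameters are illustrative assumptions, not part of the original card:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 500))            # stand-in for a very wide feature matrix
y = X @ rng.normal(size=500) + rng.normal(scale=0.1, size=10_000)

# Stochastic GD: fit() updates the weights one instance at a time.
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, random_state=42)
sgd_reg.fit(X, y)

# Mini-batch GD: feed the data in chunks with partial_fit(), which also works
# when the full training set does not fit in memory.
mb_reg = SGDRegressor(random_state=42)
batch_size = 256
for epoch in range(5):
    for start in range(0, len(X), batch_size):
        mb_reg.partial_fit(X[start:start + batch_size], y[start:start + batch_size])
```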
Which training algorithms suffer when the features have very different scales
The Gradient Descent algorithms will be slow to converge (the cost function becomes an elongated bowl); scale the features first, for example with StandardScaler
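A minimal sketch, assuming scikit-learn, of standardizing features before SGD so the cost bowl is roughly spherical; the two artificial feature scales are illustrative:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = np.c_[rng.uniform(0, 1, 1000), rng.uniform(0, 10_000, 1000)]  # very different scales
y = 3 * X[:, 0] + 0.002 * X[:, 1] + rng.normal(size=1000)

# Scaling first lets Gradient Descent head straight for the minimum instead of
# zig-zagging down an elongated valley.
scaled_sgd = make_pipeline(StandardScaler(), SGDRegressor(random_state=42))
scaled_sgd.fit(X, y)
```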
Can GD get stuck in local minima
Yes, when the cost function is not convex - Stochastic GD may help, as its random steps can bounce it out (Linear Regression's MSE cost is convex, so there are no local minima there)
Do all GD algorithms end up in the same place
Roughly, yes - if the problem is convex and the learning rate is not too high they all approach the global optimum; however Stochastic and Mini-batch GD never settle exactly on it unless the learning rate is gradually reduced
When using gradient descent - if the validation error starts to rise what's going on and how can it be fixed
If the training error is still falling, the model is overfitting and training should be stopped (early stopping); if the training error is rising too, the learning rate is likely too high and the algorithm is diverging
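A sketch of early stopping, assuming scikit-learn: train one epoch per fit() call (warm_start=True keeps the weights between calls) and keep a copy of the model with the lowest validation error; the data and hyperparameters are made up for illustration:

```python
from copy import deepcopy

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))
y = X @ rng.normal(size=20) + rng.normal(scale=0.5, size=1000)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

# max_iter=1 and tol=None make each fit() call run exactly one epoch
# (expect a harmless convergence warning).
sgd_reg = SGDRegressor(max_iter=1, tol=None, warm_start=True,
                       learning_rate="constant", eta0=0.001, random_state=42)

best_val_error, best_model = float("inf"), None
for epoch in range(500):
    sgd_reg.fit(X_train, y_train)              # continues from the previous weights
    val_error = mean_squared_error(y_val, sgd_reg.predict(X_val))
    if val_error < best_val_error:
        best_val_error, best_model = val_error, deepcopy(sgd_reg)
# best_model holds the weights from the epoch with the lowest validation error.
```

Keeping the best copy rather than halting at the first uptick also answers the next card: with noisy stochastic or mini-batch updates you roll back instead of stopping immediately.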
Should mini-batch GD be stopped when the validation error goes up
No - because of the noise in mini-batch updates it may just be a random bounce and the error could continue to fall; save the model at regular intervals and roll back to the best one if it stops improving for a while
Which GD algorithm will reach a good solution the fastest
Stochastic GD gets to the vicinity of the optimum fastest, however it will never converge perfectly unless the learning rate is gradually reduced; only Batch GD truly converges
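One way to make Stochastic GD settle, sketched with scikit-learn's built-in 'invscaling' learning schedule; the eta0/power_t values and data are illustrative:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, 2.0, 3.0, 4.0, 5.0]) + rng.normal(scale=0.1, size=1000)

# eta = eta0 / t**power_t: each step gets smaller, so the walk converges
# instead of bouncing around the minimum forever.
sgd_reg = SGDRegressor(learning_rate="invscaling", eta0=0.1, power_t=0.25,
                       max_iter=1000, tol=1e-3, random_state=42)
sgd_reg.fit(X, y)
```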
If there is a large gap between training and validation errors for polynomial regression what's happening and how can it be solved
The model is overfitting
It can be fixed by reducing the polynomial degree, regularizing the model, or getting more training data
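A sketch of two of those fixes side by side, assuming scikit-learn; the degrees, alpha, and noisy quadratic data are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(30, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(scale=1.0, size=30)

models = {
    "degree 15, no regularization": make_pipeline(
        PolynomialFeatures(degree=15), StandardScaler(), LinearRegression()),
    "degree 2, no regularization": make_pipeline(
        PolynomialFeatures(degree=2), StandardScaler(), LinearRegression()),
    "degree 15 + Ridge(alpha=1.0)": make_pipeline(
        PolynomialFeatures(degree=15), StandardScaler(), Ridge(alpha=1.0)),
}
for name, model in models.items():
    mse = -cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=5).mean()
    print(f"{name}: cross-validated MSE {mse:.2f}")
# Lowering the degree or adding regularization both shrink the gap between
# training and validation error.
```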
If training and validation error are equal for Ridge Regression and both are high, does the model suffer from high bias or high variance - should you increase the regularization hyperparameter or reduce it
It is underfitting, meaning high bias; try reducing the regularization hyperparameter alpha
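A sketch, assuming scikit-learn, that sweeps the Ridge alpha to show the underfitting pattern: at large alpha training and validation errors are both high and close together, and reducing alpha lowers them; the alphas and data are illustrative:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import validation_curve

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)

alphas = [100.0, 10.0, 1.0, 0.1]
train_scores, val_scores = validation_curve(
    Ridge(), X, y, param_name="alpha", param_range=alphas,
    scoring="neg_mean_squared_error", cv=5)

for alpha, tr, va in zip(alphas, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    print(f"alpha={alpha:>6}: train MSE {tr:.2f}  validation MSE {va:.2f}")
# High, similar errors at large alpha = high bias; decreasing alpha fixes it.
```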
Why would you want to use:
Ridge Regression instead of plain Linear Regression - a regularized model generally performs better (it is less prone to overfitting)
Lasso instead of Ridge - Lasso automatically performs feature selection by driving the least important weights to zero
Elastic Net instead of Lasso - Lasso may behave erratically when features are strongly correlated or there are more features than instances; Elastic Net with an l1_ratio close to one avoids this while keeping the sparsity
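A sketch comparing the three regularized models with scikit-learn; the alpha, l1_ratio, and sparse synthetic weights are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
true_w = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0, 0, 0])   # only two useful features
y = X @ true_w + rng.normal(scale=0.5, size=200)

for model in (Ridge(alpha=1.0), Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.9)):
    model.fit(X, y)
    print(f"{type(model).__name__}: {np.sum(model.coef_ == 0)} weights exactly zero")
# Ridge only shrinks weights; Lasso and Elastic Net can zero them out, which is
# the built-in feature selection, and an l1_ratio near one keeps that sparsity
# while avoiding Lasso's erratic behaviour on correlated features.
```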
To classify images as indoor or outdoor and day or night should you implement two Logistic Regression classifiers or one Softmax Regression classifier
These are not mutually exclusive classes (all four combinations can occur), so two Logistic Regression classifiers are suitable
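A sketch of the two-classifier setup, assuming scikit-learn; the synthetic "image features" and labels are purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 16))                          # stand-in for image features
y_outdoor = (X[:, 0] + rng.normal(scale=0.3, size=500) > 0).astype(int)
y_daytime = (X[:, 1] + rng.normal(scale=0.3, size=500) > 0).astype(int)

# One binary classifier per independent label.
outdoor_clf = LogisticRegression().fit(X, y_outdoor)    # indoor vs outdoor
daytime_clf = LogisticRegression().fit(X, y_daytime)    # night vs day

new_image = X[:1]
print("outdoor?", bool(outdoor_clf.predict(new_image)[0]),
      "| daytime?", bool(daytime_clf.predict(new_image)[0]))
# Softmax Regression is meant for mutually exclusive classes, so two
# independent binary classifiers fit this task better.
```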