4 Flashcards
Which Linear Regression training methods can you use for training sets with millions of features
Stochastic GD or Mini-batch GD (and Batch GD if the set fits in memory); the Normal Equation and SVD approaches scale badly with the number of features
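A minimal sketch of both options using scikit-learn's SGDRegressor; the synthetic data, batch size, and hyperparameters are illustrative assumptions, not part of the original card:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 500))            # stand-in for a very wide feature matrix
y = X @ rng.normal(size=500) + rng.normal(scale=0.1, size=10_000)

# Stochastic GD: fit() updates the weights one instance at a time.
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3, random_state=42)
sgd_reg.fit(X, y)

# Mini-batch GD: feed the data in chunks with partial_fit(), which also works
# when the full training set does not fit in memory.
mb_reg = SGDRegressor(random_state=42)
batch_size = 256
for epoch in range(5):
    for start in range(0, len(X), batch_size):
        mb_reg.partial_fit(X[start:start + batch_size], y[start:start + batch_size])
```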
Which training algorithms suffer when the features have very different scales
The Gradient Descent algorithms will be slow to converge (the cost function becomes an elongated bowl); scale the features first, for example with StandardScaler
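A minimal sketch, assuming scikit-learn, of standardizing features before SGD so the cost bowl is roughly spherical; the two artificial feature scales are illustrative:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = np.c_[rng.uniform(0, 1, 1000), rng.uniform(0, 10_000, 1000)]  # very different scales
y = 3 * X[:, 0] + 0.002 * X[:, 1] + rng.normal(size=1000)

# Scaling first lets Gradient Descent head straight for the minimum instead of
# zig-zagging down an elongated valley.
scaled_sgd = make_pipeline(StandardScaler(), SGDRegressor(random_state=42))
scaled_sgd.fit(X, y)
```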
Can GD get stuck in local minima
Yes, when the cost function is not convex - Stochastic GD may help, as its random steps can bounce it out (Linear Regression's MSE cost is convex, so there are no local minima there)
Do all GD algorithms end up in the same place
Roughly, yes - if the problem is convex and the learning rate is not too high they all approach the global optimum; however Stochastic and Mini-batch GD never settle exactly on it unless the learning rate is gradually reduced
When using gradient descent - if the validation error starts to rise what's going on and how can it be fixed
If the training error is still falling, the model is overfitting and training should be stopped (early stopping); if the training error is rising too, the learning rate is likely too high and the algorithm is diverging
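A sketch of early stopping, assuming scikit-learn: train one epoch per fit() call (warm_start=True keeps the weights between calls) and keep a copy of the model with the lowest validation error; the data and hyperparameters are made up for illustration:

```python
from copy import deepcopy

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))
y = X @ rng.normal(size=20) + rng.normal(scale=0.5, size=1000)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

# max_iter=1 and tol=None make each fit() call run exactly one epoch
# (expect a harmless convergence warning).
sgd_reg = SGDRegressor(max_iter=1, tol=None, warm_start=True,
                       learning_rate="constant", eta0=0.001, random_state=42)

best_val_error, best_model = float("inf"), None
for epoch in range(500):
    sgd_reg.fit(X_train, y_train)              # continues from the previous weights
    val_error = mean_squared_error(y_val, sgd_reg.predict(X_val))
    if val_error < best_val_error:
        best_val_error, best_model = val_error, deepcopy(sgd_reg)
# best_model holds the weights from the epoch with the lowest validation error.
```

Keeping the best copy rather than halting at the first uptick also answers the next card: with noisy stochastic or mini-batch updates you roll back instead of stopping immediately.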
Should mini-batch GD be stopped when the validation error goes up
No - because of the noise in mini-batch updates it may just be a random bounce and the error could continue to fall; save the model at regular intervals and roll back to the best one if it stops improving for a while
Which GD algorithm will reach a good solution the fastest
Stochastic GD gets to the vicinity of the optimum fastest, however it will never converge perfectly unless the learning rate is gradually reduced; only Batch GD truly converges
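One way to make Stochastic GD settle, sketched with scikit-learn's built-in 'invscaling' learning schedule; the eta0/power_t values and data are illustrative:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, 2.0, 3.0, 4.0, 5.0]) + rng.normal(scale=0.1, size=1000)

# eta = eta0 / t**power_t: each step gets smaller, so the walk converges
# instead of bouncing around the minimum forever.
sgd_reg = SGDRegressor(learning_rate="invscaling", eta0=0.1, power_t=0.25,
                       max_iter=1000, tol=1e-3, random_state=42)
sgd_reg.fit(X, y)
```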
If there is a large gap between training and validation errors for polynomial regression what's happening and how can it be solved
The model is overfitting
It can be fixed by reducing the polynomial degree, regularizing the model, or getting more training data
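A sketch of two of those fixes side by side, assuming scikit-learn; the degrees, alpha, and noisy quadratic data are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(30, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(scale=1.0, size=30)

models = {
    "degree 15, no regularization": make_pipeline(
        PolynomialFeatures(degree=15), StandardScaler(), LinearRegression()),
    "degree 2, no regularization": make_pipeline(
        PolynomialFeatures(degree=2), StandardScaler(), LinearRegression()),
    "degree 15 + Ridge(alpha=1.0)": make_pipeline(
        PolynomialFeatures(degree=15), StandardScaler(), Ridge(alpha=1.0)),
}
for name, model in models.items():
    mse = -cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=5).mean()
    print(f"{name}: cross-validated MSE {mse:.2f}")
# Lowering the degree or adding regularization both shrink the gap between
# training and validation error.
```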
If training and validation error are equal for Ridge Regression and both are high, does the model suffer from high bias or high variance - should you increase the regularization hyperparameter or reduce it
It is underfitting, meaning high bias; try reducing the regularization hyperparameter alpha
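A sketch, assuming scikit-learn, that sweeps the Ridge alpha to show the underfitting pattern: at large alpha training and validation errors are both high and close together, and reducing alpha lowers them; the alphas and data are illustrative:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import validation_curve

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)

alphas = [100.0, 10.0, 1.0, 0.1]
train_scores, val_scores = validation_curve(
    Ridge(), X, y, param_name="alpha", param_range=alphas,
    scoring="neg_mean_squared_error", cv=5)

for alpha, tr, va in zip(alphas, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    print(f"alpha={alpha:>6}: train MSE {tr:.2f}  validation MSE {va:.2f}")
# High, similar errors at large alpha = high bias; decreasing alpha fixes it.
```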
Why would you want to use:
Ridge Regression instead of plain Linear Regression - a regularized model generally performs better (it is less prone to overfitting)
Lasso instead of Ridge - Lasso automatically performs feature selection by driving the least important weights to zero
Elastic Net instead of Lasso - Lasso may behave erratically when features are strongly correlated or there are more features than instances; Elastic Net with an l1_ratio close to one avoids this while keeping the sparsity
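A sketch comparing the three regularized models with scikit-learn; the alpha, l1_ratio, and sparse synthetic weights are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
true_w = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0, 0, 0])   # only two useful features
y = X @ true_w + rng.normal(scale=0.5, size=200)

for model in (Ridge(alpha=1.0), Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.9)):
    model.fit(X, y)
    print(f"{type(model).__name__}: {np.sum(model.coef_ == 0)} weights exactly zero")
# Ridge only shrinks weights; Lasso and Elastic Net can zero them out, which is
# the built-in feature selection, and an l1_ratio near one keeps that sparsity
# while avoiding Lasso's erratic behaviour on correlated features.
```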
To classify images as indoor or outdoor and day or night should you implement two Logistic Regression classifiers or one Softmax Regression classifier
These are not mutually exclusive classes (all four combinations can occur), so two Logistic Regression classifiers are suitable
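A sketch of the two-classifier setup, assuming scikit-learn; the synthetic "image features" and labels are purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 16))                          # stand-in for image features
y_outdoor = (X[:, 0] + rng.normal(scale=0.3, size=500) > 0).astype(int)
y_daytime = (X[:, 1] + rng.normal(scale=0.3, size=500) > 0).astype(int)

# One binary classifier per independent label.
outdoor_clf = LogisticRegression().fit(X, y_outdoor)    # indoor vs outdoor
daytime_clf = LogisticRegression().fit(X, y_daytime)    # night vs day

new_image = X[:1]
print("outdoor?", bool(outdoor_clf.predict(new_image)[0]),
      "| daytime?", bool(daytime_clf.predict(new_image)[0]))
# Softmax Regression is meant for mutually exclusive classes, so two
# independent binary classifiers fit this task better.
```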