Regularization Flashcards
What is overfitting?
Occurs when a model performs well on its training data but poorly on new, unseen inputs
What is dropout?
A regularization technique that randomly removes nodes and their connections at every training iteration, so each iteration produces a different set of outputs and the model generalizes better
What problem does regularization solve in deep learning?
Overfitting, the most common problem in deep learning; regularization aims for a model that performs well on new test data, not only on the training data
What are the different approaches to addressing overfitting?
- Dropout
- Augmentation
- Early Stopping
How does the dropout approach work?
At every training iteration, it randomly selects some nodes and removes them along with their incoming and outgoing connections
Key points: each iteration uses a different set of nodes, which results in a different set of outputs
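Example (a minimal sketch, not a reference implementation): the node removal described above can be imitated with a random keep/drop mask over the hidden units. The function name dropout_forward, the drop probability of 0.5, and the 4x8 batch of ones are all hypothetical. Dividing by (1 - drop_prob), often called inverted dropout, is an assumption added here so the expected activation stays the same; the card itself does not mention it.

```python
import numpy as np

def dropout_forward(activations, drop_prob, rng):
    """Zero out a random subset of units (columns) for one training iteration."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    # One keep/drop decision per unit; True means the unit is kept.
    keep_mask = rng.random(activations.shape[1]) >= drop_prob
    # Assumption: inverted-dropout scaling keeps the expected activation unchanged.
    dropped = activations * keep_mask / (1.0 - drop_prob)
    return dropped, keep_mask

rng = np.random.default_rng(0)
hidden = np.ones((4, 8))  # hypothetical batch: 4 examples, 8 hidden units

# Two calls stand in for two training iterations; each draws a fresh mask.
out_a, mask_a = dropout_forward(hidden, 0.5, rng)
out_b, mask_b = dropout_forward(hidden, 0.5, rng)
```

Each call draws its own mask, which is the "different set of nodes per iteration" idea. This sketch shares one mask across the whole batch to match the wording of the card; many frameworks instead draw a separate mask for every example.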
Why do dropout models perform better?
It injects randomness, so the network memorizes less of the training data, which leads to better generalization and a more robust predictive model
How can dataset augmentation help train better models?
More data generally means better models, and you can synthesize additional (fake) data by applying transformations to the existing dataset
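Example (a rough sketch under stated assumptions): the transformations mentioned above could be a random horizontal flip plus a small horizontal shift. The function name augment, the max_shift parameter, and the 4x4 stand-in image are hypothetical, and the sketch assumes these transformations do not change the label, which is not true for every task (mirrored text or digits, for instance).

```python
import numpy as np

def augment(image, rng, max_shift=2):
    """Return a randomly flipped and horizontally shifted copy of a 2-D image."""
    out = image.copy()
    # Flip left-right with probability 0.5.
    if rng.random() < 0.5:
        out = np.fliplr(out)
    # Shift columns by a random offset in [-max_shift, max_shift].
    # np.roll wraps pixels around the edge; padding is another common choice.
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(out, shift, axis=1)

rng = np.random.default_rng(0)
image = np.arange(16, dtype=float).reshape(4, 4)  # hypothetical stand-in image

# Synthesize five extra training examples from one original.
augmented = [augment(image, rng) for _ in range(5)]
```

The original array is left untouched, so the synthesized copies can be added to the dataset alongside it.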
What classification problem does dataset augmentation help the most with?
Object recognition, because images are high dimensional and include an enormous range of factors of variation, many of which can be simulated
Describe the concept of “early stopping”
Stopping training when the error on the validation set starts to increase, which gives a model with lower validation set error and typically lower test set error
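Example (a minimal sketch, assuming a patience rule): the stopping criterion above can be applied to a list of per-epoch validation errors. The function name early_stopping_epoch, the patience parameter, and the error values are made up for illustration. Waiting for a few non-improving epochs before stopping is a common refinement, not something the card specifies.

```python
def early_stopping_epoch(val_errors, patience=2):
    """Return (best_epoch, best_error) seen before training would stop."""
    best_epoch = 0
    best_error = float("inf")
    epochs_without_improvement = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            # New best validation error: remember it and reset the counter.
            best_error = error
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            # Stop once validation error has failed to improve `patience` times in a row.
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_error

# Hypothetical validation errors, one per epoch.
val_errors = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.40]
best_epoch, best_error = early_stopping_epoch(val_errors, patience=2)
print(best_epoch, best_error)  # prints "3 0.5"
```

Training stops after epoch 5, so the later value of 0.40 is never reached. That is the trade-off of the rule: it saves training time and guards against overfitting, but it can miss a later improvement. In a real training loop the weights from the best epoch would be saved and restored.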