Exam Flashcards
What is a classification problem
A problem that requires machine learning algorithms that learn how to assign a class label to examples from the problem domain
What is a regression problem
A problem that learns to predict continuous variables
What algorithms are used for Regression Problems?
- Linear Regression
- Support Vector Regression
- Regression Tree
Give an example of a Classification problem
Getting a machine to classify different images such as the difference between apple[1,0,0], banana[0,1,0] and cherry[0,0,1]
What is Underfitting?
When a model cannot capture underlying trend of the data
Why does Underfitting occur?
Algorithm does not fit/ Not enough data
What happens with the Bias& Variance in Underfitting
High bias and low variance
What is Bias?
Assumptions made by a model to make a function easier to learn
What is Variance?
Training data obtains a low error, and then changing training data obtains a high error
How to prevent Underfitting
Increase model complexity
Increase number of features (feature engineering)
remove noise
Increase epochs
What is overfitting?
Trained with a lot of data, the model starts to learn from the noise and inaccurate data entries. The model has too much freedom and builds an unrealistic model
What is overfitting in terms of variance and bias
High variance and low bias
How to reduce overfitting
Increase training data
reduce model complexity
early stopping
L1&L2 regularization
Dropouts if neural network
What is regularisation
the technique of calibrating machine learning models to minimize the loss and prevent over or underfitting
What noise mean?
The data points in a dataset that don’t really represent the true properties of your data