Final Exam Prep Flashcards
What is density estimation?
Estimating the probability density function that generated some observed data, for example by fitting a Gaussian, a histogram or a kernel density estimate.
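As a minimal sketch of one simple form of density estimation, the snippet below fits a Gaussian to data using the sample mean and standard deviation. The function name `fit_gaussian` and the `heights` values are made up for illustration, and assuming the data are Gaussian is itself a modelling choice.

```python
import math
import statistics

def fit_gaussian(samples):
    """Estimate a Gaussian density from data via the sample mean and standard deviation."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation (n - 1 denominator)

    def pdf(x):
        # Gaussian probability density with the estimated parameters
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

    return mu, sigma, pdf

# Hypothetical data, made up for illustration
heights = [160.0, 165.0, 170.0, 175.0, 180.0]
mu, sigma, pdf = fit_gaussian(heights)
```

The returned `pdf` can then be evaluated at any point; it is highest at the estimated mean and falls off on either side.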
What is Occam’s razor?
When several explanations (models) fit the data equally well, prefer the simplest one.
Overfitting will give a low training set error but a high test set error. True or false?
True.
Underfitting will give a high training set error and a high test set error. True or false?
True.
Does an underfitted model have high or low bias?
High bias.
What is generalization?
Generalization is the ability of a trained model to make accurate predictions on new, unseen data.
What is overfitting?
Overfitting is when the model fits the training data too closely, capturing noise as well as the underlying pattern, so it doesn’t generalize to new data.
Avoid overfitting with techniques such as cross-validation, early stopping (further training may hinder generalisation) and pruning.
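Cross-validation relies on repeatedly holding out part of the data for validation. The sketch below shows one way to generate the index splits for k-fold cross-validation. The helper name `k_fold_indices` is hypothetical, and it does not shuffle the data, which a real setup would usually do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

# Example: 10 samples split into 3 folds
folds = list(k_fold_indices(10, 3))
```

Each sample appears in exactly one validation fold, so averaging the validation error across folds gives an estimate of how well the model generalises.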
What is underfitting?
Underfitting is when the model is too simple to fit the data, so it doesn’t capture the underlying pattern.
What is regularisation?
- A technique used to reduce overfitting by penalising model complexity.
- Adds a penalty term to the cost function (for example, on the size of the weights), so the model is discouraged from becoming more complex just to fit noise or outliers.
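One common form of this penalty is L2 (ridge) regularisation. The sketch below adds it to a mean squared error cost for a linear model with no intercept. The name `ridge_cost` and the strength parameter `lam` are assumptions made for illustration; other penalties such as L1 follow the same pattern.

```python
def ridge_cost(weights, features, targets, lam):
    """Mean squared error plus an L2 penalty on the weights."""
    n = len(targets)
    mse = 0.0
    for x, y in zip(features, targets):
        prediction = sum(w * xi for w, xi in zip(weights, x))
        mse += (prediction - y) ** 2
    mse /= n
    # Larger weights cost more; lam controls how strongly they are penalised
    penalty = lam * sum(w * w for w in weights)
    return mse + penalty
```

With `lam = 0` this reduces to the plain mean squared error; increasing `lam` pushes the optimiser towards smaller weights and so a simpler model.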
What is linear regression?
A line fitted to data points, placed so as to minimise the sum of squared errors (SSE). Used for problems with a continuous dependent variable and one or more independent variables.
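For a single independent variable, the line that minimises the SSE has a closed-form solution. The sketch below computes it; the function names `fit_line` and `sse` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared errors between the line's predictions and the targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical points lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

Because these points lie exactly on a line, the fitted slope and intercept recover it and the SSE is zero; with noisy data the SSE would be positive but as small as any line can make it.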
What is logistic regression?
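A regression model used for classification. It passes a weighted sum of the inputs through the logistic (sigmoid) function, which squashes it into a probability between 0 and 1. A threshold then turns that probability into a class label such as 0 or 1.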
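The prediction step can be sketched as follows, assuming the weights have already been learned. The names `predict_proba` and `predict` and the default threshold of 0.5 are illustrative choices, not a fixed API.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Probability of class 1: sigmoid of the weighted sum of the inputs."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def predict(weights, bias, x, threshold=0.5):
    """Class label obtained by thresholding the predicted probability."""
    return 1 if predict_proba(weights, bias, x) >= threshold else 0
```

A weighted sum of zero maps to a probability of exactly 0.5, which is why 0.5 is the usual default threshold for a two-class problem.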
What is optimisation in machine learning?
Finding the parameters that minimise a cost function. A popular method is gradient descent, which repeatedly steps downhill along the gradient and converges to a local minimum.
What is the learning rate?
The step size in gradient descent: how much the parameters change on each update.
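The effect of the learning rate can be seen in a minimal sketch of gradient descent on the one-dimensional cost (x - 3)^2, whose minimum is at x = 3. The cost function, starting point and learning rates here are chosen purely for illustration.

```python
def gradient_descent(grad, start, learning_rate, steps):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

def grad(x):
    # Derivative of the cost (x - 3)^2
    return 2.0 * (x - 3.0)

small = gradient_descent(grad, start=0.0, learning_rate=0.1, steps=100)  # converges towards 3
large = gradient_descent(grad, start=0.0, learning_rate=1.1, steps=100)  # overshoots and diverges
```

With the small learning rate each step shrinks the distance to the minimum; with the large one each step overshoots by more than it corrects, so the parameter moves further away every iteration.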
What is the curse of dimensionality?
The more dimensions (features) you have, the more data you need to build models that generalize well. More dimensions (features) don’t necessarily mean better classification. There is an optimum number; beyond this you can get overfitting.
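One rough way to see why the data requirement grows is to divide each feature into bins and count the resulting grid cells. The sketch below gives an upper bound on how much of that grid a fixed number of samples can cover. The function name `max_cell_coverage` and the figures of 1000 samples and 10 bins are assumptions for illustration.

```python
def max_cell_coverage(n_samples, bins_per_dim, n_dims):
    """Upper bound on the fraction of grid cells that can contain at least one sample."""
    n_cells = bins_per_dim ** n_dims  # cell count grows exponentially with dimension
    return min(1.0, n_samples / n_cells)

# 1000 samples, 10 bins per feature, increasing number of features
coverage = {d: max_cell_coverage(1000, 10, d) for d in (1, 2, 3, 6)}
```

With three features the 1000 samples could in principle touch every cell, but with six features they can reach at most 0.1% of them, so most of the feature space has no data nearby.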
What is the bias/variance tradeoff?
It is hard for a model to be complex enough to capture the underlying pattern and simple enough to ignore noise at the same time (the underfitting/overfitting balance).
Simple model = high bias and low variance
Complex model = low bias and high variance