Lecture 4 Flashcards
What is the difference between classification and regression tasks?
Classification is when the output is one of a finite set of values. Regression is when the output is a number (an integer or a real number).
- Classification = sunny/cloudy/rainy or true/false
- Regression = tomorrow’s temperature
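The two task types above can be sketched as two toy functions (the rules and thresholds here are hypothetical, purely for illustration):

```python
def classify_weather(humidity: float, cloud_cover: float) -> str:
    """Classification: output is one of a finite set of labels."""
    if cloud_cover > 0.8 and humidity > 0.7:
        return "rainy"
    if cloud_cover > 0.4:
        return "cloudy"
    return "sunny"

def predict_temperature(today_temp: float) -> float:
    """Regression: output is a real number (illustrative linear rule)."""
    return 0.9 * today_temp + 2.0
```

The classifier's range is the finite set {"sunny", "cloudy", "rainy"}; the regressor's range is the real line.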
How do we choose a GOOD hypothesis space?
Choose a space that fits the data well — ideally one likely to contain the true function (or a close approximation of it) while remaining simple enough to learn and generalize from.
What is induction?
Going from a specific set of observations to a general rule. We assume that we can apply our model to future cases (e.g., image recognition). NOTE: Inductive conclusions can be incorrect.
What is a deductive conclusion?
Conclusions that are guaranteed to be correct if the premises are correct.
What are the 3 types of learning?
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
How do we choose a hypothesis space?
If you don’t have some prior knowledge about the process that generated the data, you perform exploratory data analysis to determine which hypothesis space is appropriate. (Or use trial and error.)
How do we choose a good hypothesis from within the hypothesis space?
Look for a best-fit hypothesis h for which each h(xi) is close to yi on the training data, and which also generalizes well — i.e., predicts unseen (test-set) examples accurately.
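A minimal sketch of this selection step, assuming a small hand-made candidate set and mean squared error as the "closeness" measure (both hypothetical):

```python
# Training pairs (x_i, y_i); the data are made up for illustration.
training_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

# A tiny hypothesis space of three candidate functions.
candidates = {
    "h1: y = x":     lambda x: x,
    "h2: y = 2x":    lambda x: 2.0 * x,
    "h3: y = x + 1": lambda x: x + 1.0,
}

def mse(h, data):
    """Mean squared error of hypothesis h over the data."""
    return sum((h(x) - y) ** 2 for x, y in data) / len(data)

# Pick the hypothesis with the lowest training error.
best_name = min(candidates, key=lambda name: mse(candidates[name], training_data))
```

In practice the search is over a continuous space (e.g., all line slopes), but the principle — minimize a loss over the training pairs, then verify generalization on held-out data — is the same.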
How can we perform exploratory data analysis?
Examine the data with statistical tests and visualizations (histograms, scatter plots, box plots, etc.).
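A small sketch of the statistical side of this, using only the standard library (a real workflow would add visual tools like histograms; the data here are hypothetical):

```python
import statistics

# Hypothetical measurements, with one suspicious value (9.8).
data = [2.1, 2.4, 2.2, 9.8, 2.3, 2.5]

mean = statistics.mean(data)
median = statistics.median(data)
stdev = statistics.stdev(data)

# A mean well above the median hints at right skew or an outlier,
# which may influence which hypothesis space is appropriate.
```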
What is the true measure of a hypothesis?
How well it handles input it has not yet seen (e.g., test set), not how it does on the training set.
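This point can be sketched with a contrived "memorizer" hypothesis (both hypotheses and data are made up): it looks perfect on the training set yet fails on unseen inputs.

```python
# Training and held-out data drawn from the same process (y = 2x).
train = {1.0: 2.0, 2.0: 4.0, 3.0: 6.0}
test  = {4.0: 8.0, 5.0: 10.0}

def memorizer(x):
    """Returns the stored training answer; a default guess otherwise."""
    return train.get(x, 0.0)

def linear_h(x):
    """A generalizing hypothesis: y = 2x."""
    return 2.0 * x

def mean_abs_error(h, data):
    return sum(abs(h(x) - y) for x, y in data.items()) / len(data)
```

The memorizer's training error is zero, but its test error is large; the linear hypothesis scores well on both — and the test error is the true measure.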
Define bias.
The tendency of a predictive hypothesis to deviate from the expected value when averaged over different training sets.
// or //
A model’s tendency to resist change. High bias == highly resistant to change (e.g. linear model).
Define variance.
The amount of change in the hypothesis due to fluctuations in the training data — i.e., how much the learned model varies across different training sets.
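A minimal sketch of measuring this, assuming a one-parameter hypothesis y = w·x fit by closed-form least squares on two different samples from the same process (the samples are hypothetical):

```python
def fit_slope(data):
    """Closed-form least squares for y = w*x (fit through the origin)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

# Two training samples drawn from the same underlying process.
sample_a = [(1.0, 2.2), (2.0, 3.8)]
sample_b = [(1.0, 1.9), (2.0, 4.3)]

w_a = fit_slope(sample_a)  # slope learned from sample A
w_b = fit_slope(sample_b)  # slope learned from sample B

# The spread |w_a - w_b| reflects the hypothesis's variance:
# the more the learned parameters move when the data fluctuate,
# the higher the variance.
```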
When is a hypothesis underfitting?
When it fails to find a pattern in the data.
When is a hypothesis overfitting?
When it performs poorly on unseen data because it pays too much attention to the particular data set it was trained on.
Bias-variance tradeoff
A choice between:
1. more complex, low-bias hypotheses that fit the training data well
2. simpler, low-variance hypotheses that may generalize better
What is Ockham’s Razor Principle?
Prefer the simplest hypothesis consistent with the data: simpler hypotheses have lower variance and often generalize better (cf. the bias-variance tradeoff).