General Flashcards
What does the superscript within parentheses mean?
e.g. X^(i)
The ith row of X
What does the symbol ∀ mean?
“for all” or “for any”
In a joint probability distribution table, how many rows are there?
One for each possible combination of variable values
Given a joint probability distribution, what can we calculate?
Conditional or joint probabilities over any subset of the variables
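The two cards above can be made concrete with a small sketch. The variable names (`rain`, `sprinkler`, `wet`) and the weights below are made up for illustration and are not from the cards. The sketch builds a table with one row per combination of values, then computes a marginal and a conditional probability by summing rows.

```python
from fractions import Fraction
from itertools import product

# Hypothetical binary variables (illustrative only).
domains = {"rain": (0, 1), "sprinkler": (0, 1), "wet": (0, 1)}
names = list(domains)

# One row per possible combination of values: 2 * 2 * 2 = 8 rows.
rows = list(product(*domains.values()))

# Made-up weights, normalized so the whole table sums to 1.
weights = [30, 2, 10, 18, 3, 22, 1, 14]
total = sum(weights)
joint = {row: Fraction(w, total) for row, w in zip(rows, weights)}


def prob(**fixed):
    """Probability of the given variable values, summing out all other variables."""
    return sum(
        p for row, p in joint.items()
        if all(row[names.index(k)] == v for k, v in fixed.items())
    )


def cond_prob(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(**event, **given) / prob(**given)


p_rain = prob(rain=1)                                 # marginal over one variable
p_wet_given_rain = cond_prob({"wet": 1}, {"rain": 1})  # conditional
```

Using `Fraction` keeps the arithmetic exact, so the table sums to exactly 1; with floats you would compare against 1 with a tolerance instead.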
What’s the conditional probability equation?
P(A | B) = P(A, B) / P(B), defined when P(B) > 0
What is bias?
The inability of an ML method to capture the true relationship
What is variance?
How much the model's fit changes when it is trained on different data sets
What does high bias correspond to? (2)
High bias -> underfitting -> more training set error
What does high variance correspond to?
High variance -> overfitting -> more dev set error and more test set error
What is the development set (aka dev set)?
It’s another term for “validation set”
What’s a useful way to think of bias?
How well does my model fit the training data?
What’s a useful way to think of variance?
How well does my model generalize to unseen data?
Can you have high bias and high variance?
Yes
Given training error and validation error, how can you assess the bias and variance?
- Training error indicates the bias: high training error means high bias.
- The gap between validation error and training error indicates the variance: a large gap means high variance.
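As a rough sketch of that rule of thumb, the helper below turns two error rates into a bias/variance read-out. The function name `diagnose`, the `baseline_error` argument, and the `tolerance` threshold are hypothetical choices made for illustration; the cards do not specify what counts as "high".

```python
def diagnose(train_error, val_error, baseline_error=0.0, tolerance=0.05):
    """Rough bias/variance read-out from training and validation error rates.

    baseline_error: error considered achievable on this task (assumed, default 0).
    tolerance: gap treated as "large" (assumed, default 5 percentage points).
    """
    # High bias: the model does not even fit the training data well.
    high_bias = (train_error - baseline_error) > tolerance
    # High variance: validation error is much worse than training error.
    high_variance = (val_error - train_error) > tolerance
    return {"high_bias": high_bias, "high_variance": high_variance}


overfit = diagnose(train_error=0.01, val_error=0.11)    # variance problem
underfit = diagnose(train_error=0.15, val_error=0.16)   # bias problem
both = diagnose(train_error=0.15, val_error=0.30)       # bias and variance
neither = diagnose(train_error=0.005, val_error=0.01)   # looks fine
```

The third case shows why the answer to "Can you have high bias and high variance?" is yes: training error is already high, and validation error is much higher still.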
Why can’t we use the validation set for testing performance?
- Because we used the validation set to tune the model's hyperparameters. If we also tested on the validation set, we couldn't tell whether the gains from tuning carry over to unseen data or only improved our performance on the validation set.
What is the validation set?
Set you use to pick the best parameters or model to use
What is the test set?
Only used to get a metric of how well your model is performing on unseen data
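One way to keep the roles of the three sets separate is to split the data once, up front. The sketch below is a minimal standard-library version; the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions made for illustration, not recommendations from the cards.

```python
import random


def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then carve out disjoint test, validation, and training sets."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)
    test = items[:n_test]                    # touched only for the final metric
    val = items[n_test:n_test + n_val]       # used to pick parameters or models
    train = items[n_test + n_val:]           # used to fit the model
    return train, val, test


train, val, test = split_dataset(range(100))
```

Because the three lists are disjoint, tuning against `val` never leaks information from `test`, which is what keeps the final test metric an honest estimate on unseen data.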