SUPERVISED Flashcards
what is generalization?
A model’s ability to make correct predictions on new, previously unseen data.
what does generalization refer to?
the model’s capability to adapt and react properly to previously unseen, new data that has the same characteristics as the training set
Generalization examines how well a model can digest new data and make correct predictions after being trained on a training set.
what is bias?
If your predictions are consistently off by a certain amount, that’s bias. For example, if you always predict that plants will grow taller than they actually do, you’ve got a bias issue.
what is variance?
Variance is all about inconsistency. If you make predictions that swing wildly from one extreme to another, that’s high variance. For instance, if one day you predict a plant will be super tall and the next day you say it’ll be super short, you’ve got a variance problem.
what is a high bias model?
If your predictions are consistently off by a certain amount, that’s bias. A high-bias model doesn’t adapt well to new information; it’s stuck in its ways.
what is a high variance model?
If you make predictions that swing wildly from one extreme to another, that’s high variance. A high-variance model is too sensitive to small changes in the data; it overreacts.
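A minimal Python sketch of the bias and variance cards above. The plant heights and the helper name `bias_and_variance` are made up for illustration. Bias is measured here as how far the average prediction sits from the true value, and variance as how spread out the predictions are.

```python
from statistics import mean, pvariance

def bias_and_variance(predictions, true_value):
    """Return (bias, variance) for repeated predictions of one true value."""
    bias = mean(predictions) - true_value   # consistent offset from the truth
    variance = pvariance(predictions)       # spread of the predictions
    return bias, variance

# Hypothetical predictions (cm) for a plant that is actually 50 cm tall.
consistently_high = [60, 61, 59, 60]  # always too tall: high bias, low variance
swinging = [20, 80, 35, 65]           # right on average, but all over the place

high_bias_result = bias_and_variance(consistently_high, 50)
high_variance_result = bias_and_variance(swinging, 50)
```

With these numbers the first set is off by 10 cm on average with almost no spread, while the second set is right on average but has a very large spread.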
what is the perfect model for generalization?
A model with both low bias and low variance.
what is overfitting?
If your model fits the data you give it too closely, noise included, it might do great on the examples it’s seen but not so great on new ones it hasn’t seen.
what happens when we train an overfit model?
An overfit model gets a low loss during training but does a poor job predicting new data.
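One way to see this in code is a model that simply memorizes its training examples. This is a sketch under assumptions: the data is made up (roughly y = 2x with some noise on the training points), and the memorizing model is a 1-nearest-neighbor lookup chosen only because it overfits in an obvious way.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Predict by copying the label of the closest training example."""
    idx = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[idx]

def mse(xs, ys, predict):
    """Mean squared error of `predict` over the pairs (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: the underlying pattern is y = 2x, training labels are noisy.
train_x, train_y = [1, 2, 3, 4], [2.5, 3.5, 6.5, 7.5]
test_x, test_y = [1.4, 2.4, 3.4], [2.8, 4.8, 6.8]

model = lambda x: nearest_neighbor_predict(train_x, train_y, x)
train_loss = mse(train_x, train_y, model)  # zero: every training point is memorized
test_loss = mse(test_x, test_y, model)     # larger: the memorized noise hurts
```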
what is underfitting?
It happens when your model is too simple to capture the underlying patterns in the data. It’s like trying to fit a straight line to data that’s actually a curve.
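The straight-line example can be sketched in a few lines of Python. The data points and the helper name `fit_line` are assumptions for illustration. The data follows the curve y = x², and the best straight line through it is found with ordinary least squares.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

# Hypothetical data that actually follows a curve: y = x^2.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
line_loss = mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
```

Here the best line is flat (slope 0), so it misses the curve entirely and the loss stays high even on the training data. That is the signature of underfitting.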
how to improve an overfit model?
by making sure you have enough diverse examples to learn from and using techniques that help your model focus on the big picture.
how to improve an underfit model?
you need to make sure your model is complex enough to capture the important patterns in the data. You might need more features or a more sophisticated model.
what is a confusion matrix?
A confusion matrix is a table that organizes a classifier’s predictions into four groups: true positives, true negatives, false positives, and false negatives. It helps you see how well your program is doing overall and where it might need improvement. The goal is to have as many true positives and true negatives as possible, and as few false positives and false negatives as possible.
what is true positive?
the model correctly predicts the positive class.
what is true negative?
the model correctly predicts the negative class.
what is false positive?
the model incorrectly predicts the positive class (it says positive, but the actual class is negative).
what is false negative?
the model incorrectly predicts the negative class (it says negative, but the actual class is positive).
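The four outcomes can be counted directly from a list of actual labels and a list of predictions. This is a small sketch with made-up labels. The function name `confusion_counts` is hypothetical, and 1 is assumed to mean the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels where 1 = positive, 0 = negative."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            counts["TN" if actual == 0 else "FN"] += 1
    return counts

# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
```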
what is the F-1 score?
2 x (precision x recall) / (precision + recall), the harmonic mean of precision and recall
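A short sketch of the formula in Python. It assumes the usual definitions precision = TP / (TP + FP) and recall = TP / (TP + FN), which this deck does not spell out. The counts are made up.

```python
def f1_score(tp, fp, fn):
    """F-1 score from confusion-matrix counts (assumes tp > 0 to avoid dividing by zero)."""
    precision = tp / (tp + fp)  # of everything predicted positive, how much was right
    recall = tp / (tp + fn)     # of everything actually positive, how much was found
    return 2 * (precision * recall) / (precision + recall)

# Hypothetical counts: precision = 0.8, recall = 0.5.
f1 = f1_score(tp=8, fp=2, fn=8)
```

The result sits closer to the lower of the two values (about 0.62 here), which is why F-1 punishes a model that is good at only one of precision or recall.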
what is ROC?
Receiver Operating Characteristic (curve)
define ROC
The ROC curve is a graph that shows how well a classification model performs. It helps us see how the model’s decisions change at different levels of certainty (decision thresholds).
how does ROC plot the graph?
The ROC curve is built from two rates, measured at each decision threshold. Picture a program that tells you whether to bring an umbrella:
True Positive Rate (Sensitivity): This is like how good your program is at correctly telling you to bring an umbrella when it’s actually going to rain. It’s the proportion of rainy days that your program correctly predicts.
False Positive Rate: This is like how often your program tells you to bring an umbrella when it’s not going to rain. It’s the proportion of non-rainy days that your program incorrectly predicts as rainy.
The ROC curve plots the True Positive Rate (Sensitivity) on the y-axis against the False Positive Rate on the x-axis, with one point for each decision threshold.
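The two rates can be computed for a handful of thresholds to get points on the curve. This is a sketch with made-up rain scores. The function name `roc_point` and the chosen thresholds are assumptions, and the program is assumed to say "bring an umbrella" whenever its score is at or above the threshold.

```python
def roc_point(labels, scores, threshold):
    """Return (false_positive_rate, true_positive_rate) at one threshold."""
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives

# Hypothetical days: 1 = it rained, 0 = it stayed dry, with the program's rain score.
rained = [1, 1, 0, 1, 0, 0]
rain_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# One (FPR, TPR) point per threshold, from strictest to most lenient.
roc_points = [roc_point(rained, rain_score, t) for t in (1.0, 0.75, 0.5, 0.0)]
```

A very strict threshold never says "umbrella" (both rates are 0), and a threshold of 0 always says it (both rates are 1). The interesting part of the curve is what happens in between.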
what is AUC?
Area Under the Curve (here, the area under the ROC curve)
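Given a list of (False Positive Rate, True Positive Rate) points, the area can be approximated with the trapezoid rule. This is a minimal sketch. The function name `auc_trapezoid` and the example points are made up for illustration.

```python
def auc_trapezoid(points):
    """Approximate the area under a curve given as (x, y) points."""
    pts = sorted(points)  # walk the curve from left to right
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid slice
    return area

# Hypothetical ROC points as (FPR, TPR) pairs.
example_auc = auc_trapezoid([(0, 0), (0, 0.5), (0.5, 1), (1, 1)])

# The diagonal from (0, 0) to (1, 1) is what random guessing looks like.
random_auc = auc_trapezoid([(0, 0), (1, 1)])
```

An area of 1.0 means the model separates the classes perfectly, while 0.5 (the diagonal) is no better than guessing.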