Week 2 Flashcards
What are these: Classification, regression
Supervised learning
What are these: Clustering, outlier detection
Unsupervised learning
Classification predicts ________ values whereas regression predicts _________ values
Classification predicts discrete (categorical) values whereas regression predicts continuous values. (Regression can also take categorical values as inputs.)
Name this: The data has no class labels
Unsupervised learning
Name this: Data has class labels
Supervised learning
T/F - You cannot test your algorithm on the same data you used to train it; you need an independent test set
True. (Split the data into a training set and a test set.)
What is model overfitting?
When you chase 100% accuracy on the training data, the model ends up completely tailored to that data and no longer works well on the evaluation data.
What is model underfitting?
When the model performs poorly on the training data
Name 2 types of error estimation
- Random sampling with repeated holdout
- K-fold cross-validation
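As a rough illustration of random sampling with repeated holdout, the sketch below shuffles the data several times, holds out a fraction as the test set each time, and averages the test accuracies. The nearest-centroid classifier, the toy data, and the names `repeated_holdout`, `test_fraction`, and `repeats` are assumptions made for this example, not part of the course material.

```python
import random

def train_centroids(points):
    """Return the mean feature value for each class label."""
    sums, counts = {}, {}
    for x, label in points:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def accuracy(centroids, points):
    """Fraction of points whose predicted label matches the true label."""
    correct = sum(1 for x, label in points if predict(centroids, x) == label)
    return correct / len(points)

def repeated_holdout(data, test_fraction=0.3, repeats=10, seed=0):
    """Average test accuracy over several random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, int(len(data) * test_fraction))
    scores = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        # The held-out points are never seen during training.
        test, train = shuffled[:n_test], shuffled[n_test:]
        model = train_centroids(train)
        scores.append(accuracy(model, test))
    return sum(scores) / len(scores)

# Toy labelled data (an assumption): class 0 lies in 0..9, class 1 in 20..29.
data = [(float(x), 0) for x in range(0, 10)] + [(float(x), 1) for x in range(20, 30)]
holdout_accuracy = repeated_holdout(data)
print(holdout_accuracy)
```

Because the two toy classes are well separated, every split scores perfectly here; on real data the individual split accuracies would vary, which is why they are averaged.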
T/F - In K-fold cross validation we split the data into (usually) 10 equal groups, train on the (usually) first 9 groups and test on the last group.
True. The process is then repeated so that each group serves as the test group once.
How is the overall accuracy found in K-fold cross validation?
Overall accuracy = the average of the accuracy rates from the K rounds
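A minimal sketch of K-fold cross-validation, assuming a 1-nearest-neighbour classifier and toy one-dimensional data (both chosen only for illustration; `k_fold_accuracy` and `nearest_neighbor_predict` are hypothetical names). Each group is used as the test set once, and the overall accuracy is the average over the K rounds.

```python
import random

def nearest_neighbor_predict(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def k_fold_accuracy(data, k=10, seed=0):
    """Return (overall accuracy, per-fold accuracies) for K-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    # Split into k groups of (nearly) equal size.
    folds = [shuffled[i::k] for i in range(k)]
    fold_scores = []
    for i in range(k):
        test = folds[i]
        # Train on every group except the one held out for testing.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        correct = sum(
            1 for x, label in test if nearest_neighbor_predict(train, x) == label
        )
        fold_scores.append(correct / len(test))
    overall = sum(fold_scores) / k  # average accuracy rate across the folds
    return overall, fold_scores

# Toy labelled data (an assumption): class 0 lies in 0..9, class 1 in 20..29.
data = [(float(x), 0) for x in range(0, 10)] + [(float(x), 1) for x in range(20, 30)]
overall, fold_scores = k_fold_accuracy(data, k=10)
print(overall, fold_scores)
```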
Place these in the correct boxes: TP, FN, FP, TN
(rows A+, A-; columns P+, P-)
      P+   P-
A+    TP   FN
A-    FP   TN
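To make the four boxes concrete, here is a small sketch that counts TP, FN, FP and TN from two lists of labels and derives the accuracy from them. The function name `confusion_counts`, the `positive` parameter, and the example labels are made up for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FN, FP and TN for a binary classification problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actual positive, predicted positive
        elif a == positive:
            fn += 1  # actual positive, predicted negative
        elif p == positive:
            fp += 1  # actual negative, predicted positive
        else:
            tn += 1  # actual negative, predicted negative
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}

# Hypothetical labels: 1 = positive class, 0 = negative class.
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
# Accuracy is the share of correct predictions: (TP + TN) / total.
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```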