Supervised Learning Flashcards
What is supervised learning?
A subcategory of machine learning defined by its use of labeled input/output pairs.
What is the difference between regression and classification?
Regression predicts continuous values such as price or income; the goal is to find a best-fit line. Classification predicts a discrete class label; the goal is to find a decision boundary that separates the classes.
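A minimal sketch of the two goals, using made-up one-dimensional data. The helper names (`fit_line`, `fit_threshold`, `classify`) and the numbers are illustrative assumptions. The threshold rule is a deliberately simple stand-in for a real classifier and assumes the +1 class has the larger feature values.

```python
# Toy data, invented for illustration only.
sizes = [50.0, 70.0, 90.0, 110.0]       # feature: floor area
prices = [150.0, 210.0, 270.0, 330.0]   # continuous label: price

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Regression: the output is a real number read off the best-fit line.
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100.0 + intercept

scores = [1.0, 2.0, 8.0, 9.0]   # feature: some numeric score
classes = [-1, -1, 1, 1]        # discrete label

def fit_threshold(xs, labels):
    """Place a decision boundary midway between the two class means."""
    pos = [x for x, c in zip(xs, labels) if c == 1]
    neg = [x for x, c in zip(xs, labels) if c == -1]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def classify(x, boundary):
    """Assumes the +1 class lies above the boundary."""
    return 1 if x > boundary else -1

# Classification: the output is a class label, decided by the boundary.
boundary = fit_threshold(scores, classes)
predicted_class = classify(7.0, boundary)
```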
What kind of problems can you solve with classification and regression?
Regression: weather prediction, housing price prediction
Classification: spam detection, speech recognition, cancer cell identification.
Why is training-set error an unreliable measure of performance?
It does not show how well the model generalizes to unseen data. Perfect training-set performance is often a sign of overfitting.
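One way to see this is with a model that simply memorizes its training set. The sketch below uses a 1-nearest-neighbor rule on invented data where the assumed true rule is "label 1 if x > 5" and two training labels are noisy. The data and the function names are illustrative assumptions.

```python
def predict_1nn(train, x):
    """Return the label of the closest training input (a pure memorizer)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def error_rate(train, data):
    """Fraction of examples in `data` that the memorizer gets wrong."""
    wrong = sum(1 for x, y in data if predict_1nn(train, x) != y)
    return wrong / len(data)

# Training pairs (input, label); the labels at x=3 and x=7 are noise.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
# Held-out pairs labeled by the true rule.
held_out = [(2.9, 0), (3.2, 0), (6.8, 1), (7.3, 1), (1.5, 0), (8.5, 1)]

train_error = error_rate(train, train)        # every point is its own neighbor
held_out_error = error_rate(train, held_out)  # the memorized noise now hurts
```

The memorizer scores perfectly on the data it has already seen, yet it misclassifies the held-out points that fall next to the noisy labels. Error measured on data held back from training is the more trustworthy estimate.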
What is machine learning?
A field of artificial intelligence concerned with algorithms that can learn from data.
Two main branches of Machine Learning?
Supervised learning
Unsupervised learning
3 requirements for machine learning?
1) A pattern exists
2) It cannot be pinned down mathematically
3) We have data on it
Define data (for M.L.)
Input / correct-output pairs (feature, label)
input - real-valued or categorical
output - real-valued (regression) or categorical (classification)
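As a rough sketch, such pairs can be written as plain Python tuples. The type aliases, the example values, and the `task_type` helper are hypothetical and exist only to illustrate the definition.

```python
from typing import List, Tuple, Union

Feature = Union[float, str]   # real-valued or categorical input
Label = Union[float, str]     # real-valued or categorical output
Example = Tuple[List[Feature], Label]

# Real-valued labels: a regression dataset (values invented).
regression_data: List[Example] = [
    ([72.0, "urban"], 250000.0),
    ([55.0, "rural"], 140000.0),
]

# Categorical labels: a classification dataset (values invented).
classification_data: List[Example] = [
    ([0.12, "weekday"], "junk"),
    ([0.01, "weekend"], "not junk"),
]

def task_type(dataset: List[Example]) -> str:
    """Name the task implied by the label type of a dataset."""
    labels = [label for _, label in dataset]
    if all(isinstance(label, float) for label in labels):
        return "regression"
    return "classification"
```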
Goal of supervised learning?
To model dependency between features and labels.
Goal of a supervised learning model?
To predict labels for new instances.
What is a training set?
A set of input/output pairs (features with their correct labels) used to train the model.
Classification output value types?
Categorical, or binary (e.g. -1 or +1)
Regression output value type?
Real numbers.
Examples of supervised learning problems?
Junk mail:
features - word frequencies
class - junk/not junk
Access Control System:
features - images
class - ID of the person
Medical diagnosis:
features - BMI, age, symptoms, test results
class - diagnostic code
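For the junk mail example above, word-frequency features could be computed roughly as follows. The vocabulary, the sample message, and the function name are assumptions made for illustration; a real filter would use a far larger vocabulary and more careful tokenization.

```python
from collections import Counter

# Hypothetical vocabulary; each word becomes one feature.
VOCABULARY = ["free", "winner", "meeting", "report"]

def word_frequencies(message, vocabulary=VOCABULARY):
    """Return each vocabulary word's share of the words in the message."""
    words = message.lower().split()
    counts = Counter(words)
    total = len(words) or 1   # avoid dividing by zero on an empty message
    return [counts[word] / total for word in vocabulary]

# One labeled training example: (features, class).
features = word_frequencies("Free free WINNER claim now")
example = (features, "junk")
```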