Supervised Learning Flashcards

1
Q

What is supervised learning?

A

A subcategory of M.L. defined by the use of labeled input/output pairs.

2
Q

What is the difference between regression and classification?

A

Regression is used to predict continuous values such as price or income; the goal is to find a best-fit line. Classification is used to predict a discrete class label; the goal is to find a decision boundary.
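
A minimal sketch of the contrast in Python, assuming numpy and scikit-learn are available (the data is made up for illustration):

    # Same inputs, different label types (assumes numpy and scikit-learn).
    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one feature per instance

    y_reg = np.array([1.1, 1.9, 3.2, 3.9])      # continuous labels -> regression
    print(LinearRegression().fit(X, y_reg).predict([[2.5]]))    # best-fit line

    y_clf = np.array([0, 0, 1, 1])              # discrete labels -> classification
    print(LogisticRegression().fit(X, y_clf).predict([[2.5]]))  # decision boundary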

3
Q

What kind of problems can you solve with classification and regression?

A

Regression: weather prediction, housing price prediction
Classification: spam detection, speech recognition, cancer cell identification.

4
Q

Why is training set error performance unreliable?

A

Training error doesn’t reflect performance on unseen data; perfect performance on the training set usually signals overfitting rather than genuine learning.

5
Q

What is machine learning?

A

A field of artificial intelligence concerned with algorithms that can learn from data.

6
Q

Two main branches of Machine Learning?

A

Supervised learning
Unsupervised learning

7
Q

3 requirements for machine learning?

A

1) A pattern exists
2) It cannot be pinned down mathematically
3) We have data on it

8
Q

Define data (for M.L)

A

Pairs of inputs and correct outputs (feature, label)
input - real-valued or categorical
output - real-valued (regression) or categorical (classification)

9
Q

Goal of supervised learning?

A

To model dependency between features and labels.

10
Q

Goal of a supervised learning model?

A

To predict labels for new instances.

11
Q

What is a training set?

A

A set of input-output pairs used to train the model.

12
Q

Classification output value types?

A

Categorical or binary (-1,1)

13
Q

Regression output value type?

A

Real numbers.

14
Q

Examples of supervised learning problems?

A

Junk mail:
features - word frequencies
class - junk/not junk

Access Control System:
features - images
class - ID of the person

Medical diagnosis:
features - BMI, age, symptoms, test results
class - diagnostic code

15
Q

Formal components of learning.

A

Input (x) - e.g. a customer application
Output (y) - e.g. approval/denial of the application
Target function f: X -> Y (the ideal credit approval formula)
Data: {(x1, y1), … (xN, yN)} (historical records)
Hypothesis g: X -> Y
Hypothesis set (H): the group of functions in which we look for our solution
Supervised learning uses the training data to pick a hypothesis g from H that approximates f and can be applied to new data.
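
A minimal sketch of these components, with H arbitrarily chosen as the set of lines g(x) = w*x + b (illustrative data and names, assuming numpy):

    import numpy as np

    # Data {(x1, y1), ..., (xN, yN)}: samples of the unknown target function f.
    X = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.9, 6.2, 8.1])

    # Learning searches the hypothesis set H (here: all lines) for a g
    # that fits the data and hopefully approximates f.
    w, b = np.polyfit(X, y, deg=1)   # least-squares fit; returns slope, intercept

    def g(x):                        # the final hypothesis g: X -> Y
        return w * x + b

    print(g(5.0))                    # prediction for a new, unseen input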

16
Q

Building blocks for an M.L. algorithm?

A

Model class (hypothesis set), e.g.
- linear or quadratic function
- decision tree
- neural network, clustering
Error measure (score function)
Algorithm - searches the model class for a good model, where "good" is defined by the score function
Validation

17
Q

Dangers of overfitting

A

The model memorizes the training data and does not generalize beyond it: 100% accuracy on the training data, yet possibly no better than random guessing on new instances.

18
Q

Dangers of underfitting

A

The model is not expressive enough, e.g. linear functions applied to non-linear problems.

19
Q

Approximation-Generalization tradeoff

A

Goal: to approximate the target function as closely as possible.
More complex hypothesis set: better chance of approximating the target function f.
Less complex hypothesis set: better chance of generalizing outside of the training set.

20
Q

Ideal hypothesis set H

A

H = {f}, we already know the target function, no need for M.L.

21
Q

Occam’s Razor

A

The principle that favors the simplest hypothesis (set) that can explain a given set of observations well.

22
Q

Criteria for a good model

A

Interpretability
Computational complexity

23
Q

How to control Hypothesis set complexity?

A

With hyperparameters, e.g.
- max degree of polynomials
- number of nearest neighbors (k)
- regularization parameter
- depth of a decision tree
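
In scikit-learn, for instance, each of these complexity knobs is a constructor argument (a sketch, assuming scikit-learn; the values are arbitrary):

    from sklearn.preprocessing import PolynomialFeatures  # max polynomial degree
    from sklearn.neighbors import KNeighborsClassifier    # number of neighbors
    from sklearn.linear_model import Ridge                # regularization parameter
    from sklearn.tree import DecisionTreeClassifier       # tree depth

    PolynomialFeatures(degree=3)         # higher degree -> more complex H
    KNeighborsClassifier(n_neighbors=5)  # smaller k     -> more complex H
    Ridge(alpha=1.0)                     # smaller alpha -> weaker regularization
    DecisionTreeClassifier(max_depth=4)  # deeper tree   -> more complex H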

24
Q

What kind of methods should you start with?

A

Simple methods: linear regression, kNN, naive Bayes
- easier to understand
- less tuning, less risk of overfitting
- often just as good as more advanced methods

25
Q

K Nearest Neighbors

A

-Classic method (1951)
-Classification based on k most similar training instances
-parameter k tunes model complexity
-can learn complex non-linear functions

26
Q

kNN Classification

A

Choose the majority class among k nearest neighbors for prediction.
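
A minimal sketch of kNN classification in plain numpy (Euclidean distance; names and data are illustrative):

    import numpy as np
    from collections import Counter

    def knn_classify(X_train, y_train, x_new, k=3):
        # Distance from the query point to every training instance.
        dists = np.linalg.norm(X_train - x_new, axis=1)
        # Labels of the k nearest neighbors.
        nearest = y_train[np.argsort(dists)[:k]]
        # Majority vote among those k labels.
        return Counter(nearest).most_common(1)[0][0]

    X_train = np.array([[0, 0], [0, 1], [5, 5], [6, 5]])
    y_train = np.array([0, 0, 1, 1])
    print(knn_classify(X_train, y_train, np.array([5, 6])))  # -> 1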

27
Q

kNN Regression

A

Take the mean value of the k nearest neighbors for prediction.
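
The same neighbor search, but averaging instead of voting (a sketch in numpy):

    import numpy as np

    def knn_regress(X_train, y_train, x_new, k=3):
        dists = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distances
        return y_train[np.argsort(dists)[:k]].mean()     # mean of k nearest labels

    X_train = np.array([[1.0], [2.0], [3.0], [10.0]])
    y_train = np.array([1.5, 2.5, 3.5, 10.0])
    print(knn_regress(X_train, y_train, np.array([2.0])))  # -> 2.5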

28
Q

Disadvantages of kNN predictor

A

All k nearest neighbors have the same influence on prediction. Maybe closer neighbors should have more influence?
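
One common remedy, sketched below, is inverse-distance weighting, so that closer neighbors contribute more to the prediction:

    import numpy as np

    def weighted_knn_regress(X_train, y_train, x_new, k=3, eps=1e-9):
        dists = np.linalg.norm(X_train - x_new, axis=1)
        idx = np.argsort(dists)[:k]                 # k nearest neighbors
        w = 1.0 / (dists[idx] + eps)                # weight ~ 1/distance
        return np.average(y_train[idx], weights=w)  # weighted mean prediction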

29
Q

Distance measure in kNN

A

Standard: Euclidean distance
Others: Manhattan, Mahalanobis, Chebyshev, Hamming
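
Most of these can be written in one line of numpy for two feature vectors a and b (a sketch; Mahalanobis additionally needs the data's inverse covariance matrix and is omitted):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([2.0, 0.0, 3.0])

    euclidean = np.linalg.norm(a - b)  # sqrt of summed squared differences
    manhattan = np.abs(a - b).sum()    # sum of absolute differences
    chebyshev = np.abs(a - b).max()    # largest single-coordinate difference
    hamming   = (a != b).mean()        # fraction of differing coordinates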

30
Q

Small vs large k in kNN

A

small: local, complex model that depends on a handful of instances
large: global, simpler model averaged over a large set of instances

31
Q

kNN, k=1?

A

Overfitting! 0% training error, but won’t generalize
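
A quick demonstration, assuming scikit-learn: with k=1 every training point is its own nearest neighbor, so even pure noise is "learned" perfectly:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X = np.random.rand(50, 2)             # random features
    y = np.random.randint(0, 2, size=50)  # random labels: nothing to learn

    model = KNeighborsClassifier(n_neighbors=1).fit(X, y)
    print(model.score(X, y))              # 1.0 on the training set,
                                          # yet useless on new data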

32
Q

Advantages of kNN

A

-simple
-non-linear modeling
-simple model complexity tuning (k)
-customizable (distance measure, feature/neighbor weighting)
-good results in many applications

33
Q

Disadvantages of KNN

A

-Large computational/memory complexity: O(nm) per prediction, where n is the number of training instances and m the dimensionality of the data
-sensitive to scaling
-Irrelevant features problematic
-black box
-not state of the art

34
Q

Criteria to be balanced in learning.

A

Fit to data (low error) vs model complexity

35
Q

4 main ingredients of a kNN algorithm.

A

Distance metric
Number of neighbors (k)
Weighting function for neighbors
Prediction function
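
The four ingredients made explicit as parameters of one function (a sketch; the defaults mirror plain, unweighted kNN regression, and the predict argument could be swapped for a voting function to get classification):

    import numpy as np

    def knn_predict(X_train, y_train, x_new, k=3,                        # 2) k
                    distance=lambda A, x: np.linalg.norm(A - x, axis=1), # 1) metric
                    weight=lambda d: np.ones_like(d),                    # 3) weighting
                    predict=lambda ys, w: np.average(ys, weights=w)):    # 4) prediction
        d = distance(X_train, x_new)
        idx = np.argsort(d)[:k]
        return predict(y_train[idx], weight(d[idx]))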

36
Q

Method to automatically determine the appropriate k value for kNN.

A

Cross-validation.
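
A minimal sketch with scikit-learn's GridSearchCV (assumed available; the iris dataset is just an example):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    search = GridSearchCV(KNeighborsClassifier(),
                          param_grid={"n_neighbors": list(range(1, 21))},
                          cv=5)           # 5-fold cross-validation
    search.fit(X, y)                      # tries every k on held-out folds
    print(search.best_params_)            # k with the best validation score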