Lecture 4 - Supervised Machine Learning: KNN and Regression Flashcards
What is K-Nearest Neighbors? What are some of its characteristics?
K-Nearest Neighbors (KNN) is an algorithm that can be used for both classification and regression.
It is non-parametric: it does not learn a model (it makes no assumptions about the distribution of the data)
It has heavy memory consumption and computational cost, since it stores the entire training set and computes distances to it at prediction time
Explain how the KNN algorithm works
Given a training dataset X and a new instance x:
Find the k points in X that are closest to x, using the selected distance measure
Then predict the label for x:
- Classification: majority vote among the k nearest neighbors
- Regression: mean of the k nearest neighbors
{Visual examples in Notion}
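A minimal sketch of the procedure in Python, assuming numpy arrays and Euclidean distance as the distance measure (`knn_predict` is a hypothetical helper name, not from the lecture):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3, task="classification"):
    # Distance from x_new to every training point (Euclidean, as an assumption)
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    if task == "classification":
        # Majority vote among the k nearest labels
        return Counter(y_train[nearest].tolist()).most_common(1)[0][0]
    # Regression: mean of the k nearest labels
    return y_train[nearest].mean()
```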
How can you choose K?
Some approaches include:
- k = sqrt(n)
- Loop over different values of k and compare the errors, similarly to the elbow method (Sippo’s favorite); see the sketch below
- Use an odd k if the number of classes is even
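A sketch of the loop-over-k approach using scikit-learn; the Iris dataset and 5-fold cross-validation are illustrative stand-ins, not from the lecture:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in range(1, 21):
    # Cross-validated accuracy for each candidate k; pick the elbow/best value
    score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k={k:2d}: accuracy={score:.3f}")
```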
What are the advantages and disadvantages of KNN?
Advantages:
- Nonparametric
- Easy to interpret
Disadvantages:
- All features have equal importance (unlike in e.g. decision trees)
- With very large datasets, computing distances between data points is infeasible
Features need to be scaled (see the pipeline sketch after this list):
- A feature with a big scale can dominate all the distances
- A feature with a small scale can get neglected
The “curse of dimensionality”
- Problems with high-dimensional spaces (e.g. more than 10 features)
- Volume of space grows exponentially with dimensions
- Need exponentially more points to ‘fill’ a high-dimensional volume, or you might not have any training points “near” a test point
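One common way to handle the scaling issue, sketched with scikit-learn (StandardScaler is an assumed choice of scaler, not the only option):

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Standardize features first so no single feature dominates the distance metric
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
# Usage: model.fit(X_train, y_train); model.predict(X_test)
```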
How do you calculate the mean squared error?
The mean squared error (MSE) is calculated by:
MSE = (1/N) * ∑ (prediction_i - actual_i)^2
Where:
- N is the number of observations
- ∑ is the summation operator (over i = 1, …, N)
- prediction_i is the predicted value for observation i
- actual_i is the actual value for observation i
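The formula translated directly into Python (a sketch, assuming numpy arrays or anything convertible to them):

```python
import numpy as np

def mse(predictions, actuals):
    # Average of the squared differences between predictions and actual values
    predictions, actuals = np.asarray(predictions), np.asarray(actuals)
    return np.mean((predictions - actuals) ** 2)
```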
How do you calculate the Root Mean Squared Error (RMSE)?
RMSE = sqrt(MSE)
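Building on the `mse` sketch above:

```python
import numpy as np

def rmse(predictions, actuals):
    # Root mean squared error: square root of the MSE (uses mse defined above)
    return np.sqrt(mse(predictions, actuals))
```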
What are some considerations you must have when working with linear regression?
The relationship between the dependent and independent variable(s) is not always linear
It is possible to transform the data so that it has a linear relationship (e.g. a log transform); see the sketch after this list
Collinearity: Correlation between features
Recommended additional models to study: Ridge and Lasso Regression
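A small sketch of the log-transform idea: the data below is synthetic (an assumed exponential relationship), used only to show how fitting on log(y) linearizes it:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with an exponential (non-linear) relationship: y = e^(2x)
X = np.linspace(0.1, 5, 100).reshape(-1, 1)
y = np.exp(2 * X.ravel())

# Fitting on log(y) turns the relationship into a linear one
model = LinearRegression().fit(X, np.log(y))
print(model.coef_)  # ~[2.0], the slope of the linearized relationship
```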
What is Logistic Regression?
Logistic Regression is used when the dependent variable is binary (0/1 or “Yes”/“No”)
In other words, it solves binary classification problems, e.g. whether an email is spam or not
What does Logistic Regression compute?
The Logistic Regression model computes a weighted sum of the input features (plus a bias term), but instead of outputting the result directly (as Linear Regression does), it outputs the logistic of this result
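In code, this computation looks roughly like the following (a sketch; `predict_proba` is a hypothetical name here, and the weights and bias would come from training):

```python
import numpy as np

def predict_proba(x, weights, bias):
    # Weighted sum of the input features plus the bias term...
    z = np.dot(weights, x) + bias
    # ...passed through the logistic function instead of returned directly
    return 1.0 / (1.0 + np.exp(-z))
```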
What is the logistic function?
The logistic function is a sigmoid function, which takes any real input and outputs a value between zero and one
What is the formula for the logistic function?
The logistic function is expressed as the following function:
f(x) = 1/(1 + e^-x)
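A quick numerical check of the formula (output values are approximate):

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

print(logistic(0))    # 0.5: the midpoint of the sigmoid
print(logistic(10))   # ~0.99995: large positive inputs approach 1
print(logistic(-10))  # ~0.00005: large negative inputs approach 0
```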
What are some considerations one must have when doing logistic regression?
- Standardization or scaling of data is not needed
- Very efficient, light computationally
- Works better with datasets that have many observations
- There are many alternative ML algorithms for binary classification problems (e.g. SVMs, decision trees, random forests)
- There are variants of the logistic regression that support multiclass classification (e.g. Softmax regression / multinomial logistic regression)
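A minimal end-to-end sketch with scikit-learn; the breast-cancer dataset and the max_iter value are illustrative choices, not from the lecture:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Binary classification: malignant vs. benign tumors
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000)  # extra iterations so the solver converges
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # test-set accuracy
```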
TRUE OR FALSE: K-Nearest Neighbors can be used with both regression and classification problems
TRUE
TRUE OR FALSE: Logistic regression is a heavy and not that efficient algorithm for binary classification problems
FALSE
TRUE OR FALSE: Linear Regression works well when there is a linear relationship between the dependent and independent variable(s)
TRUE