Lecture 4 - Supervised Machine Learning: KNN and Regression Flashcards

1
Q

What is K-Nearest Neighbors? What are some of its characteristics?

A

K-Nearest Neighbors (KNN) is an algorithm that can be used for both classification and regression.

It is non-parametric: it doesn’t learn a model and makes no assumptions about the distribution of the data.

It has high memory consumption and computational cost, since it stores and compares against the entire training set.

2
Q

Explain how the KNN algorithm works

A

Given a training dataset X and a new instance x:

Find the k points in X that are closest to x, using the selected distance measure.

Predict a label for x:
- Classification: majority vote among the k nearest neighbors
- Regression: mean of the k nearest neighbors’ values

{Visual examples in Notion}
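The steps above can be sketched as a minimal NumPy implementation (function and variable names are illustrative, not from the lecture):

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3, classify=True):
    """Predict a label for x_new from its k nearest neighbors.

    Uses Euclidean distance; a minimal sketch, not an optimized
    implementation.
    """
    # Distance from x_new to every training point
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest points
    nearest = np.argsort(dists)[:k]
    if classify:
        # Classification: majority vote among the k neighbors
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]
    # Regression: mean of the k neighbors' target values
    return y_train[nearest].mean()
```

For example, with two well-separated clusters, a query point near the first cluster gets that cluster’s label by majority vote.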

3
Q

How can you choose K?

A

Some approaches include:
  • k = sqrt(n), where n is the number of training samples
  • Loop over different values of k and compare errors, similar to the elbow method (Sippo’s favorite)
  • Use an odd k if the number of classes is even (helps avoid ties)
4
Q

What are the advantages and disadvantages of KNN?

A

Advantages:

  • Nonparametric
  • Easy to interpret

Disadvantages:

  • All features have equal importance (unlike e.g. decision trees)
  • With very large datasets, computing distances between datapoints becomes infeasible

Features need to be scaled:

  • A feature with a big scale can dominate all the distances
  • A feature with a small scale can get neglected

The “curse of dimensionality”

  • Problems with high-dimensional spaces (e.g. more than 10 features)
  • Volume of space grows exponentially with dimensions
  • Need exponentially more points to ‘fill’ a high-dimensional volume, or you might not have any training points “near” a test point
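The need for feature scaling can be illustrated with a minimal sketch (the numbers are made up for illustration): an income feature in the tens of thousands swamps an age feature, so unscaled distances are effectively decided by income alone. Standardizing each column to zero mean and unit variance (a stand-in for e.g. scikit-learn’s `StandardScaler`) puts the features on a comparable footing:

```python
import numpy as np

# Two features on very different scales: income and age (illustrative values)
X = np.array([[50_000., 25.],
              [52_000., 60.],
              [90_000., 26.]])

# Standardize each column to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Before scaling, the income column dominates the distances:
# rows 0 and 1 look "close" even though ages 25 and 60 differ a lot
d01_raw = np.linalg.norm(X[0] - X[1])
d02_raw = np.linalg.norm(X[0] - X[2])

# After scaling, both features contribute comparably
d01_std = np.linalg.norm(X_std[0] - X_std[1])
d02_std = np.linalg.norm(X_std[0] - X_std[2])
```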
5
Q

How do you calculate the mean squared error?

A

The mean squared error (MSE) is calculated by:

MSE = (1/N) ∑_{i=1}^{N} (prediction_i − actual_i)^2

Where:

  • N is the number of observations
  • ∑ is the summation operator
  • actual_i is the i-th actual value
  • prediction_i is the i-th predicted value
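The formula translates directly into code; a minimal sketch (function names are illustrative), with RMSE included since it is just the square root of MSE:

```python
import math

def mse(predictions, actuals):
    """Mean squared error: average of the squared prediction errors."""
    n = len(predictions)
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / n

def rmse(predictions, actuals):
    """Root mean squared error: square root of the MSE,
    in the same units as the target variable."""
    return math.sqrt(mse(predictions, actuals))

# Example: predictions [2, 4, 6] vs. actuals [1, 3, 7]
# errors 1, 1, -1 -> squared 1, 1, 1 -> MSE = 1.0, RMSE = 1.0
```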
6
Q

How do you calculate the Root Mean Squared Error (RMSE)?

A

RMSE = sqrt(MSE), i.e. the square root of the mean squared error, expressed in the same units as the target variable

7
Q

What are some considerations you must have when working with linear regression?

A

The relationship between the dependent and independent variable(s) is not always linear.

It is possible to transform the data so that it has a linear relationship (e.g. a log transform).

Collinearity: Correlation between features

Recommended additional models to study: Ridge and Lasso Regression
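A minimal sketch of the log-transform idea (the data is synthetic and the coefficients are illustrative): an exponential relationship becomes a straight line after taking the log of y, so an ordinary linear fit recovers the parameters exactly.

```python
import numpy as np

# Data with an exponential (non-linear) relationship: y = 2 * e^(0.5 x)
x = np.linspace(0, 5, 20)
y = 2.0 * np.exp(0.5 * x)

# After a log transform the relationship is linear:
# log(y) = log(2) + 0.5 * x
# so a degree-1 least-squares fit recovers slope 0.5 and intercept log(2)
slope, intercept = np.polyfit(x, np.log(y), deg=1)
```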

8
Q

What is Logistic Regression?

A

Logistic Regression is used when the dependent variable is binary (0/1 or “Yes”/“No”).
In other words, it solves binary classification problems, e.g. whether an email is spam or not spam.

9
Q

What does Logistic Regression compute?

A

The Logistic Regression model computes a weighted sum of the input features (plus a bias term), but instead of outputting the result directly (like Linear Regression does), it outputs the logistic of this result
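This computation can be sketched in a few lines (the weights and bias below are illustrative placeholders, not fitted values):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x, weights, bias):
    """Logistic regression output: a weighted sum of the input
    features plus a bias term, passed through the logistic function."""
    return sigmoid(np.dot(weights, x) + bias)
```

For example, if the weighted sum plus bias comes out to 0, the model outputs sigmoid(0) = 0.5, i.e. maximum uncertainty between the two classes.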

10
Q

What is the logistic function?

A

The logistic function is a sigmoid function, which takes any real input and outputs a value between zero and one

11
Q

What is the formula for the logistic function?

A

The logistic function is expressed as the following function:

f(x) = 1/(1 + e^-x)

12
Q

What are some considerations one must have when doing logistic regression?

A
  • Standardization or scaling of data is not needed
  • Very efficient, light computationally
  • Works better with datasets that have many observations
  • There are very many different alternative ML algorithms for binary classification problems (e.g. SVM, Decision trees, Random Forests)
  • There are variants of the logistic regression that support multiclass classification (e.g. Softmax regression / multinomial logistic regression)
13
Q

TRUE OR FALSE: K-Nearest Neighbors can be used with both regression and classification problems

A

TRUE

14
Q

TRUE OR FALSE: Logistic regression is a heavy and not that efficient algorithm for binary classification problems

A

FALSE

15
Q

TRUE OR FALSE: Linear Regression works well when there is a linear relationship between the dependent and independent variable(s)

A

TRUE

16
Q

What are the distance measures commonly used in KNN?

A

Euclidean and Manhattan
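Both distances are straightforward to compute; a minimal sketch (function names are illustrative):

```python
import math

def euclidean(p, q):
    """Euclidean (L2) distance: straight-line distance between points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# For the points (0, 0) and (3, 4):
# euclidean -> sqrt(9 + 16) = 5.0, manhattan -> 3 + 4 = 7
```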

17
Q

Swipe to see the logic behind KNN

A

Given a training set X and a new instance x_new, find the K points in X that are closest to x_new. Using the selected distance measure, predict a label for x_new by majority vote (classification) or by taking the mean (regression)

18
Q

What are some limitations of KNN?

A

All features have equal importance (unlike decision trees), computing distances is expensive with big datasets, and features usually need to be scaled

19
Q

What does regression do and how?

A

Regression makes predictions of continuous variables.

It does so by teaching the model the correlation(s) between one or more independent variables (x) and a dependent variable (Y)

20
Q

Point out some differences between classification and regression.

A
  1. Classification is the task of predicting a discrete class label | Regression is the task of predicting a continuous quantity
  2. In classification, one tries to find the boundary between classes | In regression, one tries to find the line that explains the relationship between variables

Overlap:
  3. A classification algorithm may predict a continuous value in the form of a probability for a class label | A regression algorithm may predict a discrete value in the form of an integer quantity

Evaluation:
  4. Classification predictions can be evaluated using accuracy | Regression predictions can be evaluated using root mean squared error

21
Q

What do you call a relationship between two variables when y increases as x increases?

What do you call a relationship between two variables when y decreases as x increases?

A

Positive

Negative

22
Q

What does collinearity mean and why is it bad in linear regression?

A

Collinearity of features occurs when one feature is highly correlated with another feature in a regression model.

The problem is that it reduces the precision of the estimated coefficients, which weakens the statistical power of the regression model.

E.g. if you regress Y against X and Z (which are highly correlated with each other), the effect of X on Y is hard to distinguish from the effect of Z on Y, because any increase in X tends to be associated with an increase in Z

23
Q

To fit the best line in a linear regression model you use …

A

… least squares

The sum of squared differences (residuals) needs to be minimized to find a line that fits the data as well as possible
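A minimal sketch of least squares in practice (the data is synthetic and noise-free so the fit is exact): `np.polyfit` with degree 1 finds the slope and intercept minimizing the sum of squared residuals.

```python
import numpy as np

# Noise-free data lying exactly on the line y = 2x + 1
x = np.array([0., 1., 2., 3., 4.])
y = 2.0 * x + 1.0

# Least squares finds the slope a and intercept b that minimize
# sum((y_i - (a * x_i + b))^2); np.polyfit solves this directly
slope, intercept = np.polyfit(x, y, deg=1)
```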

24
Q

To fit (and optimize) the sigmoid line in a logistic regression you use …

(unlike linear regression where you use least square)

A

… maximum likelihood estimation

25
Q

How is KNN trained and how do you validate the quality of the model?

A

TRICK QUESTION: for KNN, there is no training step because there is no model to build. In other words, because no model is built, there is nothing to validate. But you can still test, i.e., assess the quality of the predictions using data in which the targets (labels or scores) are concealed from the model