MODULE 2 S2.1 Flashcards

k-NN

1
Q

Strengths / Advantages of k-NN

A
  • Easy to understand
  • Often works well without any special adjustments
  • Suitable as a first, baseline model
2
Q

When considering more than one neighbor, we use _________ to assign a label.

A

voting

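A minimal sketch of this voting step, using scikit-learn's KNeighborsClassifier (the training points below are invented for illustration):

  from sklearn.neighbors import KNeighborsClassifier

  # Toy 1-D training set (illustrative values only)
  X_train = [[0.0], [1.0], [2.0], [9.0], [10.0]]
  y_train = [0, 0, 0, 1, 1]

  # With k=3, the three nearest neighbors vote on the label
  clf = KNeighborsClassifier(n_neighbors=3)
  clf.fit(X_train, y_train)
  print(clf.predict([[1.5]]))  # neighbors 0.0, 1.0, 2.0 all vote class 0 -> [0]
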
3
Q

Building the model consists only of storing the training dataset.

A

k-Nearest Neighbors (k-NN)

4
Q

We import the _____________ class for the k-NN regression variant.

A

KNeighborsRegressor

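A minimal sketch of the regression variant (training data invented for illustration); the prediction is the mean of the targets of the k nearest neighbors:

  from sklearn.neighbors import KNeighborsRegressor

  # Toy 1-D regression data (illustrative values only)
  X_train = [[0.0], [1.0], [2.0], [3.0]]
  y_train = [0.0, 1.0, 2.0, 3.0]

  reg = KNeighborsRegressor(n_neighbors=2)
  reg.fit(X_train, y_train)
  print(reg.predict([[1.4]]))  # mean of the two nearest targets (1.0, 2.0) -> [1.5]
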
5
Q

Weaknesses / Disadvantages of k-NN

A
  • If the number of features or samples is large, prediction is slow
  • Careful data preprocessing is important
  • Does not work well with sparse datasets (where most feature values are 0)
6
Q

In k-NN, to make a prediction for a new data point, the algorithm finds the closest data points in the training dataset—its ________________

A

nearest neighbors

7
Q

T/F In its SIMPLEST version, the k-NN algorithm only considers exactly one nearest neighbor, which is the closest training data point to the point we want to make a prediction for.

A

TRUE

8
Q

The R^2 score, also known as the _______________

A

Coefficient of Determination

9
Q

It is the distance measure k-NN uses by default.

A

Euclidean distance

10
Q

It is arguably the simplest machine learning algorithm.

A

k-Nearest Neighbors (k-NN)

11
Q

T/F In its SIMPLEST version, the k-NN algorithm can consider more than one nearest neighbor.

A

FALSE (exactly 1)

12
Q

It is a measure of goodness of fit for a regression model, and usually yields a score between 0 and 1.

A

R^2 score (coefficient of determination)

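In scikit-learn, the score method of KNeighborsRegressor returns this R^2 value; a minimal sketch with made-up numbers:

  from sklearn.neighbors import KNeighborsRegressor

  # Toy data: train on three points, test on two in-between points
  X_train, y_train = [[0.0], [2.0], [4.0]], [0.0, 1.9, 3.9]
  X_test, y_test = [[1.0], [3.0]], [1.1, 3.2]

  reg = KNeighborsRegressor(n_neighbors=2).fit(X_train, y_train)
  print(reg.score(X_test, y_test))  # R^2 close to 1.0 here -> a good fit
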
13
Q

T/F Instead of considering only the closest neighbor, we can also consider an arbitrary number, k, of neighbors.

A

TRUE

14
Q

Parameters of the k-NN Classifier

A
  • The number of neighbors (k)
  • How you measure distance between data points
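Both parameters map directly onto scikit-learn's constructor arguments; a minimal sketch (the values are illustrative):

  from sklearn.neighbors import KNeighborsClassifier

  # n_neighbors sets k; metric sets the distance measure.
  # The default metric is 'minkowski' with p=2, i.e. Euclidean distance.
  clf = KNeighborsClassifier(n_neighbors=5, metric='euclidean')
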
15
Q

T/F Predicting worse than the average of the targets can result in a negative R^2 score.

A

TRUE

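A quick numeric check of this with sklearn.metrics.r2_score (numbers made up for illustration): predicting the mean everywhere gives R^2 = 0, and predicting worse than that drives R^2 below zero.

  from sklearn.metrics import r2_score

  y_true = [1.0, 2.0, 3.0]
  print(r2_score(y_true, [2.0, 2.0, 2.0]))  # predicts the mean -> 0.0
  print(r2_score(y_true, [5.0, 5.0, 5.0]))  # worse than the mean -> -13.5
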
16
Q

T/F In k-NN, High Model Complexity is underfitting.

A

FALSE

17
Q

T/F k-NN makes a prediction for a new data point by finding exact matches in the training dataset.

A

FALSE (it finds the closest points, not exact matches)

18
Q

In k-NN, Low Model Complexity is:

A

Underfitting

19
Q

T/F In k-NN, when you choose a small value of k (e.g., k=1), the model becomes more complex.

A

TRUE
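
A minimal sketch of that relationship, sweeping n_neighbors on a built-in scikit-learn dataset (the dataset and split are just for illustration):

  from sklearn.datasets import load_breast_cancer
  from sklearn.model_selection import train_test_split
  from sklearn.neighbors import KNeighborsClassifier

  X, y = load_breast_cancer(return_X_y=True)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  # k=1 is the most complex model: it memorizes the training set
  # (perfect training accuracy) but may generalize worse than larger k
  for k in (1, 3, 15):
      clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
      print(k, clf.score(X_train, y_train), clf.score(X_test, y_test))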

20
Q

T/F There is a regression variant of the k-nearest neighbors algorithm.

A

TRUE

21
Q

In k-NN, High Model Complexity is:

A

Overfitting

22
Q

T/F The ‘k’ in k-nearest neighbors refers to the new closest data point.

A

FALSE (k is the number of neighbors considered)

23
Q

T/F In k-NN, Euclidean distance is the distance measure used by default.

A

TRUE