Chapter 7 - Distance-Based Methods Flashcards

1
Q

Give the k-nearest-neighbour (KNN) algorithm.

A

For each test sample x:
1. Find the k training examples most similar to x (closest under the chosen distance).
2. Predict the label of x to be the most common label among those k examples.
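
The two steps in this answer can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, the toy data, and the use of Euclidean distance are all assumptions made for the example, since the card does not fix a distance measure.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Assumption: Euclidean distance. Any other distance function could be swapped in.
    dists = [(math.dist(row, x), label) for row, label in zip(train_X, train_y)]
    # Order the training points from closest to farthest.
    dists.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.5, 4.5)]
train_y = ["a", "a", "b", "b"]
print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
```

With k=3 the query point's neighbours are the two "a" points and one "b" point, so the vote goes to "a".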

2
Q

What is Hamming distance?

A

A distance measure for categorical data.

3
Q

How is Hamming distance calculated?

A

It counts the positions at which the two vectors have unequal entries.
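
As a small sketch of this counting rule, the function below compares two equal-length sequences position by position. The name `hamming_distance` and the example records are hypothetical, chosen only for illustration.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # Each comparison is True (1) when the entries differ, False (0) otherwise.
    return sum(x != y for x, y in zip(a, b))


# Hypothetical categorical records: (colour, size, shape).
first = ("red", "small", "round")
second = ("red", "large", "round")
print(hamming_distance(first, second))  # only the size differs
```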

4
Q

What is the main disadvantage of KNN?

A

It is computationally intensive: every prediction requires a distance calculation to each training example.

5
Q

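Give two modifications to KNN.

A

Condensed KNN (store only a subset of the training examples)

Octree data structure (a spatial index that speeds up the neighbour search)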

6
Q

Which two feature properties can severely affect KNN?

A

Scaling of the features

Presence of irrelevant features
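
The effect of feature scaling can be seen in a small sketch. Everything here is an assumption made for illustration: the data (age in years, income in dollars) is invented, Euclidean distance is assumed, and min-max scaling is just one of several common rescaling choices.

```python
import math


def min_max_scale(rows):
    """Rescale every feature (column) to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def nearest(rows, i):
    """Index of the row closest to rows[i] under Euclidean distance."""
    others = [j for j in range(len(rows)) if j != i]
    return min(others, key=lambda j: math.dist(rows[i], rows[j]))


# Hypothetical records: (age in years, income in dollars).
rows = [(25, 50_000), (65, 51_000), (27, 54_000), (45, 90_000)]

print(nearest(rows, 0))                 # income dominates the raw distance
print(nearest(min_max_scale(rows), 0))  # both features contribute after scaling
```

On the raw data, the nearest neighbour of the first record is the 65-year-old with a similar income, because income differences are numerically far larger than age differences. After rescaling, the 27-year-old becomes the nearest neighbour.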

7
Q

A large value of k in a KNN model is likely to …fit

A

under

8
Q

What does the triangle inequality say?

A

The sum of the lengths of any two sides of a triangle is greater than or equal to the length of the third side.

9
Q

Give the triangle inequality.

A

||x + y|| <= ||x|| + ||y||
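
The norm form above connects to the "sides of a triangle" statement once a distance is defined from the norm. As a short worked derivation, assume d(x, y) = ||x - y|| (this definition is an assumption added here, not part of the card) and apply the inequality to the vectors x - y and y - z:

```latex
\begin{align*}
d(x, z) &= \lVert x - z \rVert \\
        &= \lVert (x - y) + (y - z) \rVert \\
        &\le \lVert x - y \rVert + \lVert y - z \rVert \\
        &= d(x, y) + d(y, z)
\end{align*}
```

So going from x to z directly is never longer than going via a third point y.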

10
Q

Give three problems with distance-based methods.

A

Tractability: every training point requires a distance calculation, and the distances must then be ordered

Scaling: the scales of the features make a big difference to the distances

Meaningfulness: does distance make sense for the data?
