Week 3 Flashcards

1
Q

Nearest Neighbor Algorithm

A
  • Nearest neighbor classifiers classify unlabeled cases based on their similarity to labeled cases
  • for example, if users A and B exhibit the same purchasing behavior, the items purchased by A are also recommended to user B (see the sketch below)
  • nearest neighbor algorithms are used for
    • recommendations
    • identifying patterns in data
  • best suited when the features are of a homogeneous type
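A minimal sketch of the recommendation idea above, assuming a toy NumPy purchase matrix (the users, items, and counts are all invented for illustration):

  import numpy as np

  # Rows = users, columns = items; values are purchase counts.
  purchases = np.array([
      [2, 0, 1, 3],   # user A
      [2, 0, 1, 0],   # user B
      [0, 4, 0, 1],   # user C
  ])

  def recommend(user_idx, purchases):
      """Recommend items the most similar other user bought but this user has not."""
      target = purchases[user_idx]
      dists = np.linalg.norm(purchases - target, axis=1)  # similarity by distance
      dists[user_idx] = np.inf            # exclude the user themselves
      neighbor = int(np.argmin(dists))    # nearest neighbor by behavior
      return np.where((purchases[neighbor] > 0) & (target == 0))[0]

  print(recommend(1, purchases))  # -> [3]: user A also bought item 3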
2
Q

K-NN

A
  • k-NN stands for the k-Nearest Neighbors algorithm
  • a simple machine learning algorithm
3
Q

Strengths of KNN

A
  • simple and effective
  • makes no assumptions about the underlying data distribution
  • fast training phase
4
Q

Weakness of KNN

A
  • does not produce a model, limiting the ability to understand how the features are related to the class
  • requires selection of an appropriate k
  • slow classification phase
  • nominal features and missing data require additional processing
5
Q

Letter K in K-NN Algorithm

A
  • the letter k is a variable specifying the number of neighbors to use
  • for each unlabeled record in the test dataset, k-NN identifies the k records in the training data that are “nearest” in similarity (a minimal sketch follows below)
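A minimal sketch of that lookup, assuming NumPy arrays X_train and y_train (hypothetical names) hold the labeled training data:

  import numpy as np
  from collections import Counter

  def knn_predict(X_train, y_train, x_new, k=3):
      """Label x_new by majority vote among its k nearest training records."""
      # Euclidean distance from x_new to every training record
      dists = np.linalg.norm(X_train - x_new, axis=1)
      nearest = np.argsort(dists)[:k]          # indices of the k closest records
      votes = Counter(y_train[i] for i in nearest)
      return votes.most_common(1)[0][0]        # the most common neighboring label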
6
Q

Euclidean Distance

A
  • Euclidean distance is used to measure the similarity between two instances:
    dist(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2)
  • where p and q are the examples to be compared, each having n features
  • the term p1 refers to the value of the first feature of example p, while q1 refers to the value of the first feature of example q (see the function below)
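The same formula as a small Python function (the sample points are just an illustration):

  import math

  def euclidean_distance(p, q):
      """dist(p, q) = sqrt((p1 - q1)^2 + ... + (pn - qn)^2)"""
      return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

  print(euclidean_distance([1, 2], [4, 6]))  # -> 5.0 (the classic 3-4-5 triangle)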
7
Q

Choosing a value for K in K-NN

A
  • the value of k determines how well the model will generalize to future data
  • the balance between overfitting and underfitting the training data is a problem known as the bias-variance tradeoff
  • choosing a large k reduces the impact of variance caused by noisy data, but can bias the learner so that it runs the risk of ignoring small but important patterns
  • choosing a small k lets noisy data have a higher influence on the result (see the sketch below)
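One way to see the tradeoff empirically, assuming scikit-learn is available (the iris dataset and the particular k values are just illustrative choices):

  from sklearn.datasets import load_iris
  from sklearn.model_selection import cross_val_score
  from sklearn.neighbors import KNeighborsClassifier

  X, y = load_iris(return_X_y=True)

  # A small k tracks noise (high variance); a large k smooths over
  # local patterns (high bias). Cross-validation exposes the tradeoff.
  for k in (1, 5, 15, 51):
      model = KNeighborsClassifier(n_neighbors=k)
      score = cross_val_score(model, X, y, cv=5).mean()
      print(f"k={k:>2}: mean accuracy = {score:.3f}")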
8
Q

Commonly Used Technique for Choosing k

A

k = sqrt(N), where N stands for the number of samples in your training dataset
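A quick sketch of the rule with a hypothetical N; rounding up to an odd k is a common extra step (it avoids tied votes in binary problems), not part of the card:

  import math

  n_samples = 400                     # hypothetical training-set size
  k = round(math.sqrt(n_samples))     # k = sqrt(N) rule of thumb -> 20
  if k % 2 == 0:
      k += 1                          # prefer an odd k to avoid voting ties
  print(k)                            # -> 21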
