Lecture 2 Flashcards

1
Q

k-Nearest Neighbors (k-NN)

A

Given a set of labeled instances (the training set), a new instance (from the test set) is classified according to the majority label of its k nearest neighbors in the training set
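
A minimal sketch of this rule in Python/NumPy, assuming numeric features and plain Euclidean distance (variable names are illustrative, not from the lecture):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by the majority label of its k nearest training instances."""
    dists = np.linalg.norm(X_train - x_new, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]                   # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]  # majority vote

# Toy example: two classes in 2-D
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array(["A", "A", "B", "B"])
print(knn_predict(X_train, y_train, np.array([0.1, 0.2]), k=3))  # -> A
```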

2
Q

Decision Boundary

A

A model of the separation between two classes. It can be a straight or a wiggly line

3
Q

What is the complexity of the k-NN model proportional to?

A

The complexity is proportional to the wiggliness of the decision boundary. The more complex the model, the more wiggly the boundary

4
Q

What does a model do with data in a classification problem?

A

In classification, a model trained from data defines a decision boundary that separates the data

5
Q

What does a model do with data in a regression problem?

A

In regression, a model fits the data to describe the relation between (i) two features or (ii) a feature and the label

6
Q

What happens when there is an equal number of classes of the nearest neighbors (a tie) or two neighbors are equidistant to the new data point?

A

Either the class assigned to the new point is chosen at random, or the value of k is changed to break the class tie or the distance tie

7
Q

What is the label (class) of a point on the decision boundary?

A

It’s ambiguous

8
Q

Uniform Weighted k-NN

A

Each of the k nearest neighbors gets an equal vote; the majority class among them determines the class of the new data point

9
Q

Distance Weighted k-NN

A

Each neighbor has a weight based on its distance to the new data point

10
Q

Inverse Distance Weighted k-NN

A

Each neighbor has a weight based on the inverse of its distance to the new data point, so closer neighbors have a higher vote
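
A sketch of the inverse-distance vote, assuming Euclidean distance and a small epsilon to avoid dividing by zero when a neighbor coincides with the query point (names are illustrative):

```python
import numpy as np

def knn_predict_inverse_weighted(X_train, y_train, x_new, k=3, eps=1e-12):
    """Each of the k nearest neighbors votes with weight 1 / distance."""
    dists = np.linalg.norm(X_train - x_new, axis=1)
    votes = {}
    for i in np.argsort(dists)[:k]:
        w = 1.0 / (dists[i] + eps)                 # closer neighbors get a larger vote
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + w
    return max(votes, key=votes.get)
```

Replacing the weight 1/distance with a constant 1 gives back the uniform-weighted rule of card 8.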

11
Q

What are two types of kernel functions?

A

Gaussian kernel (bell curve) and tricube kernel
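
Sketches of the two weighting functions; the tricube version assumes distances are normalized by the distance to the farthest neighbor considered (d_max), which is one common convention:

```python
import numpy as np

def gaussian_kernel(d, sigma=1.0):
    """Bell-curve weight: close to 1 for small distances, decays smoothly."""
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

def tricube_kernel(d, d_max):
    """Weight (1 - |d/d_max|^3)^3 inside the neighborhood, 0 outside."""
    u = np.abs(d / d_max)
    return np.where(u <= 1, (1 - u ** 3) ** 3, 0.0)
```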

12
Q

Euclidean distance

A

The length of the straight line between two points (the square root of the sum of squared coordinate differences)

13
Q

Manhattan distance

A

The sum of the distances between the projections onto each axis, i.e. the sum of absolute coordinate differences (you can't walk through walls/buildings, you have to go around them)
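
The two distances side by side (a toy example, not from the lecture):

```python
import numpy as np

def euclidean(a, b):
    """Straight-line distance: square root of the sum of squared differences."""
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):
    """Grid distance: sum of the absolute differences along each axis."""
    return np.sum(np.abs(a - b))

a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
print(euclidean(a, b))  # 5.0 (straight line)
print(manhattan(a, b))  # 7.0 (3 blocks one way + 4 blocks the other)
```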

14
Q

When does a k-NN model have a danger of overfitting?

A

When k is too low; the model has a high complexity

15
Q

When does a k-NN model have a danger of underfitting?

A

When k is too high; the model has a low complexity

16
Q

How do you determine the model complexity?

A

Depends on the complexity of the separation between classes. Start with the simplest model (large k) and increase complexity (smaller k)
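
A minimal sketch of that search using scikit-learn's k-NN classifier on synthetic data; the dataset, the split, and the list of k values are only illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Start simple (large k) and move toward more complex models (smaller k),
# keeping the k that does best on the held-out validation set.
for k in [51, 25, 11, 5, 3, 1]:
    acc = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).score(X_val, y_val)
    print(k, round(acc, 3))
```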

17
Q

How do you choose k?

A

Typically an odd number k for an even number of classes (to avoid ties). The data miner's rule of thumb is k = sqrt(n), where n is the number of training instances
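
The rule of thumb in code (the training-set size is made up for the example):

```python
import math

n_train = 400                         # hypothetical number of training instances
k = int(round(math.sqrt(n_train)))    # sqrt(n) rule of thumb -> 20
if k % 2 == 0:
    k += 1                            # prefer an odd k to avoid ties
print(k)                              # 21
```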

18
Q

Nearest centroid classification

A

For each class, compute a centroid: the mean of that class's training samples. A new sample is compared to each class centroid, and the class whose centroid is closest in squared distance is the predicted class for that sample.
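
A minimal sketch of this rule (function names are illustrative; scikit-learn ships the same idea as NearestCentroid):

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x_new):
    """Assign x_new to the class whose centroid is closest in squared distance."""
    classes = np.unique(y_train)
    # Per-class centroid = mean of that class's training samples
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    sq_dists = np.sum((centroids - x_new) ** 2, axis=1)
    return classes[np.argmin(sq_dists)]
```
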
19
Q

Nearest shrunken centroid classification

A

"Shrinks" each class centroid toward the overall centroid of all classes by an amount called the threshold. For each coordinate, the difference between the class centroid and the overall centroid is moved toward zero by the threshold, and set to zero if it would cross zero.

After shrinking the centroids, the new sample is classified by the usual nearest centroid rule, but using the shrunken class centroids.
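
A sketch of the shrinkage step only, as described above (the full method, e.g. Tibshirani's nearest shrunken centroids, also standardizes the differences by within-class variability; scikit-learn exposes this idea via NearestCentroid's shrink_threshold parameter):

```python
import numpy as np

def shrunken_centroids(X_train, y_train, threshold):
    """Shrink each class centroid toward the overall centroid by `threshold`."""
    classes = np.unique(y_train)
    overall = X_train.mean(axis=0)
    shrunk = []
    for c in classes:
        diff = X_train[y_train == c].mean(axis=0) - overall
        # soft-threshold: move each component toward zero, clamp at zero
        diff = np.sign(diff) * np.maximum(np.abs(diff) - threshold, 0.0)
        shrunk.append(overall + diff)
    return classes, np.array(shrunk)
```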

20
Q

k-NN Advantages

A

• The cost of the learning process is zero
• No assumptions about the characteristics of the concepts to learn have to be made
• Complex concepts can be learned by local approximation using simple procedures

21
Q

k-NN Disadvantages

A

• The model cannot be interpreted (there is no description of the learned concepts)
• It is computationally expensive to find the k nearest neighbors when the dataset is very large
• Performance depends on the number of dimensions we have (curse of dimensionality)

22
Q

Curse of Dimensionality

A

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.

For example: when the dimensionality increases, the volume of the space increases so fast that the available data become sparse.
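
A small numerical illustration of that sparsity effect, assuming points drawn uniformly from the unit hypercube: as the dimension grows, the nearest and farthest neighbors of a point end up almost equally far away.

```python
import numpy as np

rng = np.random.default_rng(0)
for dim in [2, 10, 100, 1000]:
    X = rng.random((1000, dim))                  # 1000 random points in [0, 1]^dim
    d = np.linalg.norm(X - X[0], axis=1)[1:]     # distances from the first point
    print(dim, round(d.min() / d.max(), 3))      # ratio creeps toward 1 as dim grows
```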

23
Q

When does the k-NN algorithm require more computation?

A

The k-NN algorithm requires more computation for testing than for training.

24
Q

What does training the kNN algorithm consist of?

A

Only storing the training data

25
Q

What does testing the kNN algorithm consist of?

A

Testing involves comparing every test instance to all the instances in the training set and calculating which training instances are closest, before assigning a class label.

26
Q

Is the kNN algorithm used for classification or regression?

A

It can be used for both classification and regression.

27
Q

What is the relationship between k and the complexity of the model?

A

As you increase k, the model gets less complex (risk of overfitting decreases).

28
Q

If the model performs well on the training data but poorly on the test data, what's the issue?

A

The model is overfitting

29
Q

Can k in kNN be negative or a float?

A

No, k is always a positive integer