Week 10 - KNN Flashcards

1
Q

What are features?

A

Features are individual measurable properties or characteristics of data used in machine learning models to make predictions or classifications. They represent the input variables that the model uses to learn patterns and relationships within the data.

2
Q

Give an example of how the KNN algorithm can be used.

A

An example of using the KNN algorithm is classifying a new fruit based on its size and color. By plotting known fruits like grapefruits and oranges on a graph with these attributes, you can classify a new, mysterious fruit by finding its closest neighbors in the dataset. If the majority of these nearest neighbors are oranges, the KNN algorithm would classify the new fruit as an orange.
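This idea can be sketched in a few lines of Python. The fruit measurements and the two features (size in cm, a redness score) below are made up purely for illustration:

```python
import math
from collections import Counter

# Hypothetical training data: (size_cm, redness 0-10) -> label
fruits = [
    ((12.0, 2.0), "grapefruit"),
    ((11.5, 3.0), "grapefruit"),
    ((13.0, 2.5), "grapefruit"),
    ((8.0, 7.0), "orange"),
    ((7.5, 8.0), "orange"),
    ((8.5, 7.5), "orange"),
]

def classify_fruit(point, k=3):
    # Sort known fruits by straight-line distance to the new point
    nearest = sorted(fruits, key=lambda f: math.dist(point, f[0]))[:k]
    # Majority vote among the k closest neighbors
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

mystery = (8.2, 7.2)            # a smallish, reddish fruit
print(classify_fruit(mystery))  # -> orange
```

With k=3, the three nearest known fruits to the mystery point are all oranges, so the vote is unanimous and the new fruit is classified as an orange.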

3
Q

What is KNN?

A

The k-nearest neighbors (KNN) algorithm is a classification method that assigns a label to a new data point based on the majority label of its k closest neighbors in the feature space. It classifies data by comparing the distance between the new point and existing points in the dataset.

4
Q

How can the distance between two points be calculated in relation to KNN?

A

To use KNN, you need to determine what “nearest” means, typically by calculating Euclidean distance. This distance metric measures the straight-line distance between two points in the feature space, helping to identify the closest neighbors.

5
Q

What is the formula for Euclidean distance?

A

For two points $\mathbf{p} = (p_1, p_2)$ and $\mathbf{q} = (q_1, q_2)$ in a 2-dimensional space, the Euclidean distance is calculated using the formula:

$$ \text{Euclidean Distance} = \sqrt{ (p_1 - q_1)^2 + (p_2 - q_2)^2 } $$
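The formula translates directly into code. A minimal sketch, generalised to any number of dimensions (the standard library's `math.dist` computes the same quantity and serves as a check):

```python
import math

def euclidean_distance(p, q):
    # sqrt((p1-q1)^2 + (p2-q2)^2 + ...), one squared term per feature
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

p, q = (1.0, 2.0), (4.0, 6.0)
print(euclidean_distance(p, q))  # -> 5.0 (a 3-4-5 right triangle)
assert euclidean_distance(p, q) == math.dist(p, q)
```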

6
Q

What is KNN for classification?

A

KNN for classification is a method that assigns a category to a new data point based on the majority vote from its k-nearest neighbors in the feature space. The class that appears most frequently among the closest neighbors determines the classification of the new point.

7
Q

What is KNN for regression?

A

KNN for regression predicts a numeric value for a new data point by averaging the target-variable values of its k-nearest neighbors in the feature space.
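A hedged sketch of this in Python, where the finding-the-neighbors step is the same as for classification and only the final aggregation changes to a mean. The house-price data and features are invented for illustration:

```python
import math

def knn_regress(train, point, k=3):
    # train: list of ((feature1, feature2, ...), target_value) pairs
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], point))[:k]
    # Prediction is the mean target value of the k nearest neighbors
    return sum(y for _, y in neighbors) / k

# Hypothetical data: (size_m2, age_years) -> price in thousands
homes = [
    ((50, 30), 200.0),
    ((55, 28), 210.0),
    ((60, 25), 220.0),
    ((120, 5), 500.0),
]

print(knn_regress(homes, (52, 29)))  # -> 210.0, the mean of the 3 closest targets
```

The distant, expensive house contributes nothing to the prediction because it is not among the k nearest neighbors.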

8
Q

What is the difference between KNN for classification and KNN for regression?

A

The difference between KNN for classification and KNN for regression lies in their objectives:

KNN for Classification: Assigns a class label based on the majority vote of the k-nearest neighbors.

KNN for Regression: Predicts a continuous value by averaging the target values of the k-nearest neighbors.

9
Q

What is the importance of feature selection for KNN?

A

Feature selection is crucial for KNN because the choice of features directly affects the algorithm’s performance. Relevant features ensure accurate predictions, while irrelevant or biased features can lead to poor results. Properly selecting features helps in building a model that reflects true preferences or behaviors, enhancing the effectiveness of the KNN algorithm.

10
Q

What does feature extraction involve in data analysis?

A

Feature extraction involves converting an item, such as a fruit or a user, into a list of numbers. These numerical representations can then be compared and analyzed to identify patterns or make predictions.
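One way this conversion might look in code. The features and the color scoring here are made-up assumptions, chosen only to show a raw record becoming a fixed-length numeric vector:

```python
def extract_features(fruit):
    # Hypothetical mapping from a raw record to a numeric feature vector:
    # diameter stays as-is, weight is rescaled, color becomes a score
    color_score = {"green": 0.0, "yellow": 0.5, "orange": 0.8, "red": 1.0}
    return [
        fruit["diameter_cm"],
        fruit["weight_g"] / 100.0,
        color_score[fruit["color"]],
    ]

print(extract_features({"diameter_cm": 8.0, "weight_g": 150, "color": "orange"}))
# -> [8.0, 1.5, 0.8]
```

Once every item is a vector like this, the Euclidean distance between any two items can be computed, which is exactly what KNN needs.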
