Question 1

What are distance based algorithms

Accepted Answer

Distance based algorithms are machine learning algorithms that classify instances by computing distances between these instances and a number of internally stored exemplars.

Question 2

____ that are closest to the instance have the largest influence on the classification assignment to the instance

Accepted Answer

Exemplars

Question 3

What is hamming distance

Accepted Answer

Hamming distance between two strings or vectors of equal length is the number of positions at which the corresponding symbols are different. 110 and 101 hamming distance 2.

Question 4

What is 0-norm,1-norm and 2-norm give examples.

Accepted Answer

0. Hamming distance 1. Manhattan distance 2. Euclidean distance

Question 5

What is the Chebyshev distance

Accepted Answer

The Chebyshev distance, also known as the maximum or L∞ distance, between two points in a space is the greatest of their differences along any coordinate dimension.

Question 6

what are the 4 distance metric conditions

Accepted Answer

1. Distances between a point and itself are 0 2. All other distances are larger than zero if x!=y. 3. Distances are symmetric 4. Detours can not shorten the distance (tringle inequality)

Question 7

What are exemplars

Accepted Answer

Exemplars are prototypical instances within clusters/classes

Question 8

What are the two exemplars

Accepted Answer

1. Centroid 2. Medoid

Question 9

Which one happens in data and which one doesnt

Accepted Answer

Centroids do not happen in data Medoids happen in data, more time consuming to calculate

Question 10

Since the number of classes is typically much lower than the number of exemplars, decision rules often take more than one nearest exemplar into account (or k-nearest exemplar)

Accepted Answer

true

Question 11

What is the curse of dimentionality

Accepted Answer

In high-dimentional spaces everything is far away from everything and so pairwise distances are uninformative

Question 12

list nearest-neighbour classifier properties

Accepted Answer

1. Nearly perfect seperation of classes on the training set 2. Easily adapted to real-valued targets and structured objects 3. Unbalanced complexity 4. Perfect seperates training data - by increasing the number of neighbours k we would increase bias and decrease variance

Question 13

what are 2 types of distance-based clustering

Accepted Answer

Predictive clustering Descriptive clustering

Question 14

What is predictive clustering

Accepted Answer

use a distance metric, a way to construct exemplars and a distance-based decision rule to create clusters that are compact with respect to the distance metric

Question 15

what is descriptive clustering

Accepted Answer

trees called dendrograms purely defined in terms of a distance measure are used. they partition the given data rather than the entire instance space.

Lecture 5 - Distance based models Flashcards

(25 cards)