Non-parametric classification Flashcards
Difference between parametric and non-parametric models
Parametric model:
a model parametrization is assumed, and the data are used to find the best values of the parameters
Non-parametric model:
No structure is assumed for the hypothesis set; classification is obtained from other criteria, for example similarity among the data points
Why is non-parametric classification useful?
Sometimes the data are not linearly separable, so linear classification cannot be used; moreover, finding a suitable non-linear transform may be difficult. In these cases, a non-parametric model can be used for classification.
Nearest neighbor: rule
given a dataset
{(x1,y1),…,(xN,yN)}
where yn = ±1 for every n,
to classify a new point x*, find the point in the dataset nearest to x* and assign its label to x*.
This simple method yields a Voronoi tessellation of the regressor space, which is much more complex than the linear division given by the perceptron.
How to define a measure of distance between points
Euclidean distance:
d(x, x’) = || x - x’ ||
Weighted Euclidean distance:
d(x, x’) = (x - x’)’ Q (x - x’), with Q a symmetric positive-definite weighting matrix
Cosine similarity:
cosim(x, x’) = x · x’ / ( ||x|| ||x’|| ) (a similarity rather than a distance: larger values mean the points are more similar)
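A minimal NumPy sketch of these three measures (the points x, xp and the weighting matrix Q below are made-up examples):

import numpy as np

def euclidean(x, xp):
    # Euclidean distance ||x - x'||
    return np.linalg.norm(x - xp)

def weighted_euclidean(x, xp, Q):
    # Weighted distance (x - x')' Q (x - x'), with Q symmetric positive definite
    diff = x - xp
    return diff @ Q @ diff

def cosine_similarity(x, xp):
    # Cosine similarity x . x' / (||x|| ||x'||); larger means more similar
    return (x @ xp) / (np.linalg.norm(x) * np.linalg.norm(xp))

x = np.array([1.0, 2.0])
xp = np.array([2.0, 0.0])
Q = np.diag([1.0, 0.5])   # assumed example weighting matrix
print(euclidean(x, xp), weighted_euclidean(x, xp, Q), cosine_similarity(x, xp))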
Nearest neighbor: algorithm
given x*
- reorder the data according to their distance from x*:
d(x*, x[1]) <= d(x*, x[2]) <= … <= d(x*, x[N])
- the label assigned to x* is
h(x*) = y[1]
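A minimal NumPy sketch of this rule, assuming a made-up toy dataset with labels in {-1, +1}:

import numpy as np

def nearest_neighbor(X, y, x_star):
    # 1-NN: return the label of the training point closest to x_star
    dists = np.linalg.norm(X - x_star, axis=1)   # Euclidean distances d(x_star, x_n)
    return y[np.argmin(dists)]                   # label y[1] of the nearest point

X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]])
y = np.array([-1, -1, +1])
print(nearest_neighbor(X, y, np.array([2.5, 2.5])))   # prints 1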
Nearest neighbor: basic assumptions
- there is a training point near x*
- the target function f(x) is smooth enough, so that the classification of the nearest neighbor is indicative of the classification of x*
k-nearest neighbors: need
Since data are noisy, relying on a single point is not recommended, so a k-NN algorithm can be used
k-nearest neighbors: rule
- it’s a generalization of the nearest-neighbor rule
- h(x*) = sign( sum(i=1,K) y[i](x*) ), where y[i](x*) is the label of the i-th nearest point to x*
→ the label of a new point is that of the majority of its K nearest points (see the sketch below)
K=1 → nearest neighbor
K=N → the label of a new point x* is that of the majority of the points in the dataset
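A minimal NumPy sketch of the k-NN rule, again on a made-up toy dataset (K should be odd to avoid ties in the vote):

import numpy as np

def knn_classify(X, y, x_star, K):
    # h(x_star) = sign of the sum of the labels of the K nearest training points
    dists = np.linalg.norm(X - x_star, axis=1)   # distances to all training points
    nearest = np.argsort(dists)[:K]              # indices of the K closest points
    return np.sign(np.sum(y[nearest]))           # majority vote for labels in {-1, +1}

X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [3.0, 3.0], [3.5, 3.5]])
y = np.array([-1, -1, -1, +1, +1])
print(knn_classify(X, y, np.array([2.8, 2.8]), K=3))   # prints 1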
Theoretical theorem on the choice of K
For N → +∞, if K(N) → +∞ and K(N)/N → 0
then
1) Ein(h) → Eout(h)
2) Eout(h) → Eout*, the minimum achievable out-of-sample error
Practical rule for the choice of K
K(N) = floor( sqrt(N) )
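A one-line sketch of this rule for an assumed dataset size N:

import numpy as np

N = 200                          # assumed number of training points
K = int(np.floor(np.sqrt(N)))    # K(N) = floor(sqrt(N))
print(K)                         # prints 14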
Pros and cons of K-NN
- No training! → no training cost
- High cost at prediction time:
  memory requirement is O(Nd)
  computational complexity is O(Nd + N log K) per query
- Exploratory data analysis deals with the efficient solution of K-NN problems