Lec1 - Introduction Flashcards

1
Q

What are the three main types of Machine Learning?

A

Supervised Learning
Unsupervised Learning
Reinforcement Learning

2
Q

What are the three types of labels in Supervised Learning?

A

Categorical (Classification)
Real-valued Scalar (Regression)
Ordinal (Ordinal Regression)

3
Q

What are the two ways of performing Unsupervised Learning?

A

Dimensionality Reduction

Clustering

4
Q

Name the main problems in Machine Learning.

A

Classification
Regression
Clustering
Dimensionality Reduction
Density Estimation
Policy Search
5
Q

Describe the K-Nearest Neighbours Algorithm.

A

Given a new data point, we consider its k nearest neighbours (in feature space) and assign the majority class among them (e.g. if k = 5 and the neighbours comprise 3 instances of class 2 and 2 of class 1, the new point is labelled class 2).

The distance metric used is usually the L2 norm (Euclidean distance).
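A minimal sketch of this procedure in Python with NumPy (names are illustrative, not from the lecture):

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x_q, k=5):
        # L2 distances from the query point to every training point
        dists = np.linalg.norm(X_train - x_q, axis=1)
        # indices of the k nearest neighbours
        nearest = np.argsort(dists)[:k]
        # majority vote over the neighbours' classes
        return Counter(y_train[nearest]).most_common(1)[0][0]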

6
Q

Give the formula for the Euclidean distance between a query point x_q and a training point x_i.

A

d(x_i, x_q) = sqrt( Sum_g ( a_g(x_i) - a_g(x_q) )^2 ), where a_g(x) denotes the g-th attribute (feature) of x.
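A quick numerical sketch of the formula (the values are made up for illustration):

    import numpy as np

    x_i = np.array([1.0, 2.0, 3.0])
    x_q = np.array([4.0, 6.0, 3.0])
    # sqrt((1-4)^2 + (2-6)^2 + (3-3)^2) = sqrt(9 + 16 + 0) = 5.0
    d = np.sqrt(np.sum((x_i - x_q) ** 2))  # same as np.linalg.norm(x_i - x_q)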

7
Q

Give the names of three different distance metrics.

A

Manhattan (L1 Norm)
Euclidean (L2 Norm)
Chebyshev (L-Infinity Norm)
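All three are norms of the difference vector; a small sketch comparing them on the same pair of points (illustrative values):

    import numpy as np

    diff = np.array([1.0, 2.0]) - np.array([4.0, 6.0])  # (-3, -4)
    manhattan = np.linalg.norm(diff, ord=1)       # |-3| + |-4| = 7
    euclidean = np.linalg.norm(diff, ord=2)       # sqrt(9 + 16) = 5
    chebyshev = np.linalg.norm(diff, ord=np.inf)  # max(3, 4)   = 4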

8
Q

Describe the influence of a small/large k in the KNN algorithm.

A

Small k:
Fine resolution of the boundary between classes, but highly sensitive to noise.

Large k:
Coarse resolution of the boundary between classes, but quite robust to noise.

9
Q

Describe the Distance-Weighted KNN Algorithm.

A

Similar to KNN, but we now assign a weight w_r to each neighbour x_r of the query x_q based on the distance d(x_r, x_q), such that a smaller distance implies a larger weight and vice versa.

Common weighting schemes include the inverse of the distance and a Gaussian of the distance.
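A sketch of the inverse-distance variant, extending the plain KNN sketch above (the small epsilon guarding against division by zero is an assumption; names are illustrative):

    import numpy as np

    def weighted_knn_predict(X_train, y_train, x_q, k=5):
        dists = np.linalg.norm(X_train - x_q, axis=1)
        nearest = np.argsort(dists)[:k]
        # inverse-distance weights: closer neighbours count more
        w = 1.0 / (dists[nearest] + 1e-12)
        labels = y_train[nearest]
        classes = np.unique(labels)
        # sum the weights per class; predict the class with the largest total
        scores = np.array([w[labels == c].sum() for c in classes])
        return classes[int(np.argmax(scores))]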

10
Q

What does global and local method mean in the Distance-Weighted KNN Algorithm?

A

If k = n, where n is the total number of previously observed instances, the algorithm is called a global method. Otherwise, if k < n, it is called a local method.

11
Q

Give an advantage and disadvantage of Distance-Weighted KNN as well as a remedy for the disadvantage.

A

Advantage – the distance-weighted k-NN algorithm is robust to noisy training data: the classification is based on a weighted combination of all k nearest neighbours, which effectively smooths out the impact of isolated noisy training examples.

Disadvantage – all k-NN algorithms calculate the distance between instances based on all features, so if there are many irrelevant features, instances that belong to the same class may still be distant from one another.

Remedy – weight each feature differently when calculating the distance between two instances, as in the sketch below.
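One way to realise the remedy, assuming per-feature weights have already been chosen (e.g. by cross-validation; the weights here are illustrative):

    import numpy as np

    def weighted_distance(x_i, x_q, feature_weights):
        # irrelevant features get small weights and barely affect the distance
        return np.sqrt(np.sum(feature_weights * (x_i - x_q) ** 2))

    # the second feature is judged irrelevant and almost ignored
    d = weighted_distance(np.array([1.0, 9.0]), np.array([2.0, 0.0]),
                          np.array([1.0, 0.01]))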

12
Q

Describe the Curse of Dimensionality

A

In high-dimensional feature spaces, the nearest neighbours are usually not very near. As the number of dimensions grows, the number of data points needed to capture the same structure (or cover the same area/volume) grows exponentially. For example, 100 points might suffice for a regression in R^2, but many more would be needed in R^10.
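A small numerical illustration of this effect (an assumed setup, not from the lecture): for points drawn uniformly in the unit hypercube, the nearest and farthest neighbours become almost equally distant as the dimension grows:

    import numpy as np

    rng = np.random.default_rng(0)
    for dim in (2, 10, 100):
        X = rng.random((1000, dim))               # 1000 uniform points in [0, 1]^dim
        d = np.linalg.norm(X - X[0], axis=1)[1:]  # distances from the first point
        # this ratio approaches 1 in high dimensions: "nearest" is barely nearer
        print(dim, d.min() / d.max())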

13
Q

How would you use KNN for regression?

A

Instead of taking a majority vote over the neighbours' classes, we predict the mean (or distance-weighted mean) of the k nearest neighbours' target values.
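A sketch of the regression variant, reusing the setup from the classification sketch (names are illustrative):

    import numpy as np

    def knn_regress(X_train, y_train, x_q, k=5):
        dists = np.linalg.norm(X_train - x_q, axis=1)
        nearest = np.argsort(dists)[:k]
        # average the neighbours' real-valued targets instead of voting
        return y_train[nearest].mean()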
14
Q

Give the definition of Lazy Learning and Eager Learning, and give an example of an algorithm for each.

A

Lazy Learning:
Stores the training data; generalising beyond it is postponed until an explicit request (query) is made. Example: KNN

Eager Learning:
Constructs a general, explicit description of the target function from the provided training examples. Examples: Neural Networks, Decision Trees

15
Q

Give advantages and disadvantages of Lazy Learning and Eager Learning

A

Lazy Learning:
Advantages:
-Very suitable for complex and incomplete problem domains.
-Most useful for large datasets with few attributes.

Disadvantages:
-Large space requirement, since the entire training dataset is stored.
-Usually long query time.

Eager Learning:
Advantages:
-Better memory efficiency.
-Usually low query time.
-Usually deals better with noisy training datasets.

Disadvantages:
-Generally unable to provide good local approximations of the target function.
