Lec1 - Introduction Flashcards
What are the three main types of Machine Learning?
Supervised Learning
Unsupervised Learning
Reinforcement Learning
What are the three types of labels in Supervised Learning?
Categorical
Real-valued Scalar (Regression)
Ordinal Regression
What are the two ways of performing Unsupervised Learning?
Dimensionality Reduction
Clustering
Name the main problems in Machine Learning.
Classification
Regression
Clustering
Dimensionality Reduction
Density Estimation
Policy Search
Describe the K-Nearest Neighbours Algorithm.
Given a new data point, we consider its k nearest neighbours (in feature space) and assign the majority class (e.g. if k = 5 and we have 3 neighbours of class 2 and 2 of class 1, the instance is labelled class 2).
The distance metric used is usually the L2 norm (Euclidean distance).
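A minimal Python sketch of this procedure (numpy-based; the function name and the toy data are illustrative, not from the lecture):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_q, k=5):
    # Euclidean (L2) distances from the query x_q to every training point
    dists = np.linalg.norm(X_train - x_q, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(dists)[:k]
    # Majority vote over their class labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6]])
y_train = np.array([1, 1, 1, 2, 2])
print(knn_classify(X_train, y_train, np.array([0.5, 0.5]), k=3))  # -> 1
```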
Give the formula for the Euclidean distance between an instance point x_q and a training data point x_i.
d(x_i, x_q) = sqrt( Σ_g ( a_g(x_i) - a_g(x_q) )^2 ), where a_g(x) denotes the g-th attribute of x.
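The sum over attributes a_g maps directly onto a vector operation in code; a small illustrative check (assuming numpy):

```python
import numpy as np

def euclidean(x_i, x_q):
    # square root of the summed squared per-attribute differences
    return np.sqrt(np.sum((x_i - x_q) ** 2))

print(euclidean(np.array([1.0, 2.0]), np.array([4.0, 6.0])))  # sqrt(9 + 16) = 5.0
```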
Give the names of three different distance metrics.
Manhattan (L1 Norm)
Euclidean (L2 Norm)
Chebyshev (L-Infinity Norm)
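Each is a one-liner with numpy; an illustrative comparison on the same pair of points:

```python
import numpy as np

x, y = np.array([1.0, 2.0]), np.array([4.0, 6.0])

print(np.sum(np.abs(x - y)))          # Manhattan / L1: |1-4| + |2-6| = 7
print(np.sqrt(np.sum((x - y) ** 2)))  # Euclidean / L2: sqrt(9 + 16) = 5
print(np.max(np.abs(x - y)))          # Chebyshev / L-infinity: max(3, 4) = 4
```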
Describe the influence of a small/large k in the KNN algorithm.
Small k:
Good resolution of the boundary between the classes, but highly sensitive to noise.
Large k:
Poor resolution of the boundary between the classes, but quite robust to noise.
Describe the Distance-Weighted KNN Algorithm.
Similar to KNN, but now we assign a weight w_r to each neighbour x_r of the instance x_q based on the distance d(x_r, x_q), such that a smaller distance implies a larger weight and vice versa.
Common weighting schemes include the inverse of the distance or a Gaussian kernel.
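A minimal sketch using the inverse-distance weighting mentioned above (names are illustrative; eps is an assumption added to guard against division by zero on exact matches):

```python
import numpy as np

def weighted_knn_classify(X_train, y_train, x_q, k=5, eps=1e-9):
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]
    labels = y_train[nearest]
    # Inverse-distance weights: closer neighbours contribute more
    weights = 1.0 / (dists[nearest] + eps)
    # Sum the weights per class and pick the class with the largest total
    totals = {c: weights[labels == c].sum() for c in np.unique(labels)}
    return max(totals, key=totals.get)
```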
What does global and local method mean in the Distance-Weighted KNN Algorithm?
If k = n, where n is the total number of previously observed instances, we call the algorithm a global method. Otherwise, if k < n, the algorithm is called a local method.
Give an advantage and disadvantage of Distance-Weighted KNN as well as a remedy for the disadvantage.
Advantage – the distance-weighted k-NN algorithm is robust to noisy training data:
The classification is based on a weighted combination of all k nearest neighbours,
effectively smoothing out the impact of isolated noisy training data.
Disadvantage – All k-NN algorithms calculate the distance between instances based on all features → if there are many irrelevant features, instances that belong to the same class may still be distant from one another.
Remedy – weight each feature differently when calculating the distance between two instances.
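A hypothetical sketch of this remedy, with a hand-chosen weight vector standing in for whatever feature-weighting scheme is actually used (e.g. weights tuned by cross-validation):

```python
import numpy as np

def feature_weighted_distance(x_i, x_q, w):
    # w holds one non-negative weight per feature; irrelevant features
    # can be down-weighted or zeroed out entirely
    return np.sqrt(np.sum(w * (x_i - x_q) ** 2))

# Third feature treated as irrelevant (weight 0):
print(feature_weighted_distance(np.array([1.0, 2.0, 9.0]),
                                np.array([4.0, 6.0, 0.0]),
                                np.array([1.0, 1.0, 0.0])))  # 5.0
```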
Describe the Curse of Dimensionality.
In high-dimensional feature spaces, the nearest neighbours are usually not very near. As the dimensionality grows, the number of data points needed to find a correlation or to cover the same area/volume grows exponentially. For example, 100 points might suffice for a regression in R^2, but many more would be needed in R^10.
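A small illustrative experiment (not from the lecture) that shows the effect numerically, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.random((1000, d))                     # 1000 uniform points in [0,1]^d
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances from the first point
    # The nearest/farthest ratio approaches 1 as d grows, so the
    # "nearest" neighbours stop being meaningfully near
    print(d, round(dists.min() / dists.max(), 3))
```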
How would you use KNN for regression?
Instead of taking a majority vote over the neighbours' classes, we predict the mean (or distance-weighted mean) of their target values.
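A minimal sketch of the regression variant, using the plain (unweighted) mean:

```python
import numpy as np

def knn_regress(X_train, y_train, x_q, k=5):
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]
    # Predict the mean of the neighbours' target values
    return y_train[nearest].mean()
```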
Give the definition of Lazy Learning and Eager Learning, and give an example of an algorithm for each.
Lazy Learning:
Stores the training data; generalising beyond it is postponed until an explicit request (a query) is made. Example: KNN
Eager Learning:
Constructs a general, explicit description of the target function based on the provided training examples. Example: Neural Networks, Decision Trees
Give advantages and disadvantages of Lazy Learning and Eager Learning
Lazy Learning:
Advantages:
-Very suitable for complex and incomplete problem domains.
-Most useful for large datasets with few attributes.
Disadvantages:
- Large space requirement to store the entire training dataset
- Usually long query time.
Eager Learning:
Advantages:
-Better memory efficiency.
-Usually low query time.
-Usually deals better with noisy training datasets.
Disadvantages:
-Generally unable to provide good local approximations to the target function.