Basic concepts in Machine Learning Flashcards

1
Q

Explain the three types of machine learning

A

Supervised: use labelled data to predict outputs from inputs.
Unsupervised: learn structure from unlabelled data.
Reinforcement: an agent takes actions in an environment to maximise cumulative reward.

2
Q

Explain what regression and classification are and give examples of each

A

Regression: learning a function mapping inputs to ℝ (a continuous output), e.g. predicting heights or house prices.
Classification: learning a function mapping inputs to discrete outputs (membership of a class), e.g. dog vs. cat, digit recognition.

3
Q

What are the main challenges facing machine learning?

A

Insufficient quantity or quality of data
Non-stationary data (the underlying distribution changes over time)
Overfitting/underfitting

4
Q

Define overfitting and underfitting

A

Overfitting is learning the training dataset too well, so that the model fails to generalise to new data.
Underfitting is using too simple a model, one which fails to capture the dependencies in the data.
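
A minimal NumPy sketch of both failure modes (the polynomial-degree setup is an illustration, not from the cards): fit polynomials of increasing degree to noisy samples of a sine curve and compare training and test errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# 15 noisy training samples from a smooth target function
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, x_train.size)

# A dense, noise-free test grid to measure generalisation
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Degree 1 typically underfits (high error on both sets), while degree 12 typically overfits (training error near zero, test error much larger).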

5
Q

What are the input space, outcome space and action space in statistical learning?

A

Input space: the set of possible inputs; its dimensionality is the number of features.
Outcome space: the set the outcome labels come from, e.g. ℝ or {0, 1}.
Action space: the space of predictions. Not always the outcome space, e.g. predicting a probability of membership of a class.
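
For example (an illustration, not from the cards): in probabilistic binary classification with d features, the input space is ℝ^d, the outcome space is {0, 1}, and the action space is [0, 1].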

6
Q

What is a loss function?

A

L : Y × A → ℝ is a function used to penalise poor predictions; it should be stationary, ideally at a minimum, when the prediction equals the intended outcome.

7
Q

Give two examples of loss functions for regression and one for classification

A

SE loss: L(y, ŷ) = (y − ŷ)²
AE loss: L(y, ŷ) = |y − ŷ|
Log loss (classification): L(y, ŷ) = −(y log ŷ + (1 − y) log(1 − ŷ))
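
A sketch of these three losses in NumPy (the function names are my own):

```python
import numpy as np

def se_loss(y, y_hat):
    """Squared-error loss: penalises errors quadratically."""
    return (y - y_hat) ** 2

def ae_loss(y, y_hat):
    """Absolute-error loss: penalises errors linearly."""
    return np.abs(y - y_hat)

def log_loss(y, y_hat, eps=1e-12):
    """Log loss for binary classification: y in {0, 1}, y_hat a probability."""
    y_hat = np.clip(y_hat, eps, 1 - eps)  # guard against log(0)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
```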

8
Q

How do SEL and AEL hold up when it comes to outliers in the data?

A

AE is less sensitive to outliers: it penalises large errors only linearly, whereas SE penalises them quadratically, so a single outlier can dominate the average SE.
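
A quick numerical illustration (the residuals are made up for the example):

```python
import numpy as np

# Three small residuals and one outlier
residuals = np.array([0.1, -0.2, 0.1, 5.0])

print(np.mean(residuals ** 2))     # mean SE ~ 6.27: dominated by the outlier
print(np.mean(np.abs(residuals)))  # mean AE ~ 1.35: grows only linearly
```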

9
Q

Define the risk functional for a given loss function

A

The expected loss when using f as a prediction function.

R(f) = E[L(Y, f(X))]

10
Q

Define the Bayes’ prediction function

A

The Bayes prediction function f* is the function that minimises the risk functional over all possible prediction functions: f* = argmin_f R(f).

11
Q

Can we usually find this?

A

No. Finding it requires the true joint distribution of (X, Y), which we do not have in practice; ML models instead minimise an empirical estimate of the risk.

12
Q

Show that the Bayes’ prediction function for SEL is the mean.

A

See notes!
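
The notes are not reproduced here; a standard sketch of the argument minimises the conditional risk pointwise over the action a:

```latex
r(a) = \mathbb{E}\left[(Y - a)^2 \mid X = x\right]
     = \mathbb{E}[Y^2 \mid X = x] - 2a\,\mathbb{E}[Y \mid X = x] + a^2,
\qquad
r'(a) = 2a - 2\,\mathbb{E}[Y \mid X = x] = 0
\;\Rightarrow\; a^* = \mathbb{E}[Y \mid X = x].
```

Since r''(a) = 2 > 0 this stationary point is a minimum, so f*(x) = E[Y | X = x], the conditional mean.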

13
Q

What is the Bayes’ prediction function for AEL?

A

The median (See notes!)
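
A sketch of the usual argument, assuming a continuous conditional distribution:

```latex
r(a) = \mathbb{E}\left[\,|Y - a|\;\middle|\;X = x\right],
\qquad
r'(a) = P(Y < a \mid X = x) - P(Y > a \mid X = x) = 0
\;\Rightarrow\; P(Y \le a^* \mid X = x) = \tfrac{1}{2},
```

i.e. a* is the conditional median.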

14
Q

Define the empirical risk functional and the empirical risk minimiser

A

The empirical risk functional is the average loss over the training data. The minimiser is the function that minimises this functional.
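
In symbols, for training data (x_1, y_1), …, (x_n, y_n):

```latex
\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr),
\qquad
\hat{f}_n = \operatorname*{arg\,min}_{f} \hat{R}_n(f).
```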

15
Q

What can we do to avoid overfitting?

A

We can constrain our hypothesis space, i.e. minimise the empirical risk subject to the prediction function lying in some constrained function space.

16
Q

How can you ensure the empirical risk converges to the true risk as the number of data points grows?

A

Evaluate the empirical risk on an independent test set; the empirical risk on the training data is biased downwards for the learned function.
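
For a fixed prediction function f and i.i.d. test points drawn independently of the training data, the law of large numbers gives

```latex
\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)
\;\xrightarrow{\text{a.s.}}\;
\mathbb{E}\bigl[L(Y, f(X))\bigr] = R(f)
\quad \text{as } n \to \infty.
```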

17
Q

Define the constrained risk minimiser

A

The function within the constrained hypothesis space which minimises the risk functional.

18
Q

Define the constrained empirical risk minimiser

A

The function within the constrained hypothesis space which minimises the empirical risk functional.

19
Q

Define the excess risk. How does this decompose?

A

ER = R[f̂] − R[f_Bayes], the amount by which the risk of our prediction function f̂ exceeds the Bayes risk.

It decomposes into an estimation error plus an approximation error.
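
Writing f̂ for our prediction function (e.g. the constrained empirical risk minimiser), f_H for the constrained risk minimiser and f* for the Bayes prediction function:

```latex
\underbrace{R(\hat{f}) - R(f^*)}_{\text{excess risk}}
=
\underbrace{R(\hat{f}) - R(f_{\mathcal{H}})}_{\text{estimation error}}
+
\underbrace{R(f_{\mathcal{H}}) - R(f^*)}_{\text{approximation error}}.
```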

20
Q

Explain the trade-off that occurs between these components as we increase the “size” of our hypothesis space.

A

Increasing the “size” of H decreases the approximation error, since the constrained risk minimiser gets closer to the Bayes prediction function, but increases the estimation error, since with a fixed amount of data it becomes harder to identify the best function within the larger H.

21
Q

For what loss function are bias and variance defined?

A

SE loss

22
Q

What do the bias and variance each show?

A

Bias measures the average difference, over random training sets, between the learned prediction function and the Bayes prediction function.
Variance measures the sensitivity of the learned prediction function to the particular training set drawn.
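
For SE loss, at a fixed point x and over random training sets D, the standard decomposition (with f* the Bayes prediction function) is

```latex
\mathbb{E}_{\mathcal{D}}\left[\bigl(\hat{f}_{\mathcal{D}}(x) - f^*(x)\bigr)^2\right]
=
\underbrace{\bigl(\mathbb{E}_{\mathcal{D}}[\hat{f}_{\mathcal{D}}(x)] - f^*(x)\bigr)^2}_{\text{bias}^2}
+
\underbrace{\mathbb{E}_{\mathcal{D}}\left[\bigl(\hat{f}_{\mathcal{D}}(x) - \mathbb{E}_{\mathcal{D}}[\hat{f}_{\mathcal{D}}(x)]\bigr)^2\right]}_{\text{variance}}.
```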