ML-01 - Introduction and linear regression Flashcards

1
Q

ML-01 - Introduction and linear regression

When did Arthur Samuel come up with his definition of machine learning?

A

The 1950s.

2
Q

ML-01 - Introduction and linear regression

What was Arthur Samuel’s definition of machine learning?

A

“[…] the field of study that gives computers the ability to learn without being explicitly programmed.”

3
Q

ML-01 - Introduction and linear regression

How did Tom Mitchell define machine learning?

A

A computer program is said to learn from experience 𝑬 with respect to some task 𝑻 (in a well-posed problem) and performance measure 𝑷 if its performance on 𝑻, as measured by 𝑷, improves with experience 𝑬.

4
Q

ML-01 - Introduction and linear regression

What are the 3 broad types of machine learning?

A
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
5
Q

ML-01 - Introduction and linear regression

What are the two big types of supervised learning?

A
  • Classification
  • Regression
6
Q

ML-01 - Introduction and linear regression

What are the two big types of unsupervised learning?

A
  • Clustering
  • Dimensionality reduction
7
Q

ML-01 - Introduction and linear regression

Describe the difference between regression and classification.

A

Regression predicts continuous values, while classification predicts discrete categories.

8
Q

ML-01 - Introduction and linear regression

What is Semi-supervised learning?

A

A type of ML approach where you have some labeled data, but lots of unlabeled data.

9
Q

ML-01 - Introduction and linear regression

What is reinforcement learning?

A

Learning by interacting with an environment, using rewards (and penalties) as the feedback signal.

10
Q

ML-01 - Introduction and linear regression

What are the 5 steps for a supervised learning workflow?

A

1) Get data
2) Clean, prepare, and manipulate the data
3) Train the model
4) Test the model
5) Improve

11
Q

ML-01 - Introduction and linear regression

What are the two most common optimization methods?

A
  • Iterative methods, like gradient descent.
  • Non-iterative methods, like the least squares method.
12
Q

ML-01 - Introduction and linear regression

Describe gradient descent.

A

Gradient descent works by iteratively taking steps in the direction of the negative gradient of the loss function until it reaches a (local) minimum.

13
Q

ML-01 - Introduction and linear regression

What is the formula for gradient descent?

A

(See image)
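The image is not reproduced here; assuming it shows the standard update rule with learning rate α and cost function J, it is usually written (in LaTeX notation) as

\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)

with all parameters θ_j updated simultaneously in each step.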

14
Q

ML-01 - Introduction and linear regression

What are the 3 typical variants of gradient descent?

A
  • (Batch) gradient descent
  • Mini-batch gradient descent
  • Stochastic gradient descent
15
Q

ML-01 - Introduction and linear regression

Describe (batch) gradient descent.

A

Use all of the training samples in each iteration (called an epoch) of gradient descent.

16
Q

ML-01 - Introduction and linear regression

Describe mini-batch gradient descent.

A

Instead of learning from all the data at once per epoch, learning happens on subsets of the data: during each epoch, the N training samples are split into N / batch_size batches, drawn without replacement, and the model is updated after each batch.

17
Q

ML-01 - Introduction and linear regression

Describe stochastic gradient descent.

A

During each epoch, the batch size is set to 1 and the model is updated on each individual training example.
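A minimal Python sketch (not from the slides) of gradient descent for linear regression, where the batch_size argument selects between the three variants; the function name, parameters, and defaults are illustrative assumptions:

import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=100, batch_size=None):
    # Linear regression trained by minimizing the mean squared error.
    # batch_size=None    -> (batch) gradient descent: all N samples per update
    # 1 < batch_size < N -> mini-batch gradient descent
    # batch_size=1       -> stochastic gradient descent
    N, d = X.shape
    theta = np.zeros(d)
    batch_size = N if batch_size is None else batch_size
    for _ in range(epochs):
        idx = np.random.permutation(N)  # draw batches without replacement
        for start in range(0, N, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = 2.0 / len(batch) * Xb.T @ (Xb @ theta - yb)  # gradient of MSE
            theta -= lr * grad  # update step with learning rate lr
    return theta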

18
Q

ML-01 - Introduction and linear regression

How do you make sure you set the learning rate correctly?

A

Plot loss vs. the number of epochs and make sure the loss converges after some number of iterations.
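A small matplotlib sketch of such a diagnostic plot; the losses argument (one loss value recorded per epoch) is a hypothetical input, not something defined in the slides:

import matplotlib.pyplot as plt

def plot_loss_curve(losses):
    # losses: one loss value recorded per epoch during training
    plt.plot(range(1, len(losses) + 1), losses)
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.title("Loss vs. epochs")
    plt.show()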

19
Q

ML-01 - Introduction and linear regression

What happens if the learning rate is too high?

A

The loss might not decrease on every iteration, and training may fail to converge (it can even diverge).

20
Q

ML-01 - Introduction and linear regression

What happens if the learning rate is too low?

A

The learning takes a long time to converge.

21
Q

ML-01 - Introduction and linear regression

Describe the training workflow steps.

A

(See image)

22
Q

ML-01 - Introduction and linear regression

What is feature scaling?

A

A transformation of the data that removes the effect of features being on very different scales.

Gradient descent is sensitive to unnormalized data: a single learning rate works poorly when features differ greatly in magnitude.

E.g. house prices are numerically much larger than the number of square meters, so the corresponding coefficients (and gradients) can end up on very different scales.

23
Q

ML-01 - Introduction and linear regression

Describe visually what happens in feature scaling.

A

(See image)

The unnormalized data makes the learning curve jump around and it doesn’t converge nicely.

24
Q

ML-01 - Introduction and linear regression

What are the two most commonly used normalization methods?

A
  • Min-max normalization
  • Standardization (z-score)
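Both can be computed with a few lines of numpy; a minimal sketch with made-up feature values (e.g. square meters and number of rooms):

import numpy as np

# Hypothetical feature matrix: each column is one feature.
X = np.array([[120.0, 3.0],
              [250.0, 4.0],
              [90.0, 2.0]])

# Min-max normalization: rescale each feature (column) to [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization (z-score): zero mean and unit standard deviation per feature.
X_standardized = (X - X.mean(axis=0)) / X.std(axis=0)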
25
Q

ML-01 - Introduction and linear regression

What’s the formula for min-max normalization?

A

(See image)
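The image is not included here; the min-max formula is standard and usually written (in LaTeX notation) as

x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}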

26
Q

ML-01 - Introduction and linear regression

What’s the range of data after applying min-max normalization?

A

Between 0 and 1.

27
Q

ML-01 - Introduction and linear regression

What’s the formula for standardization normalization?

A

z = (x − μ) / σ, i.e. subtract the mean and divide by the standard deviation. The result is not bounded and may still contain outliers.

28
Q

ML-01 - Introduction and linear regression

What are some reasons to apply min-max normalization rather than standardization?

A
  • Data doesn’t follow a normal distribution
  • Data must be in a specific range
  • Need to keep the original shape of the data, just scaled down
  • Outliers are not a significant concern
29
Q

ML-01 - Introduction and linear regression

What are some reasons to apply standardization rather than min-max normalization? (4)

A
  • When the data follows a normal distribution
  • When the data has (extreme) outliers
  • When your algorithm assumes standardized data
  • When the range of your data is unknown or changing over time
30
Q

ML-01 - Introduction and linear regression

What is polynomial regression?

A

Linear regression (i.e. a model that is still linear in its weights) applied to features that have been engineered through polynomial transformations of the inputs (e.g. x, x², x³).

(See image)
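A small numpy sketch with made-up data, showing that polynomial regression is just linear regression on polynomial features:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=100)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.3, size=100)  # quadratic + noise

# Engineer polynomial features: the model stays linear in its weights.
X = np.column_stack([np.ones_like(x), x, x**2])  # columns [1, x, x^2]

# Fit with ordinary least squares (equivalent to solving the normal equation).
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w)  # approximately [1.0, 2.0, -0.5]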

31
Q

ML-01 - Introduction and linear regression

What’s the image an example of? (See image)

A

Polynomial regression.

32
Q

ML-01 - Introduction and linear regression

What is the “normal equation”?

A

A non-iterative optimization method that solves for the linear regression parameters directly, in closed form.

33
Q

ML-01 - Introduction and linear regression

What is the formula for the “normal equation”?

A

(See image)

34
Q

ML-01 - Introduction and linear regression

What’s the solution to the “normal equation”?

A

(See image)
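Neither image is reproduced here; for linear regression with design matrix X and target vector y, the normal equation and its solution are standardly written (in LaTeX notation) as

X^\top X \, \theta = X^\top y \quad\Rightarrow\quad \theta = (X^\top X)^{-1} X^\top y

assuming X^T X is invertible.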

35
Q

ML-01 - Introduction and linear regression

What are the advantages of gradient descent over the normal equation?

A
  • It works well when the number of features is large.
36
Q

ML-01 - Introduction and linear regression

What are the advantages of the normal equation over gradient descent?

A
  • No need to choose a learning rate.
  • Doesn’t iterate; it solves for the parameters directly via the inverse (X^T X)^-1.
37
Q

ML-01 - Introduction and linear regression

What are the disadvantages of gradient descent over the normal equation?

A
  • You need to choose the learning rate.
  • GD might run for many iterations.
38
Q

ML-01 - Introduction and linear regression

What are the disadvantages of the normal equation over gradient descent?

A
  • Can be slow when the number of features n is large; computing the matrix inverse has time complexity O(n^3).