Fast AI Flashcards

1
Q

What did Frank Rosenblatt build?

A

The perceptron

2
Q

What was Marvin Minsky’s innovation?

A

Showing that using multiple layers could solve the limitations of a single-layer perceptron (e.g. its inability to learn XOR)

3
Q

What is the general method for updating weights in a neural net?

A

Stochastic gradient descent

4
Q

What is ImageNet?

A

The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured, and in at least one million of the images bounding boxes are also provided.

5
Q

What’s a tensor?

A

A collection of numbers arranged in any number of dimensions: a vector (one dimension), a matrix (two dimensions), or higher-dimensional arrays
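
A minimal sketch (assuming PyTorch, which the course uses) of tensors of increasing rank:

```python
import torch

scalar = torch.tensor(3.14)                      # rank 0: a single number
vector = torch.tensor([1.0, 2.0, 3.0])           # rank 1: a list of numbers
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # rank 2: rows and columns
batch  = torch.zeros(32, 28, 28)                 # rank 3: e.g. 32 greyscale 28x28 images

print(scalar.ndim, vector.ndim, matrix.ndim, batch.ndim)  # 0 1 2 3
```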

6
Q

Explain gradient descent

A

At each step we work out the gradient of the loss with respect to the parameters, then update the parameters in the direction that reduces (minimises) the loss
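
A minimal sketch of one way to do this with PyTorch autograd; the toy quadratic loss and the learning rate of 0.1 are illustrative assumptions:

```python
import torch

w = torch.tensor(5.0, requires_grad=True)  # a single parameter to optimise
lr = 0.1                                   # step size (the learning rate)

for step in range(50):
    loss = (w - 3.0) ** 2                  # loss is smallest at w = 3
    loss.backward()                        # work out the gradient of the loss w.r.t. w
    with torch.no_grad():
        w -= lr * w.grad                   # step the parameter against the gradient
        w.grad.zero_()

print(w.item())  # ~3.0
```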

7
Q

In neural nets, what kind of function can be used to describe any input-output relationship?

A

Combinations of rectified linear functions (ReLUs)

8
Q

Describe deep learning mathematically…

A

Combinations of ReLUs
Gradient descent
Inputs and outputs
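
A minimal sketch putting the three pieces together: inputs and outputs, a combination of ReLUs, and gradient descent. The target function (a sine curve), the 20 hidden units, and the learning rate are all illustrative assumptions:

```python
import torch

x = torch.linspace(-3, 3, 200).unsqueeze(1)   # inputs
y = torch.sin(x)                              # target outputs to approximate

w1 = torch.randn(1, 20, requires_grad=True)
b1 = torch.zeros(20, requires_grad=True)
w2 = torch.randn(20, 1, requires_grad=True)
b2 = torch.zeros(1, requires_grad=True)

for step in range(2000):
    pred = torch.relu(x @ w1 + b1) @ w2 + b2  # a combination of 20 ReLUs
    loss = ((pred - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        for p in (w1, b1, w2, b2):
            p -= 0.05 * p.grad                # gradient descent on the parameters
            p.grad.zero_()
```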

9
Q

When I adjust parameters (in line with gradient descent), the amount I adjust by is called what?

A

The learning rate

10
Q

The foundational mathematical operation in deep learning is …

A

Matrix multiplication
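
A minimal sketch of why: a layer's activations are just the inputs matrix-multiplied by the weights (plus a bias). The sizes below are illustrative assumptions:

```python
import torch

batch   = torch.randn(64, 784)   # 64 flattened 28x28 images
weights = torch.randn(784, 10)   # one column of weights per output class
bias    = torch.zeros(10)

activations = batch @ weights + bias  # matrix multiply: (64, 784) @ (784, 10) -> (64, 10)
print(activations.shape)              # torch.Size([64, 10])
```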

11
Q

Describe solving the Titanic (kaggle) problem using regression in Excel

A
  • Normalize all the variables
  • Start with random coefficients
  • Compute a loss for each row, then take the mean
  • Use Excel’s Solver function to minimise the total loss by changing the coefficients (see the sketch below)
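
A minimal sketch of the same recipe in Python; the synthetic data and the use of scipy's minimize in place of Excel's Solver are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                   # e.g. three numeric predictor columns
y = rng.integers(0, 2, size=100).astype(float)  # survived (0/1)

X = (X - X.mean(axis=0)) / X.std(axis=0)        # 1. normalize all the variables
coeffs0 = rng.normal(size=3)                    # 2. random starting coefficients

def total_loss(coeffs):
    preds = X @ coeffs                          # 3. a prediction and loss for each row...
    return ((preds - y) ** 2).mean()            #    ...then take the mean

result = minimize(total_loss, coeffs0)          # 4. "Solver": change coefficients to minimise loss
print(result.x, total_loss(result.x))
```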
12
Q

What is the name of the method where you split the data by a single variable to determine the outcome?

A

1R

13
Q

How is the 1R method made more sophisticated?

A

By turning it into a decision tree, i.e. splitting again on further variables within each split
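
A minimal sketch (scikit-learn and the synthetic data are assumptions): a depth-1 tree is essentially 1R, a single split on one variable, and allowing more depth turns it into a decision tree:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)                          # outcome driven mostly by one column

one_r = DecisionTreeClassifier(max_depth=1).fit(X, y)  # one split on one variable: the 1R baseline
tree  = DecisionTreeClassifier(max_depth=4).fit(X, y)  # further splits: a decision tree
print(one_r.score(X, y), tree.score(X, y))
```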

14
Q

What is the best package to use in Python for non-deep (aka classical) machine learning?

A

scikit-learn

15
Q

What’s the name of the measure for how good a split is?

A

Gini
(how likely it is that, if you pick one item from a sample and then pick another, you’ll get the same thing)
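
A minimal sketch of the measure as described here, i.e. the probability that two picks from a group share the same outcome (this is 1 minus the textbook Gini impurity); the toy groups are illustrative assumptions:

```python
import numpy as np

def prob_same_outcome(labels):
    labels = np.asarray(labels)
    props = np.bincount(labels) / len(labels)  # proportion of each class in the group
    return (props ** 2).sum()                  # chance two independent picks match

print(prob_same_outcome([1, 1, 1, 0]))  # 0.625 - a fairly pure group
print(prob_same_outcome([1, 0, 1, 0]))  # 0.5   - as impure as it gets for two classes
```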

16
Q

What’s the main way of improving decision trees?

A

Random Forest

17
Q

What’s the idea behind “bagging” (used in Random Forest)?

A

Take the average prediction of many uncorrelated models; their errors should average out towards zero
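
A minimal sketch of bagging; the scikit-learn trees and the synthetic data are illustrative assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] * 2 + rng.normal(scale=0.5, size=500)

trees = []
for _ in range(100):
    idx = rng.integers(0, len(X), len(X))        # a bootstrap sample (rows drawn with replacement)
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

preds = np.mean([t.predict(X) for t in trees], axis=0)  # average prediction of the uncorrelated models
```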

18
Q

What’s a good first model to use when given a data set?

A

Random Forest - you get to see which columns are the most important

19
Q

What is an Out of Bag Error?

A

The error measured on predictions for the rows that were not used to train a particular decision tree (i.e. left out of its bootstrap subset)
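
A minimal sketch (scikit-learn and the synthetic data are assumptions) showing a random forest's out-of-bag score, computed on the rows each tree did not see, along with the column importances mentioned on the previous card:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] * 2 + rng.normal(scale=0.5, size=500)

rf = RandomForestRegressor(n_estimators=100, oob_score=True).fit(X, y)
print(rf.oob_score_)            # R^2 measured on the out-of-bag rows
print(rf.feature_importances_)  # which columns matter most
```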

20
Q

What is gradient boosting?

A

Boosting is a kind of ensemble learning method that trains models sequentially, with each new model trying to correct the errors of the previous ones; in gradient boosting, each new model is fit to the residuals of the ensemble so far
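
A minimal sketch of gradient boosting with squared-error loss; the scikit-learn stumps, the learning rate, and the synthetic data are illustrative assumptions. Each new small tree is fit to the residuals, i.e. to what the ensemble so far still gets wrong:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

pred, lr, trees = np.zeros(500), 0.1, []
for _ in range(100):
    residual = y - pred                                  # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    trees.append(tree)
    pred += lr * tree.predict(X)                         # the new model corrects the previous ones

print(np.mean((y - pred) ** 2))  # training error shrinks as trees are added
```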

21
Q

What is TTA (in machine learning)?

A

Test Time Augmentation
(passing multiple augmented versions of each image to the model and averaging the resulting predictions)
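
A minimal sketch in plain PyTorch; the `model` argument and the single horizontal-flip augmentation are illustrative assumptions:

```python
import torch

def tta_predict(model, image_batch):
    augmented = [
        image_batch,                         # the original images
        torch.flip(image_batch, dims=[-1]),  # horizontally flipped copies
    ]
    with torch.no_grad():
        preds = [model(x) for x in augmented]  # one prediction per augmented version
    return torch.stack(preds).mean(dim=0)      # average of the results
```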

22
Q

What does Jeremy Howard recommend for GPU?

A

Nvidia.
RTX cards are for consumers (NOT for data centers); e.g. a 3080 is about $1,000

23
Q

Recommendation systems are known as

A

Collaborative Filtering
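
A minimal sketch of the core idea: a predicted rating is the dot product of a user embedding and an item embedding. The sizes and the random (untrained) latent factors are illustrative assumptions:

```python
import torch

n_users, n_items, n_factors = 100, 50, 5
user_factors = torch.randn(n_users, n_factors)  # one learned vector per user
item_factors = torch.randn(n_items, n_factors)  # one learned vector per item

def predicted_rating(user, item):
    return (user_factors[user] * item_factors[item]).sum()  # dot product

print(predicted_rating(3, 7))
```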

24
Q

What’s a convolution?

A

A sliding window of values from the original image is dot-producted with a small matrix (a kernel/filter) to pick out features such as vertical edges, horizontal edges, etc.
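
A minimal sketch of the mechanics; the vertical-edge kernel and the random image are illustrative assumptions:

```python
import torch

image  = torch.rand(28, 28)             # a single-channel image
kernel = torch.tensor([[-1., 0., 1.],
                       [-1., 0., 1.],
                       [-1., 0., 1.]])  # picks out vertical edges

out = torch.zeros(26, 26)
for i in range(26):
    for j in range(26):
        patch = image[i:i+3, j:j+3]      # the window of image values under the kernel
        out[i, j] = (patch * kernel).sum()  # dot product with the filter
```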

25
Q

How is overfitting avoided in CNNs?

A

By introducing a dropout layer - some of the activations are zeroed at random during training
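
A minimal sketch of what a dropout layer does during training; the 0.5 rate and the toy activations are illustrative assumptions:

```python
import torch

activations = torch.rand(4, 8)          # activations from some layer
p = 0.5                                 # probability of dropping each one

mask = (torch.rand_like(activations) > p).float()
dropped = activations * mask / (1 - p)  # zero some activations, rescale the survivors

# In practice this is torch.nn.Dropout(p), which is active in training and disabled at inference.
```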