Your Deep Learning Journey Flashcards
Label
The data that we’re trying to predict, such as “dog” or “cat”
Architecture
The template of the model that we’re trying to fit; i.e., the actual mathematical function that we’re passing the input data and parameters to
Model
The combination of the architecture with a particular set of parameters
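The difference between an architecture and a model can be sketched in a few lines. This is only an illustration: the one-variable linear function and the names `linear_architecture` and `make_model` are assumptions chosen for brevity, not anything defined by these cards.

```python
def linear_architecture(x, params):
    """Architecture: a function template of the input and the parameters."""
    w, b = params
    return w * x + b


def make_model(params):
    """Model: the architecture bound to one particular set of parameters."""
    return lambda x: linear_architecture(x, params)


# Same architecture, two different parameter sets, two different models.
model_a = make_model((2.0, 1.0))   # computes 2x + 1
model_b = make_model((-1.0, 0.0))  # computes -x

print(model_a(3.0))  # 7.0
print(model_b(3.0))  # -3.0
```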
Parameters
The values in the model that change what task it can do and that are updated through model training
Fit / Train
Update the parameters of the model so that its predictions on the input data match the target labels
Pretrained model
A model that has already been trained, generally using a large dataset, and will be fine-tuned
Fine-tune
Update a pretrained model for a different task
Epoch
One complete pass through the input data; the model has seen every item in the training set.
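As a rough sketch of what one epoch means in code, the loop below walks a toy training set in mini-batches and records which items were visited. The helper `batches`, the batch size, and the ten-item dataset are hypothetical; a real loop would update the parameters where the comment indicates.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


training_set = list(range(10))  # stand-in for ten training items
n_epochs = 3

seen_per_epoch = []
for epoch in range(n_epochs):
    seen = []
    for batch in batches(training_set, batch_size=4):
        seen.extend(batch)  # a real loop would update the parameters here
    seen_per_epoch.append(seen)

# After each epoch, every item in the training set has been seen exactly once.
```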
Loss
A measure of how good the model is, chosen to drive training via SGD (Stochastic Gradient Descent)
Metric
A measurement of how good the model is using the validation set, chosen for human consumption
Validation set
A set of data held out from training, used only for measuring how good the model is
Training set
The data used for fitting the model; does not include any data from the validation set
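One simple way to keep the training and validation sets disjoint is a seeded random split. The sketch below assumes a 20% validation fraction and a hypothetical helper name, `train_valid_split`; a random split is not always appropriate (time series, for example, usually need a split by date).

```python
import random


def train_valid_split(items, valid_fraction=0.2, seed=42):
    """Shuffle a copy of `items` and hold out a fraction for validation."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


train_items, valid_items = train_valid_split(range(100))

# No item appears in both sets.
assert set(train_items).isdisjoint(valid_items)
```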
Overfitting
Training a model in such a way that it remembers specific features of the input data, rather than generalizing well to data not seen during training
CNN
Convolutional neural network; a type of neural network that works particularly well for computer vision tasks
Deep learning is…
…a specialty within machine learning that uses neural networks with multiple layers.
Machine learning is…
…a discipline in which we define a program not by writing it entirely ourselves, but by learning from data.
What is distinctive about Deep Learning architectures?
They are based on neural networks (e.g., CNNs, RNNs, Transformers).
What is segmentation?
At its core, segmentation is a pixelwise classification problem. We attempt to predict a label for every single pixel in the image. This provides a mask for which parts of the image correspond to the given label.
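To make "a label for every pixel" concrete, the sketch below takes made-up per-pixel class scores for a tiny 2x3 image and picks the highest-scoring class at each pixel. The scores, the two class names, and `predict_mask` are illustrative assumptions; a real segmentation model would produce the scores.

```python
CLASSES = ["background", "cat"]

# Hypothetical model output: scores[row][col][class_index]
scores = [
    [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]],
    [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]],
]


def predict_mask(scores):
    """Classify each pixel independently by taking the argmax over classes."""
    return [
        [max(range(len(pixel)), key=lambda c: pixel[c]) for pixel in row]
        for row in scores
    ]


mask = predict_mask(scores)
print(mask)  # [[0, 1, 1], [0, 0, 1]]
```

Each entry of `mask` is an index into `CLASSES`, so the positions holding `1` form the mask for "cat".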
Tabular Data
Data that is in the form of a table, such as from a spreadsheet, database, or a comma-separated values (CSV) file.
Categorical Value
A value that is one of a discrete set of choices, such as occupation
Continuous Value
A number that represents a quantity, such as age
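The two kinds of value are usually prepared differently before they reach a model. As a minimal sketch on invented rows, the categorical column is mapped to integer indices and the continuous column is standardized; both are common choices, not the only ones.

```python
rows = [
    {"occupation": "teacher", "age": 30},
    {"occupation": "engineer", "age": 40},
    {"occupation": "teacher", "age": 50},
]

# Categorical: map each distinct choice to an integer index.
categories = sorted({row["occupation"] for row in rows})
cat_to_index = {name: i for i, name in enumerate(categories)}

# Continuous: subtract the mean and divide by the standard deviation.
ages = [row["age"] for row in rows]
mean_age = sum(ages) / len(ages)
std_age = (sum((a - mean_age) ** 2 for a in ages) / len(ages)) ** 0.5

encoded = [
    (cat_to_index[row["occupation"]], (row["age"] - mean_age) / std_age)
    for row in rows
]

print(cat_to_index)  # {'engineer': 0, 'teacher': 1}
```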
What are “hyperparameters”?
Choices regarding network architecture, learning rates, data augmentation strategies, and other factors […] Training models requires various other parameters that define how the model is trained. For example, we need to define how long we train for, or what learning rate (how fast the model parameters are allowed to change) is used. These sorts of parameters are hyperparameters.
What was the name of the first device that was based on the principle of the artificial neuron?
Mark I perceptron built by Frank Rosenblatt
Why is it hard to understand why a deep learning model makes a particular prediction?
Deep neural networks can have hundreds or even thousands of layers, and the neurons in those layers interact with each other. This makes it hard to determine which factors matter most for the final output, and therefore very difficult to understand why a neural network makes a given prediction.
What is the name of the theorem that shows that a neural network can solve any mathematical problem to any level of accuracy?
The universal approximation theorem states that neural networks can theoretically approximate any mathematical function to any level of accuracy. In practice, however, the limits of available data and computer hardware make it impossible to train a model that does so. But we can get very close!
What do you need in order to train a model?
You will need (1) an architecture for the given problem. You will need (2) data to input to your model. For most use-cases of deep learning, you will need (3) labels for your data to compare your model predictions to. You will need (4) a loss function that will quantitatively measure the performance of your model. And you need (5) an optimizer: a way to update the parameters of the model in order to improve its performance.
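The five ingredients can be seen together in a very small training loop. This sketch fits a line with plain full-batch gradient descent rather than SGD, and everything in it (the toy data, the linear architecture, the learning rate of 0.05, the function names) is an assumption made for illustration.

```python
# (2) data and (3) labels: points on the line y = 3x + 2
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
labels = [3.0 * x + 2.0 for x in inputs]


# (1) architecture: a linear function of the input and the parameters
def architecture(x, params):
    w, b = params
    return w * x + b


# (4) loss: mean squared error between predictions and labels
def mse_loss(params):
    errors = [architecture(x, params) - y for x, y in zip(inputs, labels)]
    return sum(e ** 2 for e in errors) / len(errors)


# (5) optimizer: one gradient descent step using the analytic MSE gradient
def gradient_descent_step(params, lr):
    w, b = params
    n = len(inputs)
    errors = [architecture(x, params) - y for x, y in zip(inputs, labels)]
    grad_w = sum(2 * e * x for e, x in zip(errors, inputs)) / n
    grad_b = sum(2 * e for e in errors) / n
    return (w - lr * grad_w, b - lr * grad_b)


params = (0.0, 0.0)  # initial parameters
initial_loss = mse_loss(params)
for epoch in range(2000):
    params = gradient_descent_step(params, lr=0.05)
final_loss = mse_loss(params)

# params ends up close to (3.0, 2.0), the line that generated the labels.
```

The number of epochs and the learning rate here are hyperparameters: they are chosen by the modeller and are not updated by the optimizer.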
How could a feedback loop impact the rollout of a predictive policing model?
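If a model is retrained on data produced by the policing decisions that the previous model drove, its own predictions shape its next training set. Any bias in the original model is then reinforced and amplified with each retraining rather than corrected.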
Do we always have to use 224x224 pixel images with the cat recognition model?
No, we do not. 224x224 is commonly used for historical reasons. You can increase the size and get better performance, but at the price of speed and memory consumption.
What is a validation set? What is a test set? Why do we need them?
Validation Set: Used to ‘validate’ the model’s predictions on data that was not used for training. Measuring performance on it reveals overfitting to the training set.
Test Set: Tests the model with data that neither the model nor the modeller has seen before. Because the modeller adjusts hyperparameters based on validation results, the model can end up ‘fitted’ to the validation set as well. That’s why it’s important to have an unseen test set.
What is a metric? How does it differ from “loss”?
A metric is a function that measures the quality of the model’s predictions using the validation set. This is similar to the loss, which is also a measure of performance of the model. However, loss is meant for the optimization algorithm (like SGD) to efficiently update the model parameters, while metrics are human-interpretable measures of performance. Sometimes, a metric may also be a good choice for the loss.
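A small example may help show why the two are kept separate. Below, two invented sets of predicted probabilities get the same accuracy (a typical metric) but different cross-entropy (a typical loss), so only the loss registers the improvement. The function names and numbers are assumptions for illustration.

```python
import math


def cross_entropy(probs, targets):
    """Mean binary cross-entropy; `probs` are predicted P(class 1)."""
    total = sum(
        math.log(p if t == 1 else 1.0 - p) for p, t in zip(probs, targets)
    )
    return -total / len(targets)


def accuracy(probs, targets):
    """Fraction of items where thresholding at 0.5 gives the right class."""
    correct = sum((p > 0.5) == (t == 1) for p, t in zip(probs, targets))
    return correct / len(targets)


targets = [1, 0, 1, 1]
hesitant = [0.6, 0.4, 0.6, 0.4]   # right on three items, but barely
confident = [0.9, 0.1, 0.9, 0.4]  # right on the same three, more confidently

print(accuracy(hesitant, targets), accuracy(confident, targets))  # 0.75 0.75
print(cross_entropy(confident, targets) < cross_entropy(hesitant, targets))  # True
```

Accuracy changes only when a prediction crosses the threshold, which gives an optimizer nothing to follow between crossings; the loss changes smoothly with every small shift in the parameters.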