Basics Flashcards

1
Q

What is the difference between deep learning and traditional CV approaches?

A

Deep learning uses an end-to-end approach: the neural network learns features by itself from the training data to improve the model, while traditional computer vision relies on handcrafted features

2
Q

What is the difference between supervised and unsupervised learning?

A

Supervised learning uses labeled data to train the model, while unsupervised learning uses unlabeled data to find patterns and structures within the data

3
Q

Difference between classification and regression

A

Classification is used to predict a class or a label for the input data. Regression is used to predict a numeric value for the input data

4
Q

Explain Underfitting and Overfitting. Name the causes.

A

Underfitting is when the model is not complex enough to capture the underlying distribution of the data. It generalizes poorly because it does not fit the distribution well, i.e. it predicts badly even on the training data. Causes: the model is not complex enough, or it needs more training.

Overfitting is when the model is too complex and fits the training data too closely, making it predict poorly on unseen data. Causes: an overly complex model or missing regularization.

5
Q

Why are non-linear activation functions used in NN?

A

Non-linear activation functions are used in neural networks to introduce non-linearity, enabling the model to learn complex and non-linear relationships in the data

6
Q

2 advantages of the ReLU activation function over Sigmoid and Tanh

A

1 - ReLU does not saturate for positive inputs, so it avoids the vanishing-gradient problem.
2 - It is computationally cheap, as it involves simpler operations (a max) than sigmoid and tanh.

7
Q

Write the formulas for precision, recall and F1-score, and name one task for which those metrics can be used

A

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)

F1-score = 2 * (Precision * Recall) / (Precision + Recall)

These metrics can be used, for example, in image classification.
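A minimal pure-Python sketch of these formulas (the function name `prf1` and the toy labels are invented for illustration):

```python
# Precision, recall and F1 computed from true/predicted labels (toy sketch).
def prf1(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 2 TP, 1 FP, 1 FN -> precision = recall = f1 = 2/3
p, r, f = prf1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```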

8
Q

Update formula for stochastic gradient descent

A

θ ← θ − η · ∇_θ L(θ; x^(i), y^(i))

where η is the learning rate and (x^(i), y^(i)) is the randomly sampled training example.
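A toy sketch of this update for a 1-D linear model with squared-error loss (the model, data and learning rate are invented for illustration):

```python
# One SGD step for a 1-D linear model y ≈ w*x with loss L(w; x, y) = (w*x - y)^2,
# whose gradient is dL/dw = 2*(w*x - y)*x.
def sgd_step(w, x, y, lr):
    grad = 2 * (w * x - y) * x
    return w - lr * grad  # θ ← θ − η·∇L, one sample at a time

w = 0.0
for x, y in [(1.0, 2.0), (2.0, 4.0)] * 50:  # samples from the line y = 2x
    w = sgd_step(w, x, y, lr=0.05)
# w converges towards 2.0
```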

9
Q

Formula for Leaky ReLU. What value of the parameter alpha is typical? Is alpha = 1 reasonable?

A

f(x) = max(α·x, x), with α = 0.01 typically.
α = 1 is not reasonable: f would become the identity function and lose its non-linearity.
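A one-line sketch of this function in Python (the function name is made up for illustration):

```python
def leaky_relu(x, alpha=0.01):
    # f(x) = max(alpha*x, x): for alpha < 1 this is x when x >= 0
    # and alpha*x when x < 0; alpha = 1 degenerates to the identity.
    return max(alpha * x, x)
```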

10
Q

Which loss function for classification is commonly used together with Softmax normalization? Name it, give the formula and define the variables

A

Negative Log-Likelihood (cross-entropy loss).

L = −(1/N) · Σ_{i=1}^{N} log p_{i, y_i}

N: number of samples in the dataset.
y_i: ground-truth class label for sample i.
p_{i, y_i}: predicted (softmax) probability of the true class for sample i.
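A pure-Python sketch of softmax followed by the negative log-likelihood (function names are invented for illustration):

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def nll(logits_batch, labels):
    # L = -(1/N) * sum_i log p_{i, y_i}
    total = 0.0
    for logits, y in zip(logits_batch, labels):
        p = softmax(logits)
        total += -math.log(p[y])
    return total / len(labels)
```

With uniform logits over 3 classes the loss is log 3; a confidently correct prediction drives it towards 0.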

11
Q

What is the triplet loss? Write its formula.

A

The goal is to minimize the distance between the anchor image and the positive match, while maximizing the distance between the anchor and the negative match:

L(a, p, n) = max(d(a, p) − d(a, n) + margin, 0)

where d is a distance function (e.g. Euclidean) on the embeddings of the anchor a, the positive p and the negative n.
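A minimal sketch of the triplet loss with Euclidean distance on embedding vectors (function names and the margin value are illustrative):

```python
import math

def l2(u, v):
    # Euclidean distance between two embedding vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    # max(d(a, p) - d(a, n) + margin, 0): zero once the negative is
    # farther from the anchor than the positive by at least the margin.
    return max(l2(anchor, positive) - l2(anchor, negative) + margin, 0.0)
```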

12
Q

What is data augmentation and give 2 examples

A

Data augmentation is a technique that applies transformations to the existing images in a dataset to generate new training data.

1 - Horizontal flip (reflection)
2 - Random crops of the image
3 - Scaling
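Two of these augmentations sketched in pure Python, treating an image as a list of pixel rows (function names are made up for illustration; real pipelines would use an image library):

```python
import random

def horizontal_flip(img):
    # Reverse each row of pixels -> mirror image
    return [row[::-1] for row in img]

def random_crop(img, size, rng=random):
    # Cut out a random size x size window from the image
    h, w = len(img), len(img[0])
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    return [row[left:left + size] for row in img[top:top + size]]
```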

13
Q

Key idea behind data augmentation?

A

Increase the dataset size by applying random transformations to existing data

14
Q

Name a reason why it can make sense to apply image augmentations/transformations at test time

A

Applying augmentations at test time (test-time augmentation) and aggregating the predictions over the augmented copies improves robustness and generalization to the diverse variations found in real-world data

15
Q

What is dropout and what is it used for? Which hyperparameter does a dropout layer have?

A

Dropout is a regularization technique that randomly deactivates neurons during training to prevent overfitting in neural networks.

The hyperparameter of a dropout layer is the dropout rate, representing the probability of setting a neuron’s output to zero during training.
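A sketch of (inverted) dropout applied to one activation vector at training time (the function name is invented; the 1/(1−rate) rescaling keeps the expected activation unchanged, which is the common convention, e.g. in PyTorch):

```python
import random

def dropout(activations, rate, rng=random):
    # With probability `rate`, zero a neuron's output; otherwise rescale
    # the surviving activation so the expected value stays the same.
    scale = 1.0 / (1.0 - rate)
    out = []
    for a in activations:
        if rng.random() < rate:
            out.append(0.0)        # neuron dropped
        else:
            out.append(a * scale)  # neuron kept and rescaled
    return out
```

At test time dropout is simply disabled (equivalent to `rate=0.0` here).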

16
Q

How does a dropout rate of 0.5 affect the number of learnable parameters in a layer of a NN?

A

It does not affect the number of learnable parameters: dropout only zeroes activations at training time and has no weights of its own

17
Q

Batch normalization: describe it briefly. Write the formulas for the procedure (2 steps) for one input dimension k. What is the key difference between computing the batch-normalization output at training and at test time? Name at least two benefits of batch normalization.

A

Batch normalization is a technique that normalizes the activations of a layer, making the training process more stable and accelerating convergence.

1 - Normalize: x̂^(k) = (x^(k) − μ_B^(k)) / sqrt((σ_B^(k))² + ε)
2 - Scale and shift through learnable γ and β: y^(k) = γ^(k) · x̂^(k) + β^(k)

The key difference is that during training the activations are normalized with the mean and variance of the current mini-batch, while at test time the mean and variance estimated over the training data are used instead.

Benefits: acts as a regularizer (reduces overfitting) and accelerates the learning process.
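The two training-time steps for a single feature dimension k can be sketched in pure Python (function name and the toy batch are illustrative):

```python
import math

def batch_norm(xs, gamma, beta, eps=1e-5):
    # Step 1: normalize with the mini-batch mean and variance
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    x_hat = [(x - mu) / math.sqrt(var + eps) for x in xs]
    # Step 2: scale and shift with learnable gamma and beta
    return [gamma * xh + beta for xh in x_hat]
```

With γ = 1 and β = 0 the output has (approximately) zero mean and unit variance over the batch; at test time mu and var would be replaced by running estimates collected during training.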