Losses Flashcards
What is the purpose of MSE in regression?
MSE calculates the average squared difference between predicted and actual values, penalizing larger errors more heavily.
What does Mean Absolute Error (MAE) measure in regression tasks?
MAE measures the average absolute difference between predicted and actual values.
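Both can be sketched in a few lines of plain NumPy (illustrative helpers, not a framework API):

```python
import numpy as np

def mse(y_true, y_pred):
    # mean of squared differences: large errors dominate
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    # mean of absolute differences: errors contribute linearly
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])
print(mse(y_true, y_pred))  # 4/3: the single error of 2 is squared
print(mae(y_true, y_pred))  # 2/3: the same error counted linearly
```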
What is Log-Cosh Loss, and why is it used?
Log-Cosh Loss is the logarithm of the hyperbolic cosine of the prediction error. It behaves like MSE for small errors and like MAE for large errors, and it is differentiable everywhere.
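A possible NumPy sketch of that behaviour (illustrative, not a library function):

```python
import numpy as np

def log_cosh_loss(y_true, y_pred):
    # log(cosh(e)) ~ e^2 / 2 for small e, ~ |e| - log(2) for large e
    return np.mean(np.log(np.cosh(y_pred - y_true)))
```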
What is Binary Cross-Entropy used for?
Binary Cross-Entropy measures the difference between predicted probabilities and true labels (encoded as 1 and 0) for binary classification tasks.
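A minimal NumPy sketch (the clipping constant is an assumption for numerical safety):

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-12):
    # clip to avoid log(0) on fully confident predictions
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
```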
What does Categorical Cross-Entropy measure in classification?
Categorical Cross-Entropy is a loss function that evaluates a multi-class classifier by comparing predicted class probabilities with the one-hot encoded true labels.
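A hedged NumPy sketch over one-hot rows (not a framework API):

```python
import numpy as np

def categorical_cross_entropy(y_onehot, p, eps=1e-12):
    # negative log-probability assigned to the true class, averaged over samples
    return -np.mean(np.sum(y_onehot * np.log(p + eps), axis=1))
```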
What is Focal Loss designed to address in classification?
Focal Loss focuses on hard-to-classify examples by reducing the loss contribution from easy examples, helping with class imbalance.
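A binary-case sketch in plain NumPy (gamma and alpha defaults are common choices, not prescribed by this card):

```python
import numpy as np

def focal_loss(y_true, p, gamma=2.0, alpha=0.25, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y_true == 1, p, 1 - p)          # probability of the true class
    a = np.where(y_true == 1, alpha, 1 - alpha)   # class-balancing weight
    # (1 - pt)^gamma shrinks the loss of well-classified (high-pt) examples
    return -np.mean(a * (1 - pt) ** gamma * np.log(pt))
```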
What is the purpose of Triplet Loss in ranking tasks?
Triplet Loss optimizes the relative distances among anchor, positive, and negative samples to improve ranking models.
What is IoU Loss and how is it used in object detection?
IoU Loss measures the intersection-over-union between predicted and ground-truth bounding boxes; it is used in object detection.
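For two axis-aligned boxes this can be sketched directly (box layout is an assumption):

```python
def iou_loss(box_a, box_b):
    # boxes as [x1, y1, x2, y2]; loss = 1 - IoU
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return 1.0 - inter / (area_a + area_b - inter)
```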
What is Smooth L1 Loss, and where is it applied?
Smooth L1 Loss is L1 loss with a quadratic term for small errors.
It is used in object detection (bounding-box regression) because it is more robust to outliers than L2.
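A NumPy sketch of the piecewise definition (the beta threshold is a common parameterisation, assumed here):

```python
import numpy as np

def smooth_l1(y_true, y_pred, beta=1.0):
    d = np.abs(y_true - y_pred)
    # quadratic below beta, linear above: gradients stay bounded for outliers
    return np.mean(np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta))
```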
What is the role of Focal Loss in object detection?
Focal Loss helps address class imbalance by focusing training on hard-to-detect objects.
What is Dice Loss?
Dice Loss is 1 minus the Dice coefficient: twice the intersection of prediction and ground truth divided by the sum of their sizes.
What does Jaccard Loss measure?
Jaccard Loss, or IoU loss, is 1 minus the intersection over union of two sets.
What is Tversky Loss, and how does it differ from Dice Loss?
Tversky Loss generalizes Dice Loss by adding an option to give different weights to false positives and false negatives, making it suitable for imbalanced data.
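Since Tversky generalizes both, one NumPy sketch covers all three (alpha = beta = 0.5 recovers Dice, alpha = beta = 1 recovers Jaccard; the eps smoothing term is an assumption):

```python
import numpy as np

def tversky_loss(y_true, y_pred, alpha=0.5, beta=0.5, eps=1e-8):
    tp = np.sum(y_true * y_pred)          # overlap
    fp = np.sum((1 - y_true) * y_pred)    # predicted but absent
    fn = np.sum(y_true * (1 - y_pred))    # present but missed
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)
```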
What is Adversarial Loss in generative models?
Adversarial Loss is used in GANs to train the generator and the discriminator against each other in a minimax game.
What is Reconstruction Loss in generative models?
Reconstruction Loss measures how well the generated output matches the input, commonly used in Autoencoders.
What is KL Divergence?
KL Divergence is an asymmetric measure of the difference between two probability distributions.
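A NumPy sketch over discrete distributions, which also makes the asymmetry easy to check:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D_KL(P || Q) = sum p * log(p / q); not symmetric in p and q
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log((p + eps) / (q + eps)))
```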
What are Weighted Losses, and why are they used?
Weighted Losses assign different importance to specific samples or classes, addressing class imbalance or sample-specific priorities.
What is Multi-Task Loss in custom loss functions?
Multi-Task Loss combines losses from multiple objectives, balancing the training of different tasks in a single model.
What is the generator's goal in the adversarial loss of a GAN?
To minimise log(1-D(G(z)))
What is the discriminator's goal in the adversarial loss of a GAN?
To maximise log(D(x)) + log(1-D(G(z)))
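The two objectives from the last two cards, sketched as NumPy helpers over discriminator outputs (in practice both are minimised, so the discriminator's maximisation is written as a negated loss; the helper names are illustrative):

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    # maximise log D(x) + log(1 - D(G(z))) => minimise its negation
    return -np.mean(np.log(d_real + eps) + np.log(1 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    # minimise log(1 - D(G(z))): lower when the discriminator is fooled
    return np.mean(np.log(1 - d_fake + eps))
```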
Why are we using MSE for loss in reconstruction of images in Auto-Encoder?
Because we assume the pixel values follow a Gaussian distribution, and because we want to minimise the negative log-likelihood of the input given the reconstruction.
How does minimising the negative log-likelihood turn into MSE?
Plugging the Gaussian exponent into the log function gives back the squared distance between the output and the label
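Written out per pixel, assuming a fixed variance σ²:

```latex
p(x \mid \hat{x}) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x - \hat{x})^2}{2\sigma^2}\right)
\quad\Rightarrow\quad
-\log p(x \mid \hat{x}) = \frac{(x - \hat{x})^2}{2\sigma^2} + \log\!\left(\sigma\sqrt{2\pi}\right)
```

so minimising the negative log-likelihood is, up to a constant scale and offset, minimising the squared error.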
Which architectures usually use KL divergence loss?
VAEs
What is the interpretation of KL divergence?
The expected information loss from adopting a new distribution Q over the true distribution P.
The average difference of the number of bits required for encoding samples of P using a code optimised for Q rather than one optimised for P.
What is the main objective of triplet loss in ranking?
To ensure that a relevant item (positive) is ranked higher than an irrelevant item (negative) relative to a query (anchor).
What types of tasks commonly use triplet loss?
Face recognition, dense retrieval and recommender systems.
Why is triplet loss not commonly used in search engine ranking?
Because list-wise and pairwise losses better handle ranking order.
What is the formula for triplet loss?
L = sum(max(0, d(A, P) - d(A, N) + α))
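That formula, taking d as the squared Euclidean distance (one common choice, assumed here), as a NumPy sketch:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # d(A, P) should be smaller than d(A, N) by at least the margin
    d_ap = np.sum((anchor - positive) ** 2, axis=-1)
    d_an = np.sum((anchor - negative) ** 2, axis=-1)
    return np.sum(np.maximum(0.0, d_ap - d_an + margin))
```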