Losses Flashcards
What is the purpose of MSE in regression?
MSE calculates the average squared difference between predicted and actual values, penalizing larger errors more heavily.
What does Mean Absolute Error (MAE) measure in regression tasks?
MAE measures the average absolute difference between predicted and actual values.
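Both can be sketched in a few lines of plain NumPy (illustrative helpers, not a framework API):

```python
import numpy as np

def mse(y_true, y_pred):
    # mean of squared differences: large errors dominate
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    # mean of absolute differences: errors contribute linearly
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])
print(mse(y_true, y_pred))  # 4/3: the single error of 2 is squared
print(mae(y_true, y_pred))  # 2/3: the same error counted linearly
```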
What is Log-Cosh Loss, and why is it used?
Log-Cosh Loss is the logarithm of the hyperbolic cosine of the prediction error. It behaves like MSE for small errors and like MAE for large errors, and it is differentiable everywhere.
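A possible NumPy sketch of that behaviour (illustrative, not a library function):

```python
import numpy as np

def log_cosh_loss(y_true, y_pred):
    # log(cosh(e)) ~ e^2 / 2 for small e, ~ |e| - log(2) for large e
    return np.mean(np.log(np.cosh(y_pred - y_true)))
```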
What is Binary Cross-Entropy used for?
Binary Cross-Entropy measures the difference between predicted probabilities and true labels (encoded as 1 and 0) for binary classification tasks.
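A minimal NumPy sketch (the clipping constant is an assumption for numerical safety):

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-12):
    # clip to avoid log(0) on fully confident predictions
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
```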
What does Categorical Cross-Entropy measure in classification?
Categorical Cross-Entropy is a loss function that evaluates a multi-class classifier by comparing predicted class probabilities with the one-hot encoded true labels.
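A hedged NumPy sketch over one-hot rows (not a framework API):

```python
import numpy as np

def categorical_cross_entropy(y_onehot, p, eps=1e-12):
    # negative log-probability assigned to the true class, averaged over samples
    return -np.mean(np.sum(y_onehot * np.log(p + eps), axis=1))
```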
What is Focal Loss designed to address in classification?
Focal Loss focuses on hard-to-classify examples by reducing the loss contribution from easy examples, helping with class imbalance.
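A binary-case sketch in plain NumPy (gamma and alpha defaults are common choices, not prescribed by this card):

```python
import numpy as np

def focal_loss(y_true, p, gamma=2.0, alpha=0.25, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y_true == 1, p, 1 - p)          # probability of the true class
    a = np.where(y_true == 1, alpha, 1 - alpha)   # class-balancing weight
    # (1 - pt)^gamma shrinks the loss of well-classified (high-pt) examples
    return -np.mean(a * (1 - pt) ** gamma * np.log(pt))
```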
What is the purpose of Triplet Loss in ranking tasks?
Triplet Loss optimizes the relative distances among anchor, positive, and negative samples to improve ranking models.
What is IoU Loss and how is it used in object detection?
IoU Loss measures the intersection-over-union between predicted and ground-truth bounding boxes; it is used in object detection.
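For two axis-aligned boxes this can be sketched directly (box layout is an assumption):

```python
def iou_loss(box_a, box_b):
    # boxes as [x1, y1, x2, y2]; loss = 1 - IoU
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return 1.0 - inter / (area_a + area_b - inter)
```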
What is Smooth L1 Loss, and where is it applied?
Smooth L1 Loss is L1 loss with a quadratic term for small errors.
It is used in object detection (bounding-box regression) because it is more robust to outliers than L2.
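A NumPy sketch of the piecewise definition (the beta threshold is a common parameterisation, assumed here):

```python
import numpy as np

def smooth_l1(y_true, y_pred, beta=1.0):
    d = np.abs(y_true - y_pred)
    # quadratic below beta, linear above: gradients stay bounded for outliers
    return np.mean(np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta))
```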
What is the role of Focal Loss in object detection?
Focal Loss helps address class imbalance by focusing training on hard-to-detect objects.
What is Dice Loss?
Dice Loss is 1 minus the Dice coefficient: twice the intersection of prediction and ground truth divided by the sum of their sizes.
What does Jaccard Loss measure?
Jaccard Loss, or IoU loss, is 1 minus the intersection over union of two sets.
What is Tversky Loss, and how does it differ from Dice Loss?
Tversky Loss generalizes Dice Loss by adding an option to give different weights to false positives and false negatives, making it suitable for imbalanced data.
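Since Tversky generalizes both, one NumPy sketch covers all three (alpha = beta = 0.5 recovers Dice, alpha = beta = 1 recovers Jaccard; the eps smoothing term is an assumption):

```python
import numpy as np

def tversky_loss(y_true, y_pred, alpha=0.5, beta=0.5, eps=1e-8):
    tp = np.sum(y_true * y_pred)          # overlap
    fp = np.sum((1 - y_true) * y_pred)    # predicted but absent
    fn = np.sum(y_true * (1 - y_pred))    # present but missed
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)
```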
What is Adversarial Loss in generative models?
Adversarial Loss is used in GANs to train the generator and the discriminator against each other in a minimax game.
What is Reconstruction Loss in generative models?
Reconstruction Loss measures how well the generated output matches the input, commonly used in Autoencoders.
What is KL Divergence?
KL Divergence is an asymmetric measure of the difference between two probability distributions.
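A NumPy sketch over discrete distributions, which also makes the asymmetry easy to check:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D_KL(P || Q) = sum p * log(p / q); not symmetric in p and q
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log((p + eps) / (q + eps)))
```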
What are Weighted Losses, and why are they used?
Weighted Losses assign different importance to specific samples or classes, addressing class imbalance or sample-specific priorities.
What is Multi-Task Loss in custom loss functions?
Multi-Task Loss combines losses from multiple objectives, balancing the training of different tasks in a single model.
What is the generator's goal in the adversarial loss of a GAN?
To minimise log(1-D(G(z)))
What is the discriminator's goal in the adversarial loss of a GAN?
To maximise log(D(x)) + log(1-D(G(z)))
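The two objectives from the last two cards, sketched as NumPy helpers over discriminator outputs (in practice both are minimised, so the discriminator's maximisation is written as a negated loss; the helper names are illustrative):

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    # maximise log D(x) + log(1 - D(G(z))) => minimise its negation
    return -np.mean(np.log(d_real + eps) + np.log(1 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    # minimise log(1 - D(G(z))): lower when the discriminator is fooled
    return np.mean(np.log(1 - d_fake + eps))
```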
Why are we using MSE for loss in reconstruction of images in Auto-Encoder?
Because we assume the pixel values follow a Gaussian distribution, and because we want to minimise the negative log-likelihood of the input given the reconstruction.
How does minimising the negative log-likelihood turn into MSE?
Plugging the Gaussian exponent into the log function gives back the squared distance between the output and the label
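Written out per pixel, assuming a fixed variance σ²:

```latex
p(x \mid \hat{x}) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x - \hat{x})^2}{2\sigma^2}\right)
\quad\Rightarrow\quad
-\log p(x \mid \hat{x}) = \frac{(x - \hat{x})^2}{2\sigma^2} + \log\!\left(\sigma\sqrt{2\pi}\right)
```

so minimising the negative log-likelihood is, up to a constant scale and offset, minimising the squared error.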
Which architectures usually use KL divergence loss?
VAEs
What is the interpretation of KL divergence?
The expected information loss from adopting a new distribution Q over the true distribution P.
The average difference of the number of bits required for encoding samples of P using a code optimised for Q rather than one optimised for P.
What is the main objective of triplet loss in ranking?
To ensure that a relevant item (positive) is ranked higher than an irrelevant item (negative) relative to a query (anchor).
What types of tasks commonly use triplet loss?
Face recognition, dense retrieval and recommender systems.
Why is triplet loss not commonly used in search engine ranking?
Because list-wise and pairwise losses better handle ranking order.
What is the formula for triplet loss?
L = sum(max(0, d(A, P) - d(A, N) + α))
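That formula, taking d as the squared Euclidean distance (one common choice, assumed here), as a NumPy sketch:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # d(A, P) should be smaller than d(A, N) by at least the margin
    d_ap = np.sum((anchor - positive) ** 2, axis=-1)
    d_an = np.sum((anchor - negative) ** 2, axis=-1)
    return np.sum(np.maximum(0.0, d_ap - d_an + margin))
```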