ML/DL Evaluation Flashcards
Why is it important to evaluate and validate ML/DL models?
To ensure the model meets the desired goals and performs well on unseen data.
What are two common methods for splitting data for validation?
- Holdout strategy
- K-Fold Cross-Validation
What is the holdout strategy in model validation?
The dataset is split once into separate training, validation, and test sets: the model is fit on the training set, tuned on the validation set, and its generalization is estimated on the untouched test set.
What is a typical split ratio for holdout validation?
Training: 60%, Validation: 20%, Testing: 20%.
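A minimal sketch of a 60/20/20 holdout split, assuming scikit-learn and a synthetic dataset (the cards do not prescribe a library):

```python
# Minimal 60/20/20 holdout split (scikit-learn assumed; data is synthetic).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# First carve off 20% for the test set, then split the remaining 80%
# into 60% train / 20% validation (0.25 of 80% = 20% of the whole).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```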
How can we detect overfitting or underfitting?
By comparing training and validation performance:
- Overfitting: High training accuracy but low validation accuracy.
- Underfitting: Poor performance on both training and validation sets.
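A minimal sketch of spotting overfitting from the train/validation gap; the decision tree and synthetic data are illustrative assumptions:

```python
# Comparing train vs. validation accuracy to spot overfitting.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_informative=5, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# An unconstrained decision tree tends to memorize the training set.
model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print(f"train acc = {model.score(X_tr, y_tr):.2f}")   # typically ~1.00
print(f"val acc   = {model.score(X_val, y_val):.2f}")  # noticeably lower -> overfitting
```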
What are two solutions to overfitting?
- Early stopping
- Regularization (e.g., L2 regularization)
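A minimal sketch showing both remedies at once, assuming scikit-learn's SGDClassifier (an illustrative choice that exposes an L2 penalty and built-in early stopping):

```python
# Both remedies in one estimator: L2 penalty plus early stopping
# on an internal validation split.
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, random_state=0)

clf = SGDClassifier(
    penalty="l2", alpha=1e-4,   # L2 regularization strength
    early_stopping=True,        # hold out part of the data internally
    validation_fraction=0.2,
    n_iter_no_change=5,         # stop after 5 epochs without improvement
    random_state=0,
)
clf.fit(X, y)
print(clf.n_iter_)  # epochs actually run before stopping
```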
What is K-Fold Cross-Validation?
A technique where the dataset is divided into K equally sized subsets (folds); the model is trained K times, each time holding out a different fold for validation and training on the remaining K−1 folds, and the K scores are averaged.
Why use K-Fold Cross-Validation?
It gives a less split-dependent performance estimate: every data point is used for validation exactly once and for training K−1 times, so the result does not hinge on one particular split.
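A minimal 5-fold cross-validation sketch, assuming scikit-learn; the logistic regression model and synthetic data are illustrative choices:

```python
# 5-fold cross-validation: 5 train/validate rounds, scores averaged.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(scores)         # one accuracy per fold
print(scores.mean())  # the cross-validated estimate
```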
Why is hyperparameter optimization important?
It helps improve model performance and generalization.
What are two common hyperparameter tuning methods?
- Grid Search
- Random Search
How does Grid Search work?
It exhaustively evaluates every combination of the candidate values specified for each hyperparameter.
What are the pros and cons of Grid Search?
- Pros: Thorough and systematic.
- Cons: Computationally expensive for large parameter spaces.
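A minimal Grid Search sketch, assuming scikit-learn's GridSearchCV; the SVC model and candidate values are illustrative:

```python
# Grid Search over a small, explicit grid: every C/gamma combination
# (3 x 3 = 9) is evaluated with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```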
How does Random Search work?
It evaluates a fixed number of hyperparameter combinations sampled at random from specified ranges or distributions.
What are the pros and cons of Random Search?
- Pros: More efficient for high-dimensional spaces.
- Cons: No guarantee of finding the absolute best combination.
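A minimal Random Search sketch, assuming scikit-learn's RandomizedSearchCV and SciPy distributions; the ranges are illustrative:

```python
# Random Search: 20 C/gamma pairs sampled from continuous log-uniform
# ranges instead of an enumerated grid.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

param_dist = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e1)}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=20, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```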
What are the two main types of classification?
- Binary classification
- Multi-class classification
What is a confusion matrix?
A table that summarizes the performance of a classification model.
What are the four key components of a confusion matrix?
- True Positives (TP)
- False Negatives (FN)
- False Positives (FP)
- True Negatives (TN)
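A minimal confusion-matrix sketch on made-up labels, assuming scikit-learn:

```python
# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]].
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_matrix(y_true, y_pred))  # [[3 1]
                                         #  [1 3]]
```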
Why is accuracy not always a reliable metric?
Accuracy can be misleading for imbalanced datasets: a model that always predicts the majority class of a 99:1 dataset scores 99% accuracy while never detecting the minority class.
How is precision calculated?
Precision = TP / (TP + FP)
How is recall calculated?
Recall = TP / (TP + FN)
What is the F1-score?
The harmonic mean of precision and recall:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
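A minimal sketch computing all three metrics on the same made-up labels as the confusion-matrix example, assuming scikit-learn (TP=3, FP=1, FN=1, so all three come out to 0.75):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 1) = 0.75
print(f1_score(y_true, y_pred))         # 2 * 0.75 * 0.75 / 1.5 = 0.75
```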
What is AUC (Area Under the ROC Curve)?
The area under the ROC curve (true positive rate plotted against false positive rate); it summarizes classifier performance across all decision thresholds, where 1.0 is perfect and 0.5 is no better than random guessing.
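A minimal AUC sketch with made-up scores, assuming scikit-learn:

```python
# AUC takes predicted scores or probabilities (not hard labels),
# so it evaluates the ranking across every possible threshold.
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc_score(y_true, y_scores))  # 0.75
```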
What is the goal of a regression model?
To predict a continuous target variable based on input features.
What are two common regression metrics?
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
How do MAE and MSE differ?
MSE squares the errors, so it penalizes large errors more heavily; MAE averages the absolute errors and is less sensitive to outliers.
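A minimal sketch of both metrics on made-up values, assuming scikit-learn; the single error of 2.0 dominates the MSE:

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
# absolute errors: 0.5, 0.0, 2.0, 1.0
print(mean_absolute_error(y_true, y_pred))  # 0.875
print(mean_squared_error(y_true, y_pred))   # 1.3125
```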
What is the Silhouette Coefficient?
A measure of how similar a data point is to its own cluster compared to the nearest neighboring cluster; it ranges from -1 to 1, with higher values indicating better-defined clusters.
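A minimal silhouette sketch, assuming scikit-learn and K-Means on synthetic blobs (both illustrative choices):

```python
# Silhouette Coefficient for a K-Means clustering of synthetic blobs.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Close to 1: points sit well inside their own cluster and far from others.
print(silhouette_score(X, labels))
```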
When should you use Grid Search?
For small, well-defined hyperparameter spaces.
When should you use Random Search?
For large or continuous hyperparameter spaces.
What is an alternative method for hyperparameter tuning?
Bayesian Optimization, which uses the results of previous trials to decide which hyperparameters to evaluate next; it is well suited to expensive training runs.
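A minimal Bayesian-style tuning sketch, assuming the third-party Optuna library (its default TPE sampler proposes new trials based on earlier results); the search space mirrors the Random Search example above:

```python
import optuna  # third-party library, assumed installed
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

def objective(trial):
    # Each trial proposes hyperparameters informed by previous results.
    c = trial.suggest_float("C", 1e-2, 1e2, log=True)
    gamma = trial.suggest_float("gamma", 1e-3, 1e1, log=True)
    return cross_val_score(SVC(C=c, gamma=gamma), X, y, cv=5).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params, study.best_value)
```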