Metrics Flashcards
What does R² (Coefficient of Determination) measure in regression?
R² measures the proportion of variance in the actual values that is explained by the model’s predictions, indicating goodness of fit.
What is Mean Absolute Error (MAE) used for in regression?
MAE measures the average absolute difference between predicted and actual values, indicating prediction accuracy.
What is Mean Squared Error (MSE), and why is it used?
MSE calculates the average squared differences between predicted and actual values, penalizing larger errors more heavily.
What does Root Mean Squared Error (RMSE) represent in regression?
RMSE is the square root of MSE, providing error measurements in the same units as the target variable.
What is Adjusted R², and how does it differ from R²?
Adjusted R² penalizes R² for the number of predictors in the model, so it does not increase simply because more variables are added, preventing overestimation of model performance.
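A minimal sketch of the regression metrics above, assuming scikit-learn and NumPy are available; the arrays and the single-predictor count p are made-up placeholders, and Adjusted R² is computed by hand from the standard formula.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])   # illustrative actual values
y_pred = np.array([2.8, 5.4, 2.9, 6.1])   # illustrative predictions

mae = mean_absolute_error(y_true, y_pred)      # average |error|
mse = mean_squared_error(y_true, y_pred)       # average squared error
rmse = np.sqrt(mse)                            # same units as the target
r2 = r2_score(y_true, y_pred)                  # variance explained

n, p = len(y_true), 1                          # p = number of predictors (assumed)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # Adjusted R^2

print(mae, mse, rmse, r2, adj_r2)
```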
What does Accuracy measure in classification tasks?
Accuracy is the proportion of correctly classified instances out of the total instances.
What is Precision in classification metrics?
Precision is the ratio of true positive predictions to all positive predictions, reflecting how trustworthy the model’s positive predictions are (also called positive predictive value).
What does Recall indicate in classification?
Recall, or sensitivity, is the ratio of true positive predictions to all actual positives, measuring how completely the model finds the positive class.
What is the F1-Score used for in classification?
The F1-Score is the harmonic mean of Precision and Recall, balancing the two in cases of class imbalance.
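A minimal sketch of Accuracy, Precision, Recall, and F1 with scikit-learn; the binary label arrays are made-up placeholders.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(accuracy_score(y_true, y_pred))   # correct / total
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```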
What does the Area Under ROC Curve (AUC) signify?
AUC measures a classifier’s ability to distinguish between classes across all decision thresholds, with 1 being perfect and 0.5 equivalent to random guessing.
What is the Area Under Precision-Recall Curve (PR AUC)?
PR AUC evaluates classifier performance for imbalanced datasets by focusing on Precision and Recall trade-offs.
What information does a Confusion Matrix provide?
A Confusion Matrix summarizes classification predictions, showing counts of true positives, false positives, false negatives, and true negatives.
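A short sketch of ROC AUC, PR AUC (average precision), and a confusion matrix with scikit-learn; the labels, scores, and the 0.5 threshold are illustrative assumptions.

```python
from sklearn.metrics import roc_auc_score, average_precision_score, confusion_matrix

y_true   = [1, 0, 1, 1, 0, 0, 1, 0]
y_scores = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]   # predicted probability of the positive class
y_pred   = [1 if s >= 0.5 else 0 for s in y_scores]   # hard labels at an assumed 0.5 threshold

print(roc_auc_score(y_true, y_scores))            # threshold-free class separation
print(average_precision_score(y_true, y_scores))  # PR AUC (average precision)
print(confusion_matrix(y_true, y_pred))           # [[TN, FP], [FN, TP]] for binary labels
```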
What is Mean Average Precision (MAP) used for in ranking tasks?
MAP averages, over all queries, the precision computed at each rank where a relevant item appears, rewarding rankings that place relevant items early.
What does Mean Reciprocal Rank (MRR) measure?
MRR calculates the average of reciprocal ranks for the first relevant item in ranked results, indicating query response quality.
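A hand-rolled sketch of MAP and MRR over binary relevance lists (one inner list per query); the data are placeholders, and the average-precision helper assumes every relevant item appears somewhere in the ranked list.

```python
def average_precision(relevances):
    hits, score = 0, 0.0
    for i, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            score += hits / i           # precision at each relevant rank
    return score / max(hits, 1)         # assumes all relevant items are in the list

def mean_average_precision(queries):
    return sum(average_precision(q) for q in queries) / len(queries)

def mean_reciprocal_rank(queries):
    total = 0.0
    for q in queries:
        for i, rel in enumerate(q, start=1):
            if rel:
                total += 1 / i          # reciprocal rank of the first relevant item
                break
    return total / len(queries)

ranked = [[1, 0, 1, 0], [0, 1, 0, 0]]   # illustrative relevance per rank, per query
print(mean_average_precision(ranked))   # ((1 + 2/3)/2 + 1/2) / 2 ≈ 0.67
print(mean_reciprocal_rank(ranked))     # (1/1 + 1/2) / 2 = 0.75
```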
What is the Silhouette Score in clustering?
The Silhouette Score measures how well data points fit within their clusters compared to other clusters, ranging from -1 to 1.
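A minimal Silhouette Score sketch with scikit-learn; the toy points and the choice of two KMeans clusters are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.array([[1, 1], [1.2, 0.8], [0.9, 1.1], [8, 8], [8.2, 7.9], [7.8, 8.1]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print(silhouette_score(X, labels))   # close to 1 => tight, well-separated clusters
```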
What is the purpose of Macro-Averaged Precision in multi-class classification?
Macro-Averaged Precision computes the unweighted average Precision across all classes, treating each class equally.
What is Micro-Averaged Recall used for in multi-class classification?
Micro-Averaged Recall aggregates true positives and false negatives across the whole dataset, giving equal weight to each instance.
What does Macro-Averaged F1-Score indicate in multi-class metrics?
Macro-Averaged F1-Score calculates the F1-Score for each class and averages them, providing a balanced view of performance.
What does a Multi-Class Confusion Matrix show?
A Multi-Class Confusion Matrix shows prediction counts for all possible class pairings, highlighting misclassification patterns.
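A sketch of macro vs. micro averaging and a multi-class confusion matrix via scikit-learn’s average= parameter; the three-class label arrays are placeholders.

```python
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 2, 2, 2, 1, 0, 1, 1]

print(precision_score(y_true, y_pred, average="macro"))  # unweighted mean of per-class precision
print(recall_score(y_true, y_pred, average="micro"))     # pooled TP / (TP + FN) over all instances
print(f1_score(y_true, y_pred, average="macro"))         # per-class F1, then averaged
print(confusion_matrix(y_true, y_pred))                  # rows = true class, columns = predicted class
```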
What does Precision@k measure?
The fraction of relevant items in the top k results.
What is the main difference between Precision@k and Recall@k?
Precision@k measures the fraction of the top k results that are relevant, while Recall@k measures the fraction of all relevant items that appear in the top k.
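A hand-rolled sketch of Precision@k and Recall@k; the ranked document ids and the relevant set are made-up placeholders.

```python
def precision_at_k(ranked, relevant, k):
    top_k = ranked[:k]
    return sum(item in relevant for item in top_k) / k

def recall_at_k(ranked, relevant, k):
    top_k = ranked[:k]
    return sum(item in relevant for item in top_k) / len(relevant)

ranked = ["d3", "d1", "d7", "d2", "d5"]     # ranked results for one query
relevant = {"d1", "d2", "d9"}               # full relevant set for that query
print(precision_at_k(ranked, relevant, 3))  # 1/3: one relevant item in the top 3
print(recall_at_k(ranked, relevant, 3))     # 1/3: one of three relevant items retrieved
```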
What does Mean Average Precision (mAP) reward in ranking?
Placing relevant items near the top of the ranking, averaged across multiple queries.
What does Mean Reciprocal Rank (MRR) focus on?
How early the first relevant item appears in the ranking.
What is the formula for Discounted Cumulative Gain (DCG)?
DCG = Σ rel_i / log2(i + 1), summed over ranks i = 1..k, where rel_i is the graded relevance of the item at rank i.
Why is Normalized DCG (NDCG) useful?
It normalizes DCG by the ideal DCG (IDCG), making scores comparable across queries and ranking lengths.
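A hand-rolled sketch of DCG and NDCG following the formula above; the graded relevance list is a placeholder.

```python
import math

def dcg(relevances):
    # rel_i / log2(i + 1) for ranks i = 1..k
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    ideal = dcg(sorted(relevances, reverse=True))  # IDCG: best possible ordering
    return dcg(relevances) / ideal if ideal > 0 else 0.0

rels = [3, 2, 0, 1]            # graded relevance of the ranked results
print(dcg(rels))               # 3/log2(2) + 2/log2(3) + 0 + 1/log2(5)
print(ndcg(rels))              # 1.0 only if the ranking matches the ideal order
```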
What does Kendall’s Tau measure in ranking evaluation?
The correlation between two rankings, based on the number of concordant versus discordant pairs (equivalently, the pairwise swaps needed to turn one ranking into the other).
What is Spearman’s Rank Correlation used for?
Evaluating ranking similarity as the correlation between the rank positions of the predicted and ideal orderings (a Pearson correlation computed on ranks).
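A sketch of both rank correlations using SciPy’s kendalltau and spearmanr; the two rank lists are illustrative placeholders.

```python
from scipy.stats import kendalltau, spearmanr

predicted_ranks = [1, 2, 3, 4, 5]
ideal_ranks     = [2, 1, 3, 5, 4]

tau, _ = kendalltau(predicted_ranks, ideal_ranks)  # based on concordant vs. discordant pairs
rho, _ = spearmanr(predicted_ranks, ideal_ranks)   # Pearson correlation of the rank values
print(tau, rho)
```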
What does ROC-AUC measure in ranking?
How well the ranking separates relevant and irrelevant items.
When is F1-Score useful in ranking?
When binary relevance labels are used, balancing precision and recall.
When should one choose MAE over MSE?
When the data contains or may contain outliers; MAE penalizes errors linearly rather than quadratically, so it is less dominated by a few large misses.
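A tiny numeric illustration (with made-up values) of why MAE is preferred when outliers are present: a single large miss inflates MSE far more than MAE.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true  = np.array([10, 12, 11, 13, 12])
clean   = np.array([11, 12, 10, 13, 12])   # small errors everywhere
outlier = np.array([11, 12, 10, 13, 42])   # one 30-unit miss

print(mean_absolute_error(y_true, clean), mean_squared_error(y_true, clean))      # 0.4, 0.4
print(mean_absolute_error(y_true, outlier), mean_squared_error(y_true, outlier))  # 6.4, 180.4
# MAE grows roughly linearly with the outlier; MSE grows quadratically.
```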