Topic 6: Machine Learning: Performance Evaluation, Backtesting & False Discoveries Flashcards
Describe a ranking classifier
A classifier that assigns a score to each instance so that instances can be ranked from most to least likely positive (classifier + threshold = single confusion matrix)
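A minimal sketch of "classifier + threshold = single confusion matrix". The scores, labels, and the helper name `confusion_matrix` are made up for illustration, and the convention that a score at or above the threshold counts as positive is an assumption.

```python
def confusion_matrix(scores, labels, threshold):
    """Count TP, FP, TN, FN when instances scoring >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 0:
            tn += 1
        else:
            fn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical scores from a ranking classifier and the true labels (1 = positive).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

strict = confusion_matrix(scores, labels, threshold=0.7)   # conservative threshold
loose = confusion_matrix(scores, labels, threshold=0.35)   # permissive threshold
print(strict)  # {'TP': 2, 'FP': 0, 'TN': 3, 'FN': 1}
print(loose)   # {'TP': 3, 'FP': 1, 'TN': 2, 'FN': 0}
```

The same scores give a different confusion matrix for each threshold, which is why a ranking classifier corresponds to a whole curve rather than one point.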
Describe the ROC graph
Two-dimensional plot with the false positive rate on the x-axis and the true positive rate on the y-axis
Describe the ROC graph.
Y-axis shows the true positive rate (sensitivity) and the X-axis shows the false positive rate (1-specificity)
sensitivity = TP / (TP+FN)
specificity = TN / (TN+FP), so 1-specificity = FP / (FP+TN)
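One way to see how the ROC graph is built: lower the threshold one instance at a time and record the (false positive rate, true positive rate) pair at each step. This is a sketch on made-up data; the helper name `roc_points` is hypothetical and the code assumes no tied scores.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by lowering the threshold one instance at a time."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank instances from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing is classified positive
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The curve starts at (0, 0) and ends at (1, 1); each positive moves it up and each negative moves it right.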
Describe the four corners and the diagonal of the ROC graph.
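Bottom left: conservative (only makes positive classifications with strong evidence, so few false positives but also few true positives)
Upper right: permissive (makes positive classifications with weak evidence, so most positives are caught but the false positive rate is high)
Upper left: perfect classification (true positive rate of 1, false positive rate of 0)
Lower right: worse than random (true positive rate of 0, false positive rate of 1)
Diagonal: random guessing (true positive rate equals false positive rate)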
Define the hit rate and false alarm rate.
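hit rate = percentage of positives correctly classified (TP/(TP+FN)), i.e. the true positive rate
false alarm rate = percentage of negatives incorrectly classified as positive (FP/(FP+TN)), i.e. the false positive rate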
Define the AUC measure.
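The area under the ROC curve (AUC) summarizes a model's detection performance in a single number that is independent of the detection threshold. It ranges from 0 to 1: 0.5 corresponds to random guessing and 1 to a perfect ranking.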
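A sketch of one standard way to compute AUC without choosing any threshold: it equals the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as half. The data and the helper name `auc_by_pairs` are made up for illustration.

```python
def auc_by_pairs(scores, labels):
    """AUC as the share of positive/negative pairs ranked in the right order."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

auc = auc_by_pairs(scores, labels)
print(round(auc, 3))  # 0.889 (8 of 9 pairs correctly ordered)
```

The pairwise loop is quadratic in the number of instances, which is fine for a small illustration but not for large data sets.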
Describe the cumulative response curve, also known as the lift curve.
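The cumulative response curve plots the hit rate (percentage of positives captured) as a function of the percentage of the population that is targeted, with instances targeted in order of decreasing score.
(e.g. targeting the top-scored 20% of test instances captures 60% of the positives)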
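A small sketch of how such a curve can be traced: rank instances by score, then record the share of all positives captured after targeting the top 1, 2, 3, ... instances. The data and the helper name `cumulative_response` are hypothetical.

```python
def cumulative_response(scores, labels):
    """Return (fraction of population targeted, fraction of positives captured) pairs."""
    total_positives = sum(labels)
    # Target instances in order of decreasing score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    captured = 0
    curve = [(0.0, 0.0)]
    for i, (_, label) in enumerate(ranked, start=1):
        captured += label
        curve.append((i / len(ranked), captured / total_positives))
    return curve


# Hypothetical scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

curve = cumulative_response(scores, labels)
for targeted, hit_rate in curve:
    print(f"targeted={targeted:.0%}  positives captured={hit_rate:.0%}")
```

In this toy data, targeting the top third of the population already captures two thirds of the positives; a random ordering would capture about one third.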
Describe why standard statistical tools, such as p-values and t-statistics,
can lead to false discoveries in the presence of multiple tests.
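When many tests are run, some will look significant purely by chance, so a single-test hurdle (e.g. p-value < 0.05 or t-statistic > 2) produces false positives (false discoveries). A tougher standard is needed as the number of tests grows.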
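A quick illustration of why the hurdle must rise with the number of tests. It assumes, for simplicity, that the tests are independent and that none of the strategies has any true effect; the function name is hypothetical.

```python
def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent true-null tests passes at level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


for n in (1, 10, 100):
    print(n, round(prob_at_least_one_false_positive(n), 3))
# 1 0.05
# 10 0.401
# 100 0.994
```

With 100 independent tests, at least one "significant" result is almost guaranteed even when nothing real is there.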
Calculate the t-statistic based on the reported Sharpe ratio for testing a
single trading strategy.
t-statistic = annualized Sharpe ratio × √(number of years)
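A worked example of the formula, as a sketch. The Sharpe ratio is assumed to be annualized, the input values are made up, and the p-value uses a two-sided normal approximation rather than an exact t-distribution.

```python
import math


def t_stat_from_sharpe(annual_sharpe, years):
    """t-statistic implied by an annualized Sharpe ratio observed over a number of years."""
    return annual_sharpe * math.sqrt(years)


def two_sided_p_value(t_stat):
    """Two-sided p-value under a normal approximation (an assumption of this sketch)."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# Hypothetical strategy: annualized Sharpe ratio of 1.0 over 4 years.
t_stat = t_stat_from_sharpe(annual_sharpe=1.0, years=4)
p_value = two_sided_p_value(t_stat)
print(t_stat)             # 2.0
print(round(p_value, 4))  # 0.0455
```

A t-statistic of about 2 clears the usual 5% hurdle for a single test, but not necessarily once the hurdle is adjusted for multiple tests.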
Describe and apply Bonferroni tests in the context of the family-wise error rate
(FWER) approach to adjusting p-values for multiple tests.
Approaches to the multiple testing problem in statistics:
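The Bonferroni test is an FWER approach: it controls the probability of making even one false discovery. Each p-value is compared with the hurdle 0.05 / number of tests (equivalently, the adjusted p-value is the original p-value × number of tests, capped at 1, and is compared with 0.05).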
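A minimal sketch of the Bonferroni adjustment on made-up p-values. The helper name `bonferroni` and the choice of "less than or equal to the hurdle" as the significance rule are assumptions of this example.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the common hurdle, adjusted p-values, and significance flags."""
    m = len(p_values)
    hurdle = alpha / m                               # same hurdle for every test
    adjusted = [min(p * m, 1.0) for p in p_values]   # equivalent adjusted p-values
    significant = [p <= hurdle for p in p_values]
    return hurdle, adjusted, significant


# Hypothetical p-values from four tested strategies.
p_values = [0.03, 0.005, 0.2, 0.015]
hurdle, adjusted, significant = bonferroni(p_values)
print(hurdle)       # 0.0125
print(significant)  # [False, True, False, False]
```

Only the strategy with p = 0.005 survives, even though three of the four would pass an unadjusted 0.05 hurdle.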
Recognize and apply the Holm function to calculate adjusted p-values
Holm hurdle for the k-th smallest p-value: p_k = 0.05 / (total number of tests + 1 - k). Order the p-values from smallest to largest and compare each with its own hurdle.
Describe the Holm method in the context of the false discovery rate (FDR) approach to adjusting p-values for multiple tests.
The Holm method (itself an FWER method) is less stringent than the Bonferroni method, and the false discovery rate (FDR) approach is less stringent than both of them.
Describe the process of accepting and rejecting tests using the Holm method.
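Order the p-values from smallest to largest and, starting with the smallest, compare each with its Holm hurdle. A test is accepted as significant while its p-value is below its hurdle; at the first p-value that exceeds its hurdle, that test and all remaining tests are rejected as not significant.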
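The Holm procedure can be sketched as follows: sort the p-values in ascending order, compare the k-th smallest with 0.05 / (M + 1 - k), where M is the number of tests, and stop at the first one that fails. The p-values and the helper name `holm` are made up for illustration.

```python
def holm(p_values, alpha=0.05):
    """Return a flag per test: True if the test is significant under Holm's procedure."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for k, idx in enumerate(order, start=1):
        hurdle = alpha / (m + 1 - k)
        if p_values[idx] <= hurdle:
            significant[idx] = True
        else:
            break  # first failure: this test and all later ones are not significant
    return significant


# Hypothetical p-values from four tested strategies.
p_values = [0.03, 0.005, 0.2, 0.015]
print(holm(p_values))  # [False, True, False, True]
```

Here p = 0.015 passes its Holm hurdle of 0.05/3 but would fail a Bonferroni hurdle of 0.05/4 = 0.0125, which shows how Holm is the less stringent of the two.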
Explain the relationship between avoiding false discoveries and missing
profitable opportunities.
Raising the hurdle when performing multiple tests decreases type I errors (false discoveries) but increases type II errors (missed discoveries, i.e. profitable strategies that are rejected).
Define specificity and sensitivity.
specificity = TN / (TN + FP)
sensitivity = TP / (TP + FN)
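A short worked example of both definitions, using made-up confusion-matrix counts; the function names are just for this sketch.

```python
def sensitivity(tp, fn):
    """Share of actual positives that are correctly classified (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that are correctly classified (true negative rate)."""
    return tn / (tn + fp)


# Hypothetical counts: 50 actual positives and 50 actual negatives.
tp, fn, tn, fp = 40, 10, 45, 5

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
false_positive_rate = fp / (fp + tn)  # equals 1 - specificity

print(sens)  # 0.8
print(spec)  # 0.9
print(false_positive_rate)  # 0.1
```

The last line ties the two measures to the ROC axes: sensitivity is the y-axis value and 1 - specificity is the x-axis value.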