Lecture 10: Performance Issues Flashcards
We have learnt the building blocks of a learning algorithm. What is the next missing step?
Validating how accurately the algorithm predicts novel data, given the limited data we have.
To do this, we can split the data into training, validation and test sets, and then cross-validate.
What are the common numbers of folds when partitioning data for TVT (train/validation/test) splitting?
2-, 4-, 5- and 10-fold cross-validation (CV)
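As a rough illustration of what k-fold partitioning means, the sketch below splits sample indices into k folds and lets each fold take one turn as the validation set. The function name `k_fold_indices` and its parameters are hypothetical, chosen for this example only, and the shuffling with a fixed seed is an assumption rather than something the lecture specifies.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the lecture.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: shuffle before splitting
    # Spread any remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]                 # held-out fold
        train_idx = indices[:start] + indices[start + size:]  # all other folds
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, val_idx in folds:
    print(len(train_idx), len(val_idx))  # each of the 5 lines prints: 8 2
```

Every sample lands in a validation fold exactly once, so averaging the k validation scores uses all of the limited data for both fitting and checking.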
What is the purpose of splitting the data three ways?
The training set is used to fit the model and the validation set to tune and compare models, so the test set contains data that the algorithm has never seen before. If the model predicts the test set well, we can say that it generalises well.
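A minimal sketch of a three-way split follows. The helper name `train_val_test_split` and the 60/20/20 proportions are assumptions made for illustration; the lecture does not state which proportions to use.

```python
import random

def train_val_test_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the data and cut it into train, validation and test parts.

    Hypothetical helper; the default fractions are illustrative assumptions.
    """
    items = list(data)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]                # never touched until the final evaluation
    val = items[n_test:n_test + n_val]   # used for tuning and model selection
    train = items[n_test + n_val:]       # used to fit the model
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

Because the three parts share no samples, a good score on `test` is evidence about unseen data rather than about data the model was fitted or tuned on.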
What are the two most common evaluation metrics used to measure the performance of regression learning algorithms?
Mean squared error (MSE) and mean absolute error (MAE)
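Both metrics average the prediction errors over the n samples: the mean squared error averages the squared differences between true and predicted values, and the mean absolute error averages their absolute values. The sketch below computes both by hand; the sample values are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalises large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences; in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up example values (not from the lecture).
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

mse = mean_squared_error(y_true, y_pred)   # (1 + 0 + 4 + 1) / 4 = 1.5
mae = mean_absolute_error(y_true, y_pred)  # (1 + 0 + 2 + 1) / 4 = 1.0
print(mse, mae)
```

Here the single error of size 2 contributes 4 of the 6 squared-error units, which shows how the mean squared error is dominated by large errors while the mean absolute error weights all errors linearly.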
What are the evaluation metrics for classification learning algorithms?
Confusion matrix for binary classification, cost matrix for binary classification, detection error trade-off (DET) curve, area under the receiver operating characteristic curve (ROC AUC), Gini coefficient
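To make a few of these metrics concrete, the sketch below counts the four cells of a binary confusion matrix, computes ROC AUC as the fraction of positive/negative pairs that the scores rank correctly, and derives the Gini coefficient using the common convention Gini = 2 × AUC − 1. The function names, the 0.5 threshold and the sample scores are assumptions for illustration; the lecture may define the Gini coefficient differently.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and classifier scores (not from the lecture).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]

y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumption: threshold at 0.5
cm = confusion_matrix(y_true, y_pred)            # tp=2, fp=1, fn=1, tn=2
auc = roc_auc(y_true, scores)                    # 7 of 9 pairs ranked correctly
gini = 2 * auc - 1                               # common convention, assumed here
print(cm, auc, gini)
```

The confusion matrix depends on the chosen threshold, whereas AUC and Gini summarise ranking quality across all thresholds. The pairwise AUC computation is quadratic in the number of samples, which is fine for a sketch but not for large data sets.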
What are some of the trade-offs involved in achieving high software quality?
Computational efficiency, maintainability, flexibility, extensibility, usability, etc.