Developing Machine Learning Solutions Flashcards
End-to-end machine learning lifecycle process
1) Business goal identification
2) ML problem framing
3) Data processing (data collection, data preprocessing, and feature engineering)
4) Model development (training, tuning, and evaluation)
5) Model deployment (inference and prediction)
6) Model monitoring
7) Model retraining
Bias
The gap between the model's predicted values and the actual values. High bias means the model is systematically off, which is typical of underfitting.
Variance
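The dispersion of the model's predicted values: how much the predictions change when the model is trained on different samples of the data. High variance is typical of overfitting.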
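A minimal sketch of how bias and variance could be estimated for a single input, assuming you have predictions for that input from several models trained on different samples of the data. The function name `bias_and_variance` and the numbers are hypothetical and only for illustration.

```python
from statistics import mean, pvariance

def bias_and_variance(predictions, actual):
    """Estimate bias and variance for one input from repeated predictions.

    predictions: predictions for the same input from models trained on
                 different data samples (hypothetical values here).
    actual:      the true target value for that input.
    """
    avg_prediction = mean(predictions)
    bias = avg_prediction - actual      # systematic gap from the actual value
    variance = pvariance(predictions)   # spread around the average prediction
    return bias, variance

# Hypothetical predictions from five retrained models; the actual value is 10.
preds = [12.0, 11.0, 13.0, 12.0, 12.0]
bias, variance = bias_and_variance(preds, actual=10.0)
print(f"bias={bias:.2f}, variance={variance:.2f}")
```

Here the predictions cluster tightly but sit above the actual value, so the bias is large relative to the variance.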
Classification metrics
1) Accuracy 2) Precision 3) Recall 4) F1 5) AUC-ROC
Regression metrics
1) Mean squared error
2) R squared
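As a rough illustration of the two regression metrics, the sketch below computes them in plain Python. The function names and sample values are assumptions made for this example; in practice a library implementation would normally be used.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MSE={mse:.2f}, R squared={r2:.2f}")
```

A lower mean squared error is better, while an R squared closer to 1 means the model explains more of the variation in the actual values.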
Confusion matrix
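For classification models. A confusion matrix tabulates predicted classes against actual classes (true positives, false positives, true negatives, and false negatives), which helps show how and where a model gets predictions wrong.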
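A small sketch of how the four cells of a binary confusion matrix could be counted, assuming labels are encoded as 1 (positive) and 0 (negative). The function name `confusion_counts` and the sample labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # predicted positive, actually positive
        elif p == positive:
            fp += 1   # predicted positive, actually negative
        elif a == positive:
            fn += 1   # predicted negative, actually positive
        else:
            tn += 1   # predicted negative, actually negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

# Hypothetical labels for eight examples.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)
print(counts)
```

The false positive and false negative counts show which kind of mistake the model makes, which accuracy alone does not reveal.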
Model Accuracy or Score Equation
Correct predictions/Total number of predictions
Precision equation
Positive predictions that are correct / Total number of positive predictions, that is, true positives / (true positives + false positives). A useful metric when the cost of false positives is high.
Recall (sensitivity) equation
True positives / (True positives + False negatives). A useful metric when the cost of false negatives is high.
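The accuracy, precision, and recall equations above can be sketched from confusion matrix counts as follows. F1 is included as the harmonic mean of precision and recall. The function name `classification_metrics` and the counts are assumptions for illustration only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for 100 predictions.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example precision is higher than recall: most positive predictions are correct, but a third of the actual positives are missed.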
AUC-ROC
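For classification models. The ROC curve plots the true positive rate against the false positive rate at various classification thresholds, and AUC is the area under that curve. The true positive rate is sensitivity (recall); the false positive rate is 1 - specificity, where specificity is the true negative rate.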
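A minimal sketch of how the ROC points and the area under them could be computed by hand, assuming binary labels (1 for positive, 0 for negative) and a model score where higher means more likely positive. The function names and data are hypothetical, and the area uses a simple trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points,
    one per distinct score used as the classification threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical labels and model scores for six examples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = area_under_curve(points)
print(f"AUC={auc:.3f}")
```

An AUC of 1.0 means the scores rank every positive above every negative, while 0.5 is what random scoring would give on average.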
What is feature engineering?
Feature engineering transforms data into features or inputs that will be valuable for the model.
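Two common feature engineering steps, sketched in plain Python: scaling a numeric column to the range 0 to 1, and one-hot encoding a categorical column. The function names and sample data are assumptions for illustration; they are not tied to any particular library.

```python
def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Turn each category into a list of 0/1 flags, one per distinct category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical raw columns.
ages = [20, 30, 40]
colors = ["red", "blue", "red"]

scaled_ages = min_max_scale(ages)
encoded_colors = one_hot(colors)  # columns are ordered: blue, red
print(scaled_ages)
print(encoded_colors)
```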
Machine learning solution without code
SageMaker Canvas
What is MLOps
MLOps is a set of practices and principles that aims to manage the entire lifecycle of machine learning systems, from model development and training to deployment, monitoring, and maintenance. It provides a structured approach to streamlining the ML workflow, ensuring reliability, scalability, and reproducibility.
What is the machine learning lifecycle?
The end-to-end process of developing, deploying, and maintaining machine learning models, from identifying the business problem to deploying and monitoring the model.