Machine Learning fundamental principles Flashcards
Four features and capabilities of Azure ML
Automated ML
Azure ML Designer
Data and compute management
Pipelines
Main groups of ML algorithms
Supervised (regression, classification)
Unsupervised (clustering)
Reinforcement learning
Regression model evaluation metrics
Mean Absolute Error (MAE)
Root Mean Squared Error (RMSE)
Relative Squared Error (RSE)
Relative Absolute Error (RAE)
Coefficient of Determination (R²)
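A minimal sketch of how the five regression metrics above relate to each other, assuming paired lists of actual and predicted values. The function name `regression_metrics` and the sample numbers are made up for illustration. RAE and RSE compare the model's error with that of a baseline that always predicts the mean, and R² is 1 minus RSE.

```python
import math

def regression_metrics(y_true, y_pred):
    """Hypothetical helper: MAE, RMSE, RAE, RSE and R2 for paired values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Baseline errors: what a model that always predicts the mean would score.
    abs_base = sum(abs(t - mean_true) for t in y_true)
    sq_base = sum((t - mean_true) ** 2 for t in y_true)
    rse = sq_err / sq_base
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_base,
        "RSE": rse,
        "R2": 1 - rse,
    }

# Illustrative data only (not from the flashcards).
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

Lower is better for MAE, RMSE, RAE and RSE; for R², values closer to 1 indicate a better fit.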
Clustering model evaluation metrics
Average distance to other center
Average distance to cluster center
Number of points
Maximal distance to cluster center
Combined evaluation
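As a rough illustration of three of the per-cluster metrics above, the sketch below computes the number of points, the average distance to the cluster center, and the maximal distance to the cluster center for one cluster. The helper name `cluster_summary` and the sample points are hypothetical, and Euclidean distance is an assumption.

```python
import math

def cluster_summary(points, center):
    """Hypothetical helper: summarise one cluster around a given center."""
    # Euclidean distance from every point to the cluster center (assumed metric).
    distances = [math.dist(p, center) for p in points]
    return {
        "number_of_points": len(points),
        "average_distance_to_center": sum(distances) / len(distances),
        "maximal_distance_to_center": max(distances),
    }

# Illustrative data only.
summary = cluster_summary([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)], center=(3.0, 4.0))
print(summary)
```

Smaller average and maximal distances suggest a tighter cluster.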
Classification model evaluation metrics
Accuracy
Precision
Recall (True positive rate)
F1 Score (harmonic mean of Precision and Recall)
Fall-out (False positive rate)
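The two less familiar metrics on this card can be sketched from confusion-matrix counts. The function names `f1_score` and `fall_out` and the counts are made up for illustration. F1 is the harmonic mean of precision and recall, and fall-out is the share of actual negatives that were wrongly flagged as positive, FP/(FP+TN).

```python
def f1_score(tp, fp, fn):
    """Hypothetical helper: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def fall_out(fp, tn):
    """Hypothetical helper: false positive rate, FP / (FP + TN)."""
    return fp / (fp + tn)

# Illustrative counts only.
f1 = f1_score(tp=40, fp=10, fn=20)
fpr = fall_out(fp=10, tn=30)
print(round(f1, 4), fpr)
```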
Classification model performance assessment
Confusion Matrix (True Positive, False Positive, True Negative, False Negative)
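A small sketch of how the four confusion-matrix cells can be counted from actual and predicted labels, assuming binary labels where 1 is the positive class. The helper `confusion_counts` and the label lists are illustrative, not taken from the source.

```python
def confusion_counts(actual, predicted, positive=1):
    """Hypothetical helper: count TP, FP, TN and FN for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

# Illustrative labels only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
print(counts)
```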
Regression ML algorithms
Linear Regression
Decision Forest Regression
Classification ML algorithms
Two-class logistic regression
Multiclass logistic regression
Two-class neural network
Clustering ML algorithms
K-means clustering
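For orientation, here is a bare-bones sketch of the K-means idea: repeatedly assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. It assumes 2-D points, Euclidean distance, caller-supplied starting centroids, and a fixed number of iterations. The function `kmeans` is a teaching sketch, not the implementation used by Azure ML.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Teaching sketch of K-means starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Illustrative data only: two visibly separate groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```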
ML Core tasks
- Data Ingestion
- Data preparation and Data transformation
- Feature selection and engineering
- Model training
- Evaluation (score, test)
- Model Deployment
- Model Management
Azure ML Studio options (coding)
Automated ML (no-code)
Azure ML Designer (low-code)
Notebooks (code)
Tasks to deploy a model after training (in order)
Create and test inference pipeline
Create inference cluster
Deploy inference pipeline
Test the service
Precision metric value calculation
Of the cases predicted positive, how many are actually positive
TP/(TP+FP)
out of all the patients that the model predicted as having diabetes, how many are actually diabetic?
Recall metric value calculation
TP/(TP+FN)
out of all the patients who actually have diabetes, how many did the model identify?
Accuracy metric value calculation
(TP+TN)/(TP+TN+FP+FN)
out of all predictions (diabetic and non-diabetic), what proportion did the model get right?
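A worked sketch of the precision, recall, and accuracy formulas on one hypothetical diabetes screening result. The counts below are invented for illustration: 100 patients, of whom 30 are true positives, 10 false positives, 20 false negatives, and 40 true negatives.

```python
# Hypothetical confusion-matrix counts for 100 patients (illustrative only).
tp, fp, fn, tn = 30, 10, 20, 40

# Precision: of the patients predicted diabetic, how many really are.
precision = tp / (tp + fp)

# Recall: of the patients who really are diabetic, how many were found.
recall = tp / (tp + fn)

# Accuracy: share of all predictions, positive and negative, that were right.
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(precision, recall, accuracy)
```

With these counts the model looks better on precision (0.75) than on recall (0.6), which shows why a single accuracy figure (0.7) can hide how many diabetic patients were missed.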