Fundamentals of Machine Learning I (Not Part of Certification) Flashcards
Characteristics of Regression?
- Predicts a numeric value (the label) from the features of an observation
- Splits the data into training and validation sets and penalizes prediction error to fit an effective mathematical function
- Supervised
- Metrics
- MAE, MSE, RMSE, R^2
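As a rough illustration of the four regression metrics listed above, the sketch below computes them with the standard library only. The function name `regression_metrics` and the sample values are made up for this example and are not from the course material.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean of absolute errors
    mse = sum(e * e for e in errors) / n    # mean of squared errors
    rmse = math.sqrt(mse)                   # back in the units of the label

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    r2 = 1 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical actual vs. predicted values (illustrative only)
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 10.0])
print(metrics)
```

In this toy data the single error of 2 contributes 4 of the 6 units of squared error, which is why MSE and RMSE react more strongly to large misses than MAE does.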
What is binary classification, and what are its characteristics?
- Prediction of one of two classes (e.g., diabetic vs. not diabetic)
- Supervised
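- Predicts the probability of the positive class
- Maps inputs to a probability with a sigmoid curve, then applies a threshold (commonly 0.5) to assign the class
- Logistic Regression is a binary classifier
- Trained on a random subset of the data
- Confusion matrices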
- F1, Recall, Precision, AUC, TPR, FPR, ROC
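The sigmoid-plus-threshold idea from the list above might be sketched as follows. The helper names `sigmoid` and `predict_class`, the 0.5 threshold, and the input scores are assumptions chosen for illustration; a trained logistic regression model would produce the score `z` from the features.

```python
import math

def sigmoid(z):
    """Squash any real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(z, threshold=0.5):
    """Return 1 (positive class) if the probability reaches the threshold, else 0."""
    return 1 if sigmoid(z) >= threshold else 0

# Hypothetical model scores (illustrative only)
scores = (-2.0, 0.0, 2.0)
probabilities = [sigmoid(z) for z in scores]
labels = [predict_class(z) for z in scores]
print(probabilities, labels)
```

Raising the threshold makes the classifier more reluctant to predict the positive class, which generally trades recall for precision.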
What is RMSE
RMSE (Root Mean Squared Error) The square root of MSE. Measures the
typical size of prediction errors in the same units as the label
What is MAE
MAE (Mean Absolute Error) Mean of the absolute differences between predicted and actual values
What is F1
F1 Score
- Harmonic mean of Precision and Recall.
- Formula: 2 * (Precision * Recall) / (Precision + Recall).
What is Recall
Recall (Sensitivity, TPR)
- Proportion of actual positives correctly identified by the model.
- Formula: TP / (TP + FN).
What is AUC
AUC (Area Under the Curve)
- Measures overall classification performance as the area under the ROC curve.
- Values range from 0 to 1, with 1 being perfect and 0.5 representing random guessing.
What is TPR
TPR (True Positive Rate, Recall)
- Proportion of actual positives classified as positive.
- Formula: TP / (TP + FN).
What is Precision
Precision
- Proportion of predicted positives that are actual positives.
- Formula: TP / (TP + FP).
What is FPR
FPR (False Positive Rate)
- Proportion of actual negatives incorrectly classified as positive.
- Formula: FP / (FP + TN).
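The formulas on the F1, Recall, Precision, and FPR cards can be checked against a small confusion matrix. This is a minimal sketch: the function name `confusion_counts` and the label lists are hypothetical, with 1 standing for the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN and TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical actual and predicted labels (illustrative only)
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

recall = tp / (tp + fn)        # TPR: actual positives that were found
precision = tp / (tp + fp)     # predicted positives that were right
fpr = fp / (fp + tn)           # actual negatives wrongly flagged
f1 = 2 * (precision * recall) / (precision + recall)

print(tp, fp, fn, tn, recall, precision, fpr, f1)
```

The sketch assumes every denominator is non-zero; real code would need to handle a model that predicts no positives at all.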
What is ROC
ROC (Receiver Operating Characteristic) Curve
- Plots TPR vs. FPR across different thresholds.
- Used to assess model performance over varying decision thresholds.
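To make the ROC and AUC cards concrete, the sketch below sweeps a threshold over predicted scores, records one (FPR, TPR) point per threshold, and estimates the area with the trapezoid rule. The function names `roc_points` and `auc` and the sample scores are assumptions for illustration; libraries such as scikit-learn provide their own tested implementations.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, starting at (0, 0)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step so more observations are predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve through the given points, by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical labels and predicted probabilities (illustrative only)
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points, area)
```

The sketch assumes the data contains at least one positive and one negative example, since otherwise TPR or FPR is undefined.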
What is R^2
R^2 (Coefficient of determination) Proportion of the variance in the
actual values that the model explains. Closer to 1 means a better fit
What is MSE
MSE (Mean squared error) Mean of the squared errors. Squaring
amplifies larger errors
What is multiclass classification?
Multiclass classification is used to predict which of multiple possible classes an observation belongs to. It calculates probability values for each class label and predicts the most probable class.
(Supervised)
What are the two types of algorithms used in multiclass classification?
The two types of algorithms are:
One-vs-Rest (OvR): Trains a binary classification function for each class, each predicting the probability that an observation belongs to that class rather than any other.
Multinomial: Creates a single function that returns a probability distribution for all possible classes.
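The difference between the two approaches can be sketched with toy numbers. Everything here is an assumption for illustration: the class names, the per-class scores, the use of a sigmoid for each OvR function, and the use of softmax as the multinomial function (a common choice, but not one the card names). In practice the two approaches would also learn different scores; the same scores are reused here only to compare the outputs.

```python
import math

def sigmoid(z):
    """Probability from a single binary (class vs. rest) score."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn a list of scores into one probability distribution that sums to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical classes and per-class scores (illustrative only)
class_names = ["adelie", "gentoo", "chinstrap"]
scores = [1.0, 2.5, 0.3]

# One-vs-Rest: one independent probability per class; they need not sum to 1
ovr_probs = [sigmoid(s) for s in scores]
ovr_prediction = class_names[ovr_probs.index(max(ovr_probs))]

# Multinomial: a single function returns a distribution over all classes
multi_probs = softmax(scores)
multi_prediction = class_names[multi_probs.index(max(multi_probs))]

print(ovr_probs, ovr_prediction)
print(multi_probs, multi_prediction)
```

In both cases the predicted class is the one with the highest probability, matching the description on the multiclass classification card.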