General ML Knowledge Flashcards
Linear Model
A model whose prediction is a linear combination of the features. In other words, the outcome is a weighted sum of the inputs (predictors), with the parameters (betas) as the weights, usually plus an intercept.
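As a minimal sketch of the weighted sum described above (the function name `predict` and the numbers are made up for illustration):

```python
def predict(features, betas, intercept=0.0):
    """Return intercept + sum(beta_i * x_i) for one observation."""
    if len(features) != len(betas):
        raise ValueError("features and betas must have the same length")
    return intercept + sum(b * x for b, x in zip(betas, features))

# Hypothetical parameters: two features, one intercept.
y_hat = predict([2.0, 3.0], betas=[0.5, -1.0], intercept=1.0)
print(y_hat)  # 1.0 + 0.5*2.0 + (-1.0)*3.0 = -1.0
```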
Maximum Likelihood Estimation
A method of estimating the parameters of a model by choosing the values that maximize the likelihood of the observed data under that model.
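A small illustration, assuming a Bernoulli (coin-flip) model and made-up outcomes: a brute-force search over candidate values of p lands on the same answer as the known closed-form estimate, the sample mean. The grid search is only for illustration; real estimators use the closed form or an optimizer.

```python
import math

def bernoulli_log_likelihood(p, outcomes):
    """Log-likelihood of 0/1 outcomes under a Bernoulli(p) model."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in outcomes)

outcomes = [1, 0, 1, 1, 0, 1, 1, 0]  # made-up data: 5 successes out of 8

# Candidate parameter values 0.001 ... 0.999 (endpoints excluded to avoid log(0)).
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, outcomes))

# Closed-form maximum likelihood estimate for a Bernoulli model: the sample mean.
p_closed_form = sum(outcomes) / len(outcomes)

print(p_grid, p_closed_form)
```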
F1 Score
Harmonic mean of precision and recall: 2 * precision * recall / (precision + recall)
Macro Average
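Unweighted mean of a per-class metric (e.g. precision, recall or ROC AUC): compute the metric for each class separately, then average, so every class counts equally regardless of its size.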
Micro Average
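Metric computed from counts pooled across all classes: sum the TP, FP and FN over the classes, then apply the formula once, so every sample counts equally and large classes dominate. (Weighting the per-class scores by class size is a separate "weighted" average.)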
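To make the difference concrete, here is a sketch using precision, TP/(TP+FP), on made-up per-class counts. The class names and numbers are hypothetical.

```python
# Hypothetical per-class (TP, FP) counts for a 3-class problem.
counts = {"cat": (90, 10), "dog": (5, 5), "bird": (1, 9)}

# Macro: compute precision per class, then take the plain mean.
per_class = [tp / (tp + fp) for tp, fp in counts.values()]
macro_precision = sum(per_class) / len(per_class)

# Micro: pool the counts over all classes, then compute precision once.
total_tp = sum(tp for tp, _ in counts.values())
total_fp = sum(fp for _, fp in counts.values())
micro_precision = total_tp / (total_tp + total_fp)

print(macro_precision, micro_precision)
```

The large "cat" class pulls the micro average up, while the macro average treats the two small, poorly predicted classes as equally important.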
Precision Formula
TP/(TP+FP)
Recall
TP/(TP+FN)
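The precision, recall and F1 cards above can be tied together in a few lines. This is a sketch on made-up counts; returning 0.0 when a denominator is zero is an assumed convention, not part of the definitions.

```python
def precision(tp, fp):
    """TP / (TP + FP); 0.0 if nothing was predicted positive (assumed convention)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """TP / (TP + FN); 0.0 if there are no actual positives (assumed convention)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

tp, fp, fn = 8, 2, 4  # made-up confusion-matrix counts
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
print(p, r, f1)
```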
Bias
Systematic error of the model: for a given input, how far the model's average prediction (over different training sets) is from the true value.
Variance
Sensitivity of the model to its training data: for a given input, how much the prediction varies when the model is trained on different samples.
Boosting
Sequential learning in which a series of weak learners is combined into a strong learner.
Unlike bagging, it does not rely on bootstrap resampling: in AdaBoost, for example, every learner sees the full training set, and examples misclassified so far are given larger weights so that subsequent learners focus on them.
Can be prone to overfitting, especially on noisy data.
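One concrete boosting algorithm is AdaBoost. The sketch below assumes 1-D inputs, labels in {-1, +1}, and threshold "stumps" as the weak learners; the helper names and the toy data are made up for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    """Weak learner: predict `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity

def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost_fit(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this learner's vote weight
        ensemble.append((alpha, threshold, polarity))
        # Increase weights of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate (+ + + - - - + + +).
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]
ensemble = adaboost_fit(xs, ys, rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
```

On this toy set the best single stump still misclassifies three of the nine points, while the weighted vote of three stumps classifies all of them correctly.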
Bagging
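Bootstrap aggregating: learners are trained independently on bootstrap samples (drawn from the training set with replacement), and their predictions are averaged or put to a vote. Mainly reduces variance.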
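A minimal bagging sketch under stated assumptions: 1-D inputs, labels 0/1, and a deliberately simple hypothetical weak learner (`fit_stump`) that assumes class 1 tends to have larger x than class 0. Only the resample-and-vote structure is the point here.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) examples from data with replacement."""
    return rng.choices(data, k=len(data))

def fit_stump(sample):
    """Toy learner: threshold halfway between the two class means.

    Assumes class 1 has the larger x values (true for the toy data below).
    """
    xs0 = [x for x, y in sample if y == 0]
    xs1 = [x for x, y in sample if y == 1]
    if not xs0 or not xs1:
        # The bootstrap sample happened to contain only one class.
        constant = 1 if xs1 else 0
        return lambda x: constant
    threshold = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2
    return lambda x: 1 if x > threshold else 0

def bagging_fit(data, n_learners=25, seed=0):
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(data, rng)) for _ in range(n_learners)]

def bagging_predict(learners, x):
    """Majority vote over the independently trained learners."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]

# Made-up training data: (x, label).
data = [(0.1, 0), (0.3, 0), (0.4, 0), (0.5, 0),
        (1.5, 1), (1.7, 1), (1.9, 1), (2.2, 1)]
ensemble = bagging_fit(data)
print(bagging_predict(ensemble, 0.2), bagging_predict(ensemble, 2.0))
```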
Underfitting
High bias, low variance. The model predicts consistently but incorrectly, because it is too simple to capture the pattern in the data.
Overfitting
Low bias, high variance. The model is very sensitive to noise in the training data, so its predictions are inconsistent and generalize poorly.
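Bias and variance can be estimated by simulation: train the same kind of model on many freshly drawn training sets and look at its predictions for one fixed input. The sketch below is a toy setup with an assumed true function x^2, Gaussian noise, and two hypothetical models: one that always predicts the mean label (underfits) and one that copies the label of the nearest training point (overfits).

```python
import random
import statistics

def true_f(x):
    return x * x  # assumed ground-truth function for the simulation

def make_training_set(rng, n=30, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0.0, noise_sd)) for x in xs]

def mean_model(train, x0):
    """Underfits: ignores x0 and always predicts the average label."""
    return statistics.fmean(y for _, y in train)

def nearest_neighbor_model(train, x0):
    """Overfits: copies the (noisy) label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x0))[1]

def bias_and_variance(model, x0, trials=500, seed=0):
    rng = random.Random(seed)
    preds = [model(make_training_set(rng), x0) for _ in range(trials)]
    bias = statistics.fmean(preds) - true_f(x0)  # average prediction vs truth
    variance = statistics.pvariance(preds)       # spread across training sets
    return bias, variance

x0 = 0.9
bias_under, var_under = bias_and_variance(mean_model, x0)
bias_over, var_over = bias_and_variance(nearest_neighbor_model, x0)
print("underfit:", bias_under, var_under)
print("overfit: ", bias_over, var_over)
```

The mean model should show the larger bias and the nearest-neighbor model the larger variance, matching the two cards above.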
Generative Models
P(X|Y) * P(Y) = P(X, Y): models that learn the joint probability distribution of features and labels, then classify via Bayes' rule.
e.g. Naive Bayes
As opposed to discriminative models, which model P(Y|X) directly.
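A sketch of how a generative model turns P(X|Y) and P(Y) into a prediction through Bayes' rule. The probabilities are made up, and X is a single hypothetical binary feature ("the message contains the word 'free'").

```python
# Made-up quantities a generative model would estimate from data.
prior = {"spam": 0.4, "ham": 0.6}       # P(Y)
likelihood = {"spam": 0.5, "ham": 0.1}  # P(X = contains "free" | Y)

# Joint probability P(X, Y) = P(X | Y) * P(Y) for the observed X.
joint = {y: likelihood[y] * prior[y] for y in prior}

# Bayes' rule: P(Y | X) = P(X, Y) / P(X), with P(X) summed over the classes.
evidence = sum(joint.values())
posterior = {y: joint[y] / evidence for y in joint}

predicted = max(posterior, key=posterior.get)
print(posterior, predicted)
```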
Discriminative Models
Models that learn P(Y|X) directly from the training data and predict the most likely class, without modeling how X itself is distributed. Logistic regression is an example.
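For contrast, a sketch of the prediction step of a one-feature logistic regression, which outputs P(Y=1|X) directly. The weight and bias values are hypothetical stand-ins for parameters that would normally be fitted to data.

```python
import math

def predict_proba(x, weight, bias):
    """P(Y=1 | X=x): the sigmoid of a linear function of x."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

def predict_class(x, weight, bias):
    """Pick the more likely class, i.e. threshold the probability at 0.5."""
    return 1 if predict_proba(x, weight, bias) >= 0.5 else 0

weight, bias = 2.0, -1.0  # hypothetical fitted parameters
print(predict_proba(1.5, weight, bias), predict_class(1.5, weight, bias))
```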