Machine Learning Flashcards
What is accuracy?
Correct predictions / total predictions = (TP + TN) / (TP + TN + FP + FN)
What is precision?
TP/(TP+FP)
What is recall?
TP/(TP+FN)
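A minimal sketch of how accuracy, precision and recall fall out of the four confusion-matrix counts. The function names and the toy labels are made up for illustration, and recall here uses the standard definition TP/(TP+FN).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for binary labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    # Of everything predicted positive, how much was right?
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Of everything actually positive, how much was found?
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy data (hypothetical): 4 positives, 4 negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```

Here TP=2, FP=1, FN=2, TN=3, so accuracy is 5/8, precision is 2/3 and recall is 2/4.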
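What is ROC AUC?
ROC curve: receiver operating characteristic curve. A plot of the true positive rate (TPR, y-axis) against the false positive rate (FPR, x-axis) at every classification threshold; it helps in choosing a threshold.
AUC: area under the ROC curve. Higher is better (1.0 is a perfect ranking, 0.5 is no better than chance), so it can be used to compare classifiers.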
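A rough sketch of how the curve and its area can be computed by hand: sweep the threshold over the observed scores, record (FPR, TPR) at each one, then integrate with the trapezoid rule. The helper names `roc_points` and `auc` and the four-sample data set are illustrative only, and the code assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, starting at (0, 0)."""
    pos = sum(y_true)             # number of positives (labels are 0/1)
    neg = len(y_true) - pos       # number of negatives
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoid-rule area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
print(points)
print(auc(points))  # 0.75 for this toy data
```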
What is entropy?
H(X) = sum over x of p(x) * log(1 / p(x)) = -sum over x of p(x) * log p(x)
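A small sketch of the entropy sum for a discrete distribution. The function name and the choice of base 2 (so the result is in bits) are assumptions for illustration; zero-probability outcomes are skipped because they contribute nothing to the sum.

```python
import math


def entropy(probs, base=2):
    """Shannon entropy: sum of p * log(1/p) over outcomes with p > 0."""
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)


print(entropy([0.5, 0.5]))   # fair coin: 1 bit
print(entropy([1.0]))        # certain outcome: 0 bits
print(entropy([0.9, 0.1]))   # biased coin: less than 1 bit
```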
What does linear regression solve?
Regression problems: predicting a continuous target from one or more features.
How does linear regression work?
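It works by fitting a linear equation (for one feature, y = slope * x + intercept) to the observed data. The steps are:
- First, the sum of squared residuals (observed value minus predicted value, squared and summed) is written as a function of the coefficients.
- Then, this sum is minimized over the coefficients to find the best-fit line.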
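The two steps above can be sketched for the single-feature case, where the minimizer has a closed form. The function names and the data points are hypothetical; libraries solve the general multi-feature problem with linear algebra instead.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Step 1: the quantity to minimize."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Step 2: closed-form minimizer of the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
# Any other line gives a larger sum of squared residuals.
print(best < sum_squared_residuals(xs, ys, slope + 0.1, intercept))
```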
What are the parameters of linear regression?
- fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept is used in the calculations.
- normalize: Ignored when fit_intercept is set to False. If True, the regressors X are normalized before regression by subtracting the mean and dividing by the l2-norm. (In scikit-learn this parameter was deprecated in version 1.0 and removed in 1.2.)
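A hedged, single-feature sketch of what turning the intercept off means: the line is forced through the origin, which changes the slope. The function `fit_least_squares` is a hypothetical stand-in that only mimics the `fit_intercept` flag; it is not the library implementation.

```python
def fit_least_squares(xs, ys, fit_intercept=True):
    """Least-squares line for one feature; returns (slope, intercept)."""
    if fit_intercept:
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sxx = sum((x - mean_x) ** 2 for x in xs)
        slope = sxy / sxx
        return slope, mean_y - slope * mean_x
    # No intercept: minimize sum((y - slope * x)^2), line passes through (0, 0).
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return slope, 0.0


xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]  # generated from y = 2x + 1
print(fit_least_squares(xs, ys, fit_intercept=True))   # recovers slope 2, intercept 1
print(fit_least_squares(xs, ys, fit_intercept=False))  # slope 34/14, intercept 0
```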
What does Naive Bayes solve?
Classification problems
How does Naive Bayes work?
It applies Bayes’ theorem with the "naive" assumption that every pair of features is conditionally independent given the class. Naive Bayes classifiers work well in many real-world situations such as document classification and spam filtering.
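A minimal sketch of the idea, assuming Gaussian likelihoods for each feature (one common variant; the card does not specify one). The independence assumption shows up as a plain sum of per-feature log-likelihoods. All names, the small constant `eps` added to the variances, and the toy data are illustrative.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate per-class prior, feature means and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # eps keeps variances strictly positive (assumed stabilizer).
        variances = [sum((v - m) ** 2 for v in col) / n + eps
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Pick the class maximizing log prior + sum of per-feature log-likelihoods."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Independence given the class: log-likelihoods simply add up.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.0]), predict(model, [3.1, 4.0]))
```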
What are the parameters of NB?
- priors: Prior probabilities of the classes. If specified, the priors are not adjusted according to the data.
- var_smoothing: Portion of the largest variance of all features that is added to the variances for calculation stability.
What does SVM solve?
Classification and regression problems
How does SVM work?
It works by finding a hyperplane in an N-dimensional space (N is the number of features) that separates the classes, choosing the one with the largest margin to the nearest data points (the support vectors).
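A short sketch of what a hyperplane classifier looks like once it has been found: points are labeled by which side of w·x + b = 0 they fall on, and for a linear SVM the margin width is 2/||w||. The weights `w` and offset `b` below are hypothetical values, not the result of training.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; zero exactly on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Label +1 on one side of the hyperplane, -1 on the other."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin boundaries w.x + b = +1 and -1."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))


w, b = [3.0, 4.0], -5.0  # hypothetical hyperplane in 2 dimensions
print(classify(w, b, [3.0, 4.0]))  # far side: +1
print(classify(w, b, [0.0, 0.0]))  # origin side: -1
print(margin_width(w))             # 2 / 5 = 0.4
```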
What are the parameters in SVM?
- C: Regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. A larger C gives a narrower margin because misclassifications are penalized more heavily.
- kernel: Specifies the kernel type to be used in the algorithm. It can be ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable.
- degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
- gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
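A sketch of where `degree` and `gamma` enter the usual kernel formulas. The function names are illustrative, and `coef0` (the constant term in the polynomial and sigmoid kernels) is an extra parameter not listed on the card.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def poly_kernel(u, v, degree=3, gamma=1.0, coef0=0.0):
    # (gamma * <u, v> + coef0) ** degree; only this kernel uses degree.
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    # exp(-gamma * ||u - v||^2); larger gamma means a narrower bump.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v))                     # 11.0
print(poly_kernel(u, v, degree=2, coef0=1.0))  # (11 + 1) ** 2 = 144.0
print(rbf_kernel(u, v, gamma=0.5))             # exp(-0.5 * 8)
```

With the RBF kernel, raising gamma makes similarity fall off faster with distance, so each training point influences a smaller region.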
What is Logistic Regression used for?
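Binary classification problems (it can be extended to multiclass problems, for example with a one-vs-rest or multinomial formulation)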