machine learning Flashcards

1
Q

what is accuracy?

A

correct predictions / total predictions = (TP + TN) / (TP + TN + FP + FN)

2
Q

what is precision?

A

TP/(TP+FP)

3
Q

what is recall?

A

TP/(TP+FN)
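The three metric formulas from the cards above can be sketched with plain confusion-matrix counting; the toy labels below are illustrative:

```python
# Toy binary labels (illustrative, not from the cards).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)   # correct predictions / total
precision = tp / (tp + fp)           # TP / (TP + FP)
recall = tp / (tp + fn)              # TP / (TP + FN)
```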

4
Q

what is ROC AUC?

A

ROC curve: receiver operating characteristic curve, a plot of the true positive rate (y-axis) against the false positive rate (x-axis) across classification thresholds; it helps decide the best threshold.
AUC: area under the ROC curve; higher is better, and it helps compare which classifier is better.
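AUC also has an equivalent pairwise reading: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. A minimal pure-Python sketch with toy scores (the function name is illustrative):

```python
def auc(y_true, scores):
    """Probability that a random positive scores above a random
    negative (ties count half); equals the area under the ROC curve."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```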

5
Q

what is entropy?

A

H = sum_x p(x) * log(1/p(x)) = -sum_x p(x) * log(p(x))
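A minimal sketch of the entropy formula, using log base 2 so the result is in bits (the function name is illustrative):

```python
import math

def entropy(probs):
    # H = sum p * log2(1/p), skipping zero-probability outcomes
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit (a fair coin)
print(entropy([1.0]))       # 0.0 (a certain outcome)
```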

6
Q

What does linear regression solve?

A

Used for regression problems

7
Q

How does linear regression work?

A

It works by fitting a linear equation to the observed data. The steps to perform linear regression are:
- First, the sum of squared residuals (the vertical distances between the data points and the line) is calculated.
- Then, this sum is minimized to find the best-fit line (ordinary least squares).
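The steps above can be sketched with the one-variable closed form for ordinary least squares; the data and names below are illustrative:

```python
# Minimizing the sum of squared residuals for y = a*x + b
# has this well-known closed form in one variable.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]        # exactly y = 2x + 1
a, b = fit_line(xs, ys)  # a ≈ 2.0, b ≈ 1.0
```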

8
Q

what are the parameters of linear regression?

A
  • fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations.
  • normalize: This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm.
9
Q

What does Naive Bayes solve?

A

classification problems

10
Q

How does Naive Bayes work?

A

It works based on Bayes’ theorem with the assumption of independence between every pair of features. Naive Bayes classifiers work well in many real-world situations such as document classification and spam filtering.
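A minimal Gaussian Naive Bayes sketch, assuming continuous features with a per-class normal likelihood; this hand-rolled version only illustrates the idea and is not sklearn's GaussianNB:

```python
import math
from collections import defaultdict

def fit_gnb(X, y):
    """Per-class mean/variance for each feature, plus class priors."""
    stats, priors = {}, {}
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    for c, rows in by_class.items():
        n = len(rows)
        priors[c] = n / len(X)
        means = [sum(col) / n for col in zip(*rows)]
        vars_ = [sum((v - m) ** 2 for v in col) / n + 1e-9
                 for col, m in zip(zip(*rows), means)]
        stats[c] = (means, vars_)
    return stats, priors

def predict_gnb(stats, priors, x):
    """argmax over classes of log P(c) + sum_j log N(x_j; mean, var),
    using the 'naive' independence assumption across features."""
    def log_post(c):
        means, vars_ = stats[c]
        return math.log(priors[c]) + sum(
            -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
            for xj, m, v in zip(x, means, vars_))
    return max(stats, key=log_post)

X = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
y = [0, 0, 0, 1, 1, 1]
stats, priors = fit_gnb(X, y)
print(predict_gnb(stats, priors, [1.1]))  # 0
print(predict_gnb(stats, priors, [4.9]))  # 1
```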

11
Q

What are the parameters of NB?

A
  • priors: Prior probabilities of the classes. If specified the priors are not adjusted according to the data.
  • var_smoothing: Portion of the largest variance of all features that is added to variances for calculation stability.
12
Q

What does SVM solve?

A

regression and classification problems

13
Q

How does SVM work?

A

It works by finding a hyperplane in an N-dimensional space (N is the number of features) that distinctly classifies the data points.
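One way to sketch the idea is sub-gradient descent on the L2-regularized hinge loss for a linear SVM; real solvers (e.g. libsvm behind sklearn's SVC) are far more sophisticated, so this is only an illustration on toy separable data, with illustrative names:

```python
def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=500):
    """Sub-gradient descent on lam/2*||w||^2 + hinge loss; y in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:   # inside the margin: hinge term is active
                w = [wj - lr * (lam * wj - yi * xj)
                     for wj, xj in zip(w, xi)]
                b += lr * yi
            else:            # outside the margin: only shrink w
                w = [wj - lr * lam * wj for wj in w]
    return w, b

X = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1
         for xi in X]
```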

14
Q

What are the parameters in SVM?

A
  • C: Regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. A larger C gives a narrower margin, as it penalises wrong classifications more.
  • kernel: Specifies the kernel type to be used in the algorithm. It could be ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable.
  • degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
  • gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
15
Q

What is Logistic Regression used for?

A

binary classification problems

16
Q

How does Logistic regression work?

A

It works by using a logistic function to model a binary dependent variable.
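A minimal sketch of that mechanism: gradient descent on the log-loss, with the logistic function mapping a linear score to a probability (toy 1-D data, illustrative names):

```python
import math

def train_logreg(X, y, lr=0.1, epochs=1000):
    """Gradient descent on the negative log-likelihood; y in {0, 1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1 / (1 + math.exp(-z))  # the logistic function
            err = p - yi                # gradient factor of the log-loss
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

X = [[0.0], [1.0], [4.0], [5.0]]
y = [0, 0, 1, 1]
w, b = train_logreg(X, y)
probs = [1 / (1 + math.exp(-(w[0] * xi[0] + b))) for xi in X]
preds = [1 if p > 0.5 else 0 for p in probs]
```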

17
Q

What are the parameters of Logistic Regression?

A
  • penalty: Used to specify the norm used in the penalization. The ‘newton-cg’, ‘sag’ and ‘lbfgs’ solvers support only l2 penalties. ‘elasticnet’ is only supported by the ‘saga’ solver.
    - C: Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.
    - fit_intercept: Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
18
Q

What is K-Means used for?

A

clustering problems

19
Q

How does K-Means work?

A

It works by partitioning n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster.
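Those two alternating steps (assign each point to the nearest mean, then recompute the means) can be sketched in 1-D; this is plain Lloyd's algorithm on toy data, not sklearn's KMeans:

```python
def kmeans_1d(points, centers, iters=10):
    """Lloyd's algorithm in 1-D: assign to nearest center, recompute means."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

pts = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
centers = sorted(kmeans_1d(pts, [0.0, 5.0]))  # ≈ [1.0, 10.0]
```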

20
Q

What are the parameters of K-Means?

A
  • n_clusters: The number of clusters to form as well as the number of centroids to generate.
    - init: Method for initialization, defaults to ‘k-means++’.
    - n_init: Number of time the k-means algorithm will be run with different centroid seeds.
21
Q

What is DBSCAN used for?

A

clustering problems

22
Q

What is the mechanism of DBSCAN?

A

It works by defining a cluster as a maximal set of density-connected points. It discovers clusters of arbitrary shape in spatial databases with noise.
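A minimal 1-D sketch of that mechanism (core points, neighborhood expansion, noise labelled -1); the function name and toy data are illustrative:

```python
def dbscan_1d(points, eps, min_samples):
    """Labels each point with a cluster id, or -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if abs(points[j] - points[i]) <= eps]
        if len(neighbors) < min_samples:
            labels[i] = -1            # noise (may be claimed later)
            continue
        cluster += 1                  # i is a core point: new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = [k for k in range(len(points))
                    if abs(points[k] - points[j]) <= eps]
            if len(nbrs) >= min_samples:   # j is itself core: keep expanding
                queue.extend(nbrs)
    return labels

pts = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 9.9]
print(dbscan_1d(pts, eps=0.3, min_samples=2))  # [0, 0, 0, 1, 1, 1, -1]
```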

23
Q

What are the parameters of DBSCAN?

A
  • eps: The maximum distance between two samples for one to be considered as in the neighborhood of the other.
    - min_samples: The number of samples (or total weight) in a neighborhood for a point to be considered as a core point.
24
Q

What are bagging and boosting used for?

A

both regression and classification problems

25
Q

What is the mechanism of bagging and boosting?

A

Bagging works by creating subsets of the original dataset, fitting a model to each subset, and then combining the predictions. Boosting works by fitting a model to the data, then fitting additional models to the residuals of the initial model, and then combining the predictions.
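Only the bagging half is easy to sketch briefly: bootstrap resampling plus averaging of predictions, here with a deliberately tiny base model (a least-squares line through the origin); all names and data are illustrative:

```python
import random

def fit_slope(pairs):
    """Tiny base model: least-squares line through the origin."""
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs)
    return num / den

def bagged_predict(pairs, x_new, n_models=50, seed=0):
    """Bagging: fit the base model on bootstrap resamples, then average."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        sample = [rng.choice(pairs) for _ in pairs]  # one bootstrap subset
        preds.append(fit_slope(sample) * x_new)
    return sum(preds) / n_models

data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]  # roughly y = 2x
print(bagged_predict(data, 5))                   # close to 10
```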

26
Q

What is random forest used for?

A

Used for both regression and classification problems.

27
Q

How does Random Forest work?

A

It works by creating a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees.
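The aggregation step described above, sketched in isolation (the per-tree outputs below are made-up values, not from a trained forest):

```python
from collections import Counter

# Hypothetical per-tree outputs for one test sample.
tree_votes = ["cat", "dog", "cat", "cat", "dog"]     # classification
tree_preds = [3.1, 2.9, 3.3, 3.0, 2.7]               # regression

majority = Counter(tree_votes).most_common(1)[0][0]  # mode of the classes
average = sum(tree_preds) / len(tree_preds)          # mean prediction
```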

28
Q

What are the parameters of random forest?

A
  • n_estimators: The number of trees in the forest.
  • max_features: The number of features to consider when looking for the best split.
  • max_depth: The maximum depth of the tree.
29
Q

What are decision trees used for?

A

Regression and classification problems

30
Q

How do decision trees work?

A

Decision trees split the data into multiple sets. These splits are based on certain conditions, which is why decision trees are also known as Classification and Regression Trees (CART). This process continues on each derived subset in a recursive manner called recursive partitioning.
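The condition-based splitting can be sketched for a single feature, using Gini impurity to score candidate thresholds (toy data, illustrative names):

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Try each threshold; pick the one minimizing the size-weighted
    Gini impurity of the two resulting subsets."""
    best = (float("inf"), None)
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        best = min(best, (score, t))
    return best[1]

xs = [1, 2, 3, 10, 11, 12]
ys = ["a", "a", "a", "b", "b", "b"]
print(best_split(xs, ys))  # 3 (separates the classes perfectly)
```

In a full tree this split is applied recursively to each subset (recursive partitioning) until a stopping condition is met.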

31
Q

What are the parameters of decision trees?

A

max_depth: The maximum depth of the tree. This parameter controls over-fitting: too high a value can lead to overfitting and too low a value to underfitting.
min_samples_split: The minimum number of samples required to split an internal node. This parameter prevents overfitting. Higher values prevent a model from learning relations which might be highly specific to the particular sample selected for a tree.
min_samples_leaf: The minimum number of samples required to be at a leaf node. This parameter also prevents overfitting similar to min_samples_split.

32
Q

What is Support vector regression (SVR) used for?

A

regression problems.

33
Q

How does SVR work?

A

In SVR, we identify a hyperplane with maximum margin such that the maximum number of data points are within that margin. SVR tries to minimize the error within a certain threshold.
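The "error within a certain threshold" idea is the epsilon-insensitive loss: deviations smaller than eps cost nothing, larger ones are penalized linearly. A minimal sketch (names and toy values are illustrative):

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    """SVR ignores errors smaller than eps; larger errors are
    penalized linearly beyond the eps-tube around the hyperplane."""
    return sum(max(0.0, abs(t - p) - eps)
               for t, p in zip(y_true, y_pred))

print(eps_insensitive_loss([1.0, 2.0, 3.0], [1.2, 2.0, 4.0]))  # 0.5
```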

34
Q

What are the parameters of SVR?

A

C: Regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive.
kernel: Specifies the kernel type to be used in the algorithm. It could be ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable.
degree: Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma: Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
