General Flashcards
Stochastic Gradient Descent
Gradient descent algorithm that updates the parameters using a single observation at a time.
More efficient than batch gradient descent, especially with large datasets.
Batch Gradient Descent
Gradient descent algorithm that must scan the entire training set before taking a single step.
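The difference between the two updates can be sketched as follows; the toy data (true weight 3.0), learning rate, and epoch count are illustrative assumptions:

```python
import random

random.seed(0)  # reproducibility of the shuffles

# Toy data for y = 3.0 * x (noiseless, for illustration)
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0, 2.5]]
lr = 0.05  # learning rate

def batch_step(w):
    # Batch GD: average the gradient over the ENTIRE training set per step
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def sgd_epoch(w):
    # SGD: update the parameter after EACH single observation
    random.shuffle(data)
    for x, y in data:
        w -= lr * 2 * (w * x - y) * x
    return w

w_batch = w_sgd = 0.0
for _ in range(200):
    w_batch = batch_step(w_batch)
    w_sgd = sgd_epoch(w_sgd)
# Both converge to w = 3.0; SGD takes 5x more (cheaper) steps per pass
```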
Localized Linear Regression
A variant of traditional linear regression (also called locally weighted linear regression) that fits each prediction using only the data points local to x_i, weighting nearby points most heavily, to predict y_i.
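A minimal 1-D sketch of the idea, using a Gaussian kernel to weight nearby points; the bandwidth tau and the toy data (y = x^2, which is locally near-linear) are illustrative assumptions:

```python
import math

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]  # y = x^2

def predict(x0, tau=0.5):
    # Gaussian kernel: points near x0 get weight ~1, far points ~0
    w = [math.exp(-(x - x0) ** 2 / (2 * tau ** 2)) for x in xs]
    sw = sum(w)
    # Weighted least-squares fit of a line around x0 (closed form in 1-D)
    xbar = sum(wi * xi for wi, xi in zip(w, xs)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    num = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, xs, ys))
    den = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, xs))
    slope = num / den
    return ybar + slope * (x0 - xbar)
```

A new weighted fit is computed for every query point, which is why this is more expensive at prediction time than ordinary linear regression.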
Type I Error (False Positive)
Incorrectly rejecting the null hypothesis in favor of the alternative hypothesis when the null is true.
Same as alpha, set at the beginning of the experiment.
Type II Error (False Negative)
Failing to reject the null hypothesis when it is false
Also known as beta. Note that power is (1 - beta)
A/B Testing
A/B testing, also known as split or bucket testing, is a user experience research method that compares two or more versions of content to determine which one performs best
A/B testing involves randomly assigning visitors to see either a control (A) version or a variant (B) version of a page or content. The performance of each version is then measured based on key metrics, such as the number of conversions or visitors who took the desired action.
SVMs : General description
Simple: Machine learning model that uses a hyperplane to differentiate and classify different groups of data
Detailed: SVM identifies an appropriate hyperplane by maximizing the margin between the boundary and the closest points of each class (the support vectors).
If the data cannot be separated linearly, use transformations (e.g. the kernel trick) to map the data into a higher-dimensional space.
SVMs: Soft Margin Classification
A mechanism that serves to reduce the overfitting of maximum margin classification by penalizing misclassifications
Bias / Variance Tradeoff
A tradeoff in machine learning models where you have the choice of reducing bias (systematic error from the model being too simple to fit the data) vs. reducing variance (how much the model's predictions vary across different training sets).
Precision
TP / (TP + FP)
Measures the accuracy of positive predictions (but not necessarily identifying all of them).
Recall
TP / (TP + FN)
Measures completeness of positive predictions
F1 Score
2 / ((1/Precision) + (1/Recall))
Harmonic mean of precision and recall, ranging between 0 and 100%
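The three metrics above can be computed from raw counts; the counts below are illustrative assumptions:

```python
# Counts from a hypothetical classifier's predictions
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)             # 8 / 10 = 0.8
recall = tp / (tp + fn)                # 8 / 12 ≈ 0.667
f1 = 2 / (1 / precision + 1 / recall)  # harmonic mean of the two
```

Note the harmonic mean is dragged down by whichever of precision or recall is lower, so a high F1 requires both to be high.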
ROC Curve
Plots true positive rate (recall) against false positive rate. A good ROC curve goes toward the top left of the chart.
X = FPR
Y = TPR
False Positive Rate
(1 - Specificity)
Proportion of negative instances that are incorrectly classified as positive (i.e. false positive)
FP / (FP + TN)
Lasso Regression
Linear regression with an L1 penalty on the coefficient magnitudes; can shrink some coefficients exactly to zero, effectively performing feature selection.
Elastic Net
Linear regression that combines the L1 (lasso) and L2 (ridge) penalties, with a mix ratio controlling the balance between them.
Early Stopping
A way of regularizing a model by stopping training once validation error reaches a minimum
Softmax Regression
Also known as multinomial logistic regression.
Classification with multiple classes. For each instance x, assigns a score s_k(x) for each class k, then estimates probabilities by applying the softmax function.
The softmax function is as follows:
p_k = exp(s_k(x)) / Σ_j exp(s_j(x))
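The softmax function can be sketched directly from the formula; subtracting the maximum score before exponentiating is a standard numerical-stability trick:

```python
import math

def softmax(scores):
    # Shift by max(scores) to avoid overflow; result is unchanged
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]  # probabilities summing to 1
```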
Cross-entropy
Loss function used to measure difference between predicted and true probability distributions. Penalizes low probability on true labels significantly.
-1/m ∑i ∑k y_k^(i) log(p_k^(i))
Essentially the mean of -log(estimated probability of the true class).
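A minimal sketch of mean cross-entropy over m instances; the one-hot labels y and predicted probabilities p are illustrative assumptions:

```python
import math

def cross_entropy(y_true, y_pred):
    # y_true: one-hot rows, y_pred: predicted probability rows
    m = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, y_pred):
        # Only the true class (y = 1) contributes: -log(p_true)
        total += sum(-y * math.log(p) for y, p in zip(y_row, p_row) if y)
    return total / m
```

Because -log(p) blows up as p approaches 0, assigning low probability to a true label is penalized heavily.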
Accuracy
Number of correctly classified instances / number of all classified instances
True Positive Rate
(Sensitivity)
Proportion of positive instances that are correctly classified as positive (i.e. true positive)
TP / (TP + FN)
Specificity
True Negative Rate
Proportion of negative instances that are correctly classified as negative (i.e. true negative)
TN / (TN + FP)
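The rates above all come from the same confusion matrix; the counts below are illustrative assumptions:

```python
# Hypothetical confusion-matrix counts
tp, fn, fp, tn = 40, 10, 5, 45

tpr = tp / (tp + fn)                         # sensitivity / recall = 0.8
specificity = tn / (tn + fp)                 # true negative rate = 0.9
fpr = fp / (fp + tn)                         # = 1 - specificity = 0.1
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 85 / 100 = 0.85
```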
Gradient Descent
An algorithm that minimizes a particular function (in ML the loss function) by taking small steps in the direction of the steepest descent for that function.
Step 1: Take the derivative of the loss function for each parameter (i.e. take the gradient of the loss function).
Step 2: Initialize parameters with random values
Step 3: Plug parameters into the partial derivatives (gradient)
Step 4: Calculate step sizes (Calculated slope from step 3 * learning rate)
Step 5: Calculate the new parameters (New = Old - Step Size)
Step 6: Repeat 4-5 until convergence
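The six steps above can be sketched for fitting y = b + w*x with squared loss; the toy data, learning rate, and fixed iteration count (in place of a convergence test) are illustrative assumptions:

```python
# Toy data with true intercept b = 3, slope w = 2
data = [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0)]
b, w = 0.0, 0.0   # Step 2: initialize the parameters
lr = 0.05         # learning rate

for _ in range(5000):  # Step 6: repeat until (effectively) converged
    # Steps 1 + 3: partial derivatives of MSE, evaluated at current params
    db = sum(2 * (b + w * x - y) for x, y in data) / len(data)
    dw = sum(2 * (b + w * x - y) * x for x, y in data) / len(data)
    # Step 4: step size = slope * learning rate
    step_b, step_w = lr * db, lr * dw
    # Step 5: new parameter = old parameter - step size
    b, w = b - step_b, w - step_w
```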
Steps for K-fold Cross Validation
Step 1: Shuffle data into equally sized blocks (folds)
Step 2: For each fold i, train the model on all data except fold i, then evaluate the validation error on the held-out fold i.
Step 3: Average the validation errors from step 2 to get estimate of the true error.
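The three steps can be sketched as follows; the "model" here is just predicting the training mean, an illustrative stand-in for a real learner:

```python
import random

random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(20)]
k = 5

random.shuffle(data)                    # Step 1: shuffle...
folds = [data[i::k] for i in range(k)]  # ...and split into k equal folds

errors = []
for i in range(k):                      # Step 2: hold out fold i
    train = [x for j, f in enumerate(folds) for x in f if j != i]
    test = folds[i]
    pred = sum(train) / len(train)      # "train" the mean-predictor model
    errors.append(sum((x - pred) ** 2 for x in test) / len(test))

cv_error = sum(errors) / k              # Step 3: average the fold errors
```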
Bootstrapping
Drawing observations from a large data sample repeatedly (sampling with replacement) and then estimating some quantity of a population by averaging estimates from multiple smaller samples.
Useful for small data sets and helping to deal with class imbalance.
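A minimal sketch of bootstrapping the mean; the sample size and number of resamples are illustrative assumptions:

```python
import random

random.seed(0)
sample = [random.gauss(50.0, 5.0) for _ in range(30)]

boot_means = []
for _ in range(1000):
    # Resample WITH replacement, same size as the original sample
    resample = [random.choice(sample) for _ in sample]
    boot_means.append(sum(resample) / len(resample))

# Bootstrap estimate of the population mean
estimate = sum(boot_means) / len(boot_means)
```

The spread of boot_means also gives a cheap estimate of the statistic's standard error without distributional assumptions.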
Hyperparameter tuning:
Grid search
Forming a grid from the Cartesian product of all candidate parameter values, then sequentially trying every combination and keeping the one that yields the best results.
Hyperparameter tuning:
Random Search
Randomly sample from the joint distribution of all parameters.
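Both strategies can be sketched side by side; the score function (peaking at lr=0.1, depth=5) and candidate ranges are illustrative assumptions:

```python
import itertools
import random

# Hypothetical validation score to maximize (best at lr=0.1, depth=5)
def score(lr, depth):
    return -((lr - 0.1) ** 2) - 0.01 * (depth - 5) ** 2

lrs = [0.01, 0.1, 1.0]
depths = [3, 5, 7]

# Grid search: try the full Cartesian product of candidate values
grid_best = max(itertools.product(lrs, depths), key=lambda p: score(*p))

# Random search: sample each parameter independently
random.seed(0)
trials = [(random.uniform(0.01, 1.0), random.randint(3, 7))
          for _ in range(20)]
rand_best = max(trials, key=lambda p: score(*p))
```

Random search often wins when only a few parameters matter, since it does not waste trials repeating the same value of an unimportant parameter.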
ROC Curve
AUC
Plots the true positive rate (y) against the false positive rate (x) for various thresholds.
Area under the curve (AUC) measures how well the classifier separates classes.
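Tracing the curve and computing AUC can be sketched by sweeping the decision threshold; the scores and labels are illustrative assumptions:

```python
# Hypothetical classifier scores and true binary labels
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
pos = sum(labels)
neg = len(labels) - pos

# Sweep the threshold over every score to trace (FPR, TPR) points
points = [(0.0, 0.0)]
for t in sorted(set(scores), reverse=True):
    tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
    points.append((fp / neg, tp / pos))

# AUC: area under the piecewise-linear curve (trapezoid rule)
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

An AUC of 1.0 means perfect separation; 0.5 is no better than random guessing.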
Conditional Probability
P(A | B)
P(A ∩ B) / P(B)
Bayes Theorem
P(A|B) = P(B|A) P(A) / P(B)
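A worked example of Bayes' theorem: the probability of disease given a positive test. The sensitivity, specificity, and prevalence numbers are illustrative assumptions:

```python
p_disease = 0.01                    # prevalence, P(disease)
p_pos_given_disease = 0.99          # sensitivity, P(+ | disease)
p_pos_given_healthy = 0.05          # 1 - specificity, P(+ | healthy)

# Total probability of a positive test (law of total probability)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes: P(disease | +) = P(+ | disease) * P(disease) / P(+)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
```

Despite the accurate test, the posterior is only about 1 in 6, because the disease is rare and false positives from the large healthy group dominate.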
Matrix
Dimensions are written as (rows x columns), e.g. a matrix with 2 rows and 3 columns is a (2 x 3) matrix.