General Flashcards

1
Q

Stochastic Gradient Descent

A

Gradient descent algorithm that updates the parameters using a single observation at a time.

More efficient than batch gradient descent, especially with large datasets.
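
A minimal sketch of one SGD pass for linear regression with squared error; the synthetic data and the learning rate lr are illustrative assumptions, not a canonical implementation:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 observations, 3 features
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

w = np.zeros(3)
lr = 0.01                                     # learning rate (assumed)
for i in rng.permutation(len(X)):             # one update per observation
    grad = (X[i] @ w - y[i]) * X[i]           # squared-error gradient at point i
    w -= lr * grad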

2
Q

Batch Gradient Descent

A

Gradient descent algorithm that must scan through the entire training set before taking a single step.
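
For contrast with the previous card, a minimal sketch of a batch gradient descent loop; each update averages the gradient over every observation (setup and step size are illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

w = np.zeros(3)
lr = 0.1
for _ in range(500):                          # each step scans the full set
    grad = X.T @ (X @ w - y) / len(X)         # mean gradient over all observations
    w -= lr * grad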

3
Q

Localized Linear Regression

A

A variant of traditional linear regression that uses only local data points around x_i, weighted by distance, to predict y_i; also known as locally weighted linear regression.
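
A minimal numpy sketch, assuming a Gaussian weighting kernel; the function name predict_local, the bandwidth tau, and the synthetic data are illustrative:

import numpy as np

def predict_local(X, y, x_query, tau=0.5):
    # Weight each training point by its closeness to the query point
    w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2 * tau ** 2))
    W = np.diag(w)
    # Solve the weighted least-squares normal equations
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return x_query @ theta

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)
print(predict_local(X, y, np.array([1.0])))   # local fit near x = 1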

4
Q

Type I Error (False Positive)

A

Incorrectly rejecting the null hypothesis in favor of the alternative hypothesis when the null is true.

Its probability is alpha (the significance level), set at the beginning of the experiment.

5
Q

Type II Error (False Negative)

A

Failing to reject the null hypothesis when it is false.

Its probability is beta; note that power is (1 - beta).

6
Q

A/B Testing

A

Process of testing two groups against a desired measure to determine whether there is a statistically significant difference.

General Strategy:

  1. Identify comparative statistic
  2. Determine sample size
  3. Analyze results with a significance test (see the sketch below)
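
A minimal sketch of the analysis step, assuming a per-user metric compared with a two-sample t-test; the simulated groups and alpha are illustrative:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(loc=10.0, scale=2.0, size=1000)     # group A metric
treatment = rng.normal(loc=10.3, scale=2.0, size=1000)   # group B metric

t_stat, p_value = stats.ttest_ind(control, treatment)
alpha = 0.05                        # Type I error rate, set before the test
print(p_value < alpha)              # True -> statistically significant difference
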
7
Q

SVMs : General description

A

Simple: Machine learning model that uses a hyperplane to differentiate and classify different groups of data

Detailed: SVM identifies an appropriate hyperplane by maximizing the margin between the decision boundary and the closest points of each class (the support vectors).

If the data cannot be separated linearly, use kernel transformations to map it into higher dimensions.
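
A minimal scikit-learn sketch; the RBF kernel implicitly maps the data into a higher-dimensional space, and the dataset and hyperparameters are illustrative:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)  # not linearly separable

clf = SVC(kernel="rbf", C=1.0, gamma="scale")   # kernelized max-margin classifier
clf.fit(X, y)
print(clf.support_vectors_.shape)               # the points that pin down the margin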

8
Q

SVMs: Soft Margin Classification

A

A mechanism that reduces the overfitting of maximum margin classification by allowing, but penalizing, misclassifications
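
In scikit-learn's SVC this penalty is the C parameter: a small C tolerates more margin violations (softer margin), a large C penalizes them heavily. A hedged sketch with illustrative values:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=C).fit(X, y)
    # Softer margins (small C) typically keep more support vectors
    print(C, len(clf.support_vectors_))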

9
Q

Bias / Variance Tradeoff

A

A tradeoff in machine learning models between reducing bias (how well a model fits a specific set of data) and reducing variance (how much a model's performance varies across many datasets).

10
Q

Precision

A

TP / (TP + FP)

Measures the accuracy of positive predictions (but not necessarily identifying all of them).

11
Q

Recall

A

TP / (TP + FN)

Measures completeness of positive predictions

12
Q

F1 Score

A

2 / ((1/Precision) + (1/Recall))

Harmonic mean of precision and recall, ranging between 0 and 100%
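
A minimal sketch computing precision, recall, and F1 together from raw counts (tying the last three cards together); the confusion-matrix counts are made up for illustration:

tp, fp, fn = 80, 20, 40            # illustrative counts

precision = tp / (tp + fp)         # accuracy of positive predictions
recall = tp / (tp + fn)            # completeness of positive predictions
f1 = 2 / ((1 / precision) + (1 / recall))

print(precision, recall, f1)       # 0.8, ~0.667, ~0.727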

13
Q

ROC Curve

A

Plots true positive rate (recall) against false positive rate. A good ROC curve goes toward the top left of the chart.

X = FPR
Y = TPR
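
A minimal scikit-learn sketch; the labels and scores are illustrative:

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.55])

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # x = FPR, y = TPR
print(roc_auc_score(y_true, y_score))               # area under the ROC curve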

14
Q

Specificity

A

TNR (true negative rate): TN / (TN + FP)

Ratio of negative instances that are correctly classified as negative

15
Q

Lasso Regression

A

Linear regression with an L1 penalty on the coefficients; the penalty can shrink some coefficients exactly to zero, effectively performing feature selection.
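
A minimal scikit-learn sketch; alpha (the L1 penalty strength) and the synthetic data are illustrative:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)  # only 2 features matter

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)    # most coefficients are driven exactly to zero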

16
Q

Elastic Net

A

Linear regression that combines the L1 (Lasso) and L2 (Ridge) penalties; a mix ratio controls the balance between the two.
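
A minimal scikit-learn sketch; l1_ratio sets the L1/L2 mix (1.0 is pure Lasso, 0.0 is pure Ridge) and the values are illustrative:

import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)  # blend of L1 and L2 penalties
print(model.coef_)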

17
Q

Early Stopping

A

A way of regularizing a model by stopping training once validation error reaches a minimum
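
A minimal sketch of the stopping rule with a patience window; train_step and validation_error are hypothetical stand-ins for a real training loop:

import random

def train_step():              # hypothetical: one epoch of training
    pass

def validation_error():        # hypothetical: error on held-out data
    return random.random()

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(1000):
    train_step()
    val = validation_error()
    if val < best_val:
        best_val, bad_epochs = val, 0   # new minimum: reset the counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:      # no improvement for 5 straight epochs
            break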

18
Q

Softmax Regression

A

Also known as multinomial logistic regression.

Classification with multiple classes. For each instance x, assigns a score s_k(x) for each class k, then estimates class probabilities by applying the softmax function.

The softmax function is as follows:

p_k = exp(s_k(x)) / Σ_j exp(s_j(x))
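
A minimal numpy sketch of the softmax function, with the usual max-subtraction for numerical stability:

import numpy as np

def softmax(scores):
    # Subtracting the max leaves the result unchanged but avoids overflow
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # class probabilities summing to 1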

19
Q

Cross-entropy

A

Loss function used to measure difference between predicted and true probability distributions. Penalizes low probability on true labels significantly.

J = -(1/m) Σ_i Σ_k y_k(i) log(p_k(i))

where y_k(i) is 1 if instance i's true class is k, else 0. Essentially the mean of -log(estimated probability of the true class).
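
A minimal numpy sketch for one-hot labels; the probability and label arrays are illustrative:

import numpy as np

# Predicted class probabilities for m = 3 instances over k = 3 classes
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
y = np.eye(3)                     # one-hot true labels: instance i has class i

loss = -np.mean(np.sum(y * np.log(p), axis=1))   # mean -log(prob of true class)
print(loss)                       # ~0.364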