General Flashcards
Stochastic Gradient Descent
Gradient descent algorithm that updates the parameters using a single observation at a time.
More efficient than batch gradient descent, especially with large datasets.
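A minimal sketch of the idea, fitting y = w·x to a tiny hypothetical dataset (learning rate and epoch count are illustrative choices, not prescribed values):

```python
# Stochastic gradient descent for 1-D linear regression:
# the weight is updated after every single observation.
def sgd_fit(xs, ys, lr=0.01, epochs=200):
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            grad = 2 * (w * x - y) * x   # gradient of squared error for ONE point
            w -= lr * grad               # take a step immediately
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # underlying relationship: y = 2x
w = sgd_fit(xs, ys)          # converges near w = 2
```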
Batch Gradient Descent
Gradient descent algorithm that must scan the entire training set before taking a single step.
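For contrast with the stochastic version, a sketch of the batch update on the same kind of toy data (hyperparameters are again illustrative):

```python
# Batch gradient descent: each step averages the gradient
# over the ENTIRE training set before moving the weight.
def batch_gd_fit(xs, ys, lr=0.01, epochs=500):
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # one full scan of the data per update step
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # underlying relationship: y = 2x
w = batch_gd_fit(xs, ys)
```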
Localized Linear Regression
A variant of traditional linear regression that predicts Yi using only (or weighting most heavily) the data points near Xi
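A sketch of the idea using a Gaussian kernel to weight nearby points, then fitting a weighted line at the query point (the kernel, bandwidth `tau`, and data are illustrative assumptions, not a fixed recipe):

```python
import math

def local_linear_predict(x_query, xs, ys, tau=0.8):
    """Fit a line at x_query using distance-based weights (Gaussian kernel)."""
    ws = [math.exp(-(x - x_query) ** 2 / (2 * tau ** 2)) for x in xs]
    sw = sum(ws)
    # weighted means, then weighted least-squares slope and intercept
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    b = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys)) / \
        sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    a = my - b * mx
    return a + b * x_query

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]   # y = x^2: nonlinear overall, roughly linear locally
pred = local_linear_predict(2.0, xs, ys)
```

Because only nearby points carry weight, the local fit tracks a curve that a single global line could not.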
Type I Error (False Positive)
Incorrectly rejecting the null hypothesis in favor of the alternative hypothesis when the null is true.
Same as alpha, set at the beginning of the experiment.
Type II Error (False Negative)
Failing to reject the null hypothesis when it is false
Also known as beta. Note that power is (1 - beta)
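The alpha/beta/power relationship can be made concrete with a one-sided z-test for a mean shift; the effect size, sigma, and sample size below are made-up illustrative numbers:

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# H0: mu = 0 vs H1: mu = 0.5, known sigma = 1, n = 25, alpha = 0.05
n, sigma, mu1 = 25, 1.0, 0.5
z_crit = 1.645                 # approx. 95th percentile of N(0,1), i.e. alpha = 0.05

# beta = P(fail to reject | H1 is true); power = 1 - beta
beta = norm_cdf(z_crit - mu1 * math.sqrt(n) / sigma)
power = 1 - beta               # roughly 0.80 for these numbers
```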
A/B Testing
Process of testing two groups against a desired measure to determine whether there is a statistically significant difference
General Strategy:
- Identify comparative statistic
- Determine sample size
- Analyze results
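The "analyze results" step is often a two-proportion z-test on conversion rates; the counts below are hypothetical:

```python
import math

# Did variant B convert better than A? (conversions out of visitors)
conv_a, n_a = 200, 2000    # A: 10.0% conversion
conv_b, n_b = 250, 2000    # B: 12.5% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
# |z| > 1.96 -> reject H0 at alpha = 0.05 (two-sided)
```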
SVMs : General description
Simple: Machine learning model that uses a hyperplane to differentiate and classify different groups of data
Detailed: SVM identifies an appropriate hyperplane by maximizing the margin between the boundary and the closest points of each class.
If the data cannot be separated linearly, use transformations (e.g. the kernel trick) to map the data into higher dimensions.
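One way to sketch a linear SVM without a library is subgradient descent on the hinge loss (the dataset, learning rate, and regularization strength `lam` are all illustrative assumptions):

```python
# Linear SVM trained by subgradient descent on the hinge loss.
X = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0),        # class +1
     (-1.0, -1.0), (-2.0, -2.0), (-3.0, -1.0)]  # class -1
y = [1, 1, 1, -1, -1, -1]

w = [0.0, 0.0]
b = 0.0
lam, lr = 0.01, 0.1
for epoch in range(100):
    for (x1, x2), yi in zip(X, y):
        margin = yi * (w[0] * x1 + w[1] * x2 + b)
        if margin < 1:   # inside the margin: hinge-loss subgradient step
            w[0] += lr * (yi * x1 - lam * w[0])
            w[1] += lr * (yi * x2 - lam * w[1])
            b += lr * yi
        else:            # correct with margin: only shrink (regularize) w
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
```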
SVMs: Soft Margin Classification
A mechanism that serves to reduce the overfitting of maximum margin classification by penalizing misclassifications
Bias / Variance Tradeoff
A tradeoff in machine learning models where you have the choice of reducing bias (how well a model fits a specific set of data) vs. reducing variance (how much performance of a model varies across many datasets).
Precision
TP / (TP + FP)
Measures the accuracy of positive predictions (but not necessarily identifying all of them).
Recall
TP / (TP + FN)
Measures completeness of positive predictions
F1 Score
2 / ((1/Precision) + (1/Recall))
Harmonic mean of precision and recall, ranging between 0 and 100%
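The three formulas above in code, applied to hypothetical prediction counts:

```python
# Precision, recall, and F1 from hypothetical counts.
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)                 # 8/10 = 0.8
recall = tp / (tp + fn)                    # 8/12, roughly 0.667
f1 = 2 / ((1 / precision) + (1 / recall))  # harmonic mean of the two
```

Note the harmonic mean sits closer to the smaller of the two values, so a model cannot score a high F1 by excelling at only one of precision or recall.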
ROC Curve
Plots true positive rate (recall) against false positive rate. A good ROC curve goes toward the top left of the chart.
X = FPR
Y = TPR
False Positive Rate
(1 - Specificity)
Proportion of negative instances that are incorrectly classified as positive (i.e. false positive)
FP / (FP + TN)
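A ROC curve is just (FPR, TPR) pairs traced out as the decision threshold sweeps from high to low; the scores and labels here are hypothetical:

```python
# Tracing ROC points by sweeping the decision threshold (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,    1,   0,   0]

def roc_point(threshold):
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    tn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 0)
    return fp / (fp + tn), tp / (tp + fn)   # (FPR, TPR)

# High threshold -> (0, 0); low threshold -> (1, 1); good models
# climb toward the top left in between.
curve = [roc_point(t) for t in (1.0, 0.75, 0.5, 0.0)]
```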
Lasso Regression
Linear regression with an L1 penalty on the coefficients. Shrinks coefficients and can drive some exactly to zero, effectively performing feature selection.
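The operation that makes lasso drive coefficients exactly to zero is soft-thresholding, used inside coordinate-descent solvers; a sketch (`rho` stands for the unpenalized coordinate update, `lam` for the L1 penalty strength):

```python
# Soft-thresholding: the core step of coordinate-descent lasso.
# Coefficients whose unpenalized value falls within [-lam, lam]
# are set exactly to zero, which is what prunes features.
def soft_threshold(rho, lam):
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0
```

For example, with `lam = 0.5` a weak coefficient update of 0.3 is zeroed out, while a strong one of 2.0 is merely shrunk to 1.5.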