Computational Statistics Flashcards
Validation set method
Split the data into a training set and a validation set, fit the model on the training set, and compute the MSE on the validation set.
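A minimal sketch of the validation set method, using illustrative synthetic data and a simple degree-1 polynomial fit (both are assumptions for the example, not part of the card):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: y = 2x + noise
x = rng.uniform(0, 1, 100)
y = 2 * x + rng.normal(0, 0.1, 100)

# Randomly split into a training half and a validation half
idx = rng.permutation(100)
train, val = idx[:50], idx[50:]

# Fit on the training set only
coef = np.polyfit(x[train], y[train], deg=1)

# Validation MSE on the held-out half
pred = np.polyval(coef, x[val])
mse = np.mean((y[val] - pred) ** 2)
```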
Holdout Method
Perform the validation set method several times and choose the model with the best validation error
Validation error
The prediction error calculated on a held-out set
Validation set disadvantages
- Validation set is unreliable without much data
- Validation error is highly dependent on the initial random split into training and validation samples
LOOCV
Leave one out Cross Validation
Leave one out Cross Validation
- Train the model n times, each time leaving one point out
- Compute each model's test error on its left-out point
- Report the mean error
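The three steps above can be sketched as follows (synthetic data and a degree-1 polynomial fit are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 30)
y = 2 * x + rng.normal(0, 0.1, 30)
n = len(x)

errors = []
for i in range(n):
    mask = np.arange(n) != i                  # leave point i out
    coef = np.polyfit(x[mask], y[mask], deg=1)
    pred = np.polyval(coef, np.array([x[i]]))[0]
    errors.append((y[i] - pred) ** 2)         # test error on the left-out point

loocv_mse = np.mean(errors)                   # report the mean error
```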
Validation set method cost
Cheap
LOOCV cost
Expensive
K-Fold Cross validation
- Divide the data into k folds; each fold serves once as the validation set
- Train the model on each of the k training sets and measure the error on the corresponding validation set
- Report the average MSE
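A compact sketch of K-fold cross-validation; the function name and the polynomial model are illustrative assumptions:

```python
import numpy as np

def k_fold_mse(x, y, k=5, deg=1, seed=0):
    """Average validation MSE over k folds for a polynomial fit."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)            # k disjoint validation sets
    mses = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)       # everything except this fold
        coef = np.polyfit(x[train], y[train], deg)
        pred = np.polyval(coef, x[fold])
        mses.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(mses))               # report the average MSE
```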
Bias of K-Fold validation error
- The validation error of K-Fold is too optimistic (because the model with the best validation error is selected)
Nested K-Fold Validation
Select the model via K-Fold and report the selected model's error on a separate test set
Temporal Data
Be careful not to include data from any point later than what the model should predict
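One common way to respect this rule is an expanding-window split, where each validation block lies strictly after its training window. This sketch (function name and fold sizing are illustrative assumptions) never lets future data leak into training:

```python
import numpy as np

def expanding_window_splits(n, n_splits=3):
    """Yield (train_idx, val_idx) pairs where the validation block
    always comes strictly after the training window."""
    fold = n // (n_splits + 1)
    for k in range(1, n_splits + 1):
        yield np.arange(0, k * fold), np.arange(k * fold, (k + 1) * fold)
```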
Subset selection
Try different subsets of features and select the subset with the best validation error
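An exhaustive (best-subset) version of this idea can be sketched as follows; `best_subset` and the pluggable `eval_mse` validation routine are illustrative assumptions, and exhaustive search is only feasible for small feature counts:

```python
import numpy as np
from itertools import combinations

def best_subset(X, y, eval_mse):
    """Evaluate every non-empty feature subset with a user-supplied
    validation routine and return the one with the lowest error."""
    p = X.shape[1]
    best, best_mse = None, np.inf
    for r in range(1, p + 1):
        for subset in combinations(range(p), r):
            mse = eval_mse(X[:, subset], y)
            if mse < best_mse:
                best, best_mse = subset, mse
    return best, best_mse
```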
Feature
Input variables
Dimensional Reduction
Transform the features into a smaller feature space
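A standard instance of dimensionality reduction is PCA; this minimal sketch (function name is an illustrative assumption) projects centred data onto its top-d principal components via the SVD:

```python
import numpy as np

def pca_reduce(X, d):
    """Project centred data onto its top-d principal components."""
    Xc = X - X.mean(axis=0)                       # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                          # d-dimensional representation
```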
Regularization
Add a penalty term for large coefficients
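Ridge regression is the classic example of such a penalty; this sketch (function name is an illustrative assumption) uses the closed-form solution with an L2 penalty that shrinks large coefficients toward zero:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Least squares plus an L2 penalty lam * ||w||^2:
    solve (X^T X + lam * I) w = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

Larger `lam` punishes large coefficients more, so the fitted weight vector shrinks as `lam` grows.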
Target variable
Y variable the model should predict
(x1, y1), (x2, y2), …, (xn, yn)
data points
x1, x2, … ,xn
feature vector
y1, y2, …, yn
target values: the outputs the model should predict
Hyperplane
In a p-dimensional space, a hyperplane is a flat affine subspace of dimension p-1
Separating Hyperplanes
- A hyperplane is used to divide the feature space into two sides (one for each class)
- Predict a new point's class depending on which side of the hyperplane it lies
classifier margin
The width by which the separating hyperplane's slab could be widened without hitting a data point
Maximum margin classifier
The separating hyperplane with the largest possible margin
Soft margin classifier
Allow a budget B of total margin violations, trading some misclassifications for a larger margin
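A soft-margin linear classifier can be sketched via subgradient descent on the hinge loss; this is one possible implementation, not the canonical formulation, and the function name and learning-rate schedule are illustrative assumptions. Here a larger `C` punishes margin violations more (i.e., a smaller violation budget):

```python
import numpy as np

def soft_margin_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Linear soft-margin classifier fit by subgradient descent on the
    hinge loss.  Labels y must be in {-1, +1}."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (X[i] @ w + b) < 1:       # point violates the margin
                w += lr * (C * y[i] * X[i] - w / n)
                b += lr * C * y[i]
            else:                                # only shrink w (regularization)
                w -= lr * w / n
    return w, b
```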
Support vectors
Points that are "wrong": they lie inside the margin, on its boundary, or on the wrong side of it