Terms Flashcards
Supervised Learning
Algorithms are trained using well-labeled training data, i.e., each input example is paired with a known output
Methods of solving linear regression
Singular Value Decomposition and QR Decomposition
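A minimal sketch of both approaches with numpy, assuming a hypothetical design matrix X and target vector y:
import numpy as np
X, y = np.random.rand(100, 3), np.random.rand(100)  # made-up data
# SVD-based least squares (np.linalg.lstsq uses an SVD internally)
w_svd, *_ = np.linalg.lstsq(X, y, rcond=None)
# QR decomposition: factor X = QR, then solve R w = Q^T y
Q, R = np.linalg.qr(X)
w_qr = np.linalg.solve(R, Q.T @ y)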
Difference between stochastic gradient descent and batch gradient descent
In batch gradient descent, all samples in the training set are used to compute the loss and gradient for each update.
In stochastic gradient descent, only one training sample (or a small mini-batch) is used for each update.
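A minimal sketch of one update step of each, assuming a squared-error loss on made-up data (X, y, w, lr are hypothetical names):
import numpy as np
X, y = np.random.rand(100, 3), np.random.rand(100)
w, lr = np.zeros(3), 0.01
# Batch gradient descent: gradient computed from ALL samples
grad = 2 * X.T @ (X @ w - y) / len(y)
w = w - lr * grad
# Stochastic gradient descent: gradient from ONE randomly chosen sample
i = np.random.randint(len(y))
grad_i = 2 * X[i] * (X[i] @ w - y[i])
w = w - lr * grad_i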
Mean Square Error (MSE) for evaluating regression models
The average of the squared differences between predicted and true values; measures how close a regression line is to a set of data points
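For example, with numpy (y_true and y_pred are hypothetical arrays):
import numpy as np
y_true, y_pred = np.array([3.0, 2.0, 4.0]), np.array([2.5, 2.0, 5.0])
mse = np.mean((y_true - y_pred) ** 2)  # average squared error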
Root Mean Squared Error for evaluating regression models
The square root of the MSE; shows how far predictions fall from the measured true values, in the same units as the target
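A small sketch on the same hypothetical arrays:
import numpy as np
y_true, y_pred = np.array([3.0, 2.0, 4.0]), np.array([2.5, 2.0, 5.0])
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # same units as the target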
Bias
Error introduced by approximating the true underlying function, which can be quite complex, by a simpler model
Low Bias
Fewer assumptions are made about the form of the target function, so the model can closely match the training dataset
High Bias
More assumptions are made about the form of the target function, so the model does not match the training dataset closely and underfitting occurs
Ways to reduce high bias
- Use a more complex model: the current model is too simple
- Increase the number of features: gives the model more information with which to capture the underlying pattern
- Reduce regularization of the model: regularization prevents overfitting, which is not the problem here, since high bias causes underfitting (see the sketch after this list)
- Increase the size of the training data: provides the model with more examples to learn from
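A minimal sketch of two of these fixes with scikit-learn; the data, polynomial degree, and alpha values are purely illustrative:
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
X, y = np.random.rand(100, 2), np.random.rand(100)     # made-up data
X_poly = PolynomialFeatures(degree=2).fit_transform(X)  # add features / complexity
model = Ridge(alpha=0.01)                                # weaker regularization than, say, alpha=10
model.fit(X_poly, y)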
Mean Absolute Error (MAE) for evaluating regression models
Measures the average size of the errors in a collection of predictions without taking their direction into account
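For example, with numpy on hypothetical arrays:
import numpy as np
y_true, y_pred = np.array([3.0, 2.0, 4.0]), np.array([2.5, 2.0, 5.0])
mae = np.mean(np.abs(y_true - y_pred))  # average absolute error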
Coefficient of determination (R^2)
The proportion of variance in the outcome explained by the model; measures how well a statistical model predicts an outcome (from 0 to 1)
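A small sketch of the calculation on hypothetical arrays:
import numpy as np
y_true, y_pred = np.array([3.0, 2.0, 4.0]), np.array([2.5, 2.0, 5.0])
ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot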
Variance
Tells us how much a random variable differs from its expected value; for a model, it shows how much the model's predictions and performance change when it is trained on different subsets of the training data
What is overfitting?
Increased model complexity, giving low bias and high variance.
The model does well on the training set but cannot generalize to the test set
Underfitting
A simpler model, giving high bias and low variance; it fails to capture the underlying pattern and performs poorly even on the training data
Role of training set
Used to fit the model: train the model with data
Role of validation set
Provides an unbiased evaluation of the model while fine-tuning hyperparameters.
Improves generalization of the model.
Role of test set
Data the model has never seen before.
Allows for an unbiased evaluation of the model.
Cross validation
Separate your total training set into subsets: a training set and a validation set. Evaluate the model and choose hyperparameters.
Do this iteratively, selecting different training and validation splits each time, to reduce the bias that would come from using only one validation set
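A minimal sketch of this hyperparameter search with scikit-learn's cross_val_score; the model, data, and alpha grid are illustrative:
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
X, y = np.random.rand(100, 3), np.random.rand(100)   # made-up data
for alpha in (0.1, 1.0, 10.0):                        # candidate hyperparameter values
    scores = cross_val_score(Ridge(alpha=alpha), X, y, cv=5)
    print(alpha, scores.mean())                       # pick the alpha with the best average score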
K-fold cross validation and how big should k be
Cross-validation method in which the dataset is divided into k parts (folds); each fold is used once as the validation set while the remaining k-1 folds form the training set.
A common choice is k = 5 or k = 10: larger k suits small datasets (more data left for training in each fold), while smaller k keeps the cost manageable on large datasets
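A minimal sketch of the fold indices with scikit-learn's KFold (k = 5 here, data made up):
import numpy as np
from sklearn.model_selection import KFold
X = np.arange(20).reshape(10, 2)
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    print(train_idx, val_idx)  # each fold serves exactly once as the validation set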
When do we use logistic regression
Binary classification, like a churn model.
It is still a linear model because the outcome depends on a weighted sum of the inputs and parameters, not on their products or quotients
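A minimal sketch of a churn-style binary classifier with scikit-learn; the features and labels are made up:
import numpy as np
from sklearn.linear_model import LogisticRegression
X = np.random.rand(200, 4)                     # e.g., usage and account features
y = (np.random.rand(200) > 0.5).astype(int)    # 1 = churned, 0 = stayed (fake labels)
clf = LogisticRegression().fit(X, y)
probs = clf.predict_proba(X)[:, 1]             # predicted churn probabilities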
What is a sigmoid function
Activation function that limits output to between 0 and 1
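The formula is sigmoid(x) = 1 / (1 + e^(-x)); a one-line numpy version:
import numpy as np
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # maps any real number into (0, 1)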
List evaluation metrics for classification methods
Accuracy, Precision, Recall, F1 score, Logistic/Cross Entropy Loss
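A minimal sketch computing them with scikit-learn; the labels and probabilities are made up:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, log_loss
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
y_prob = [0.2, 0.9, 0.4, 0.1, 0.8]  # predicted probability of class 1
print(accuracy_score(y_true, y_pred), precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred),
      log_loss(y_true, y_prob))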