General ML Concepts Flashcards
Gradient descent
Optimisation algorithm that iteratively adjusts a model's parameters by moving them along the negative gradient of the cost function. The cost function tells us how well the model is performing, and the learning rate controls the size of each step down the slope (see the sketch after this list).
- If the learning rate is too large, we may overshoot the minimum and fail to converge
- If the learning rate is too small, it may take too long to reach the optimum position
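A minimal sketch of gradient descent on the one-dimensional function f(x) = (x - 3)^2, whose gradient is 2(x - 3); the starting point, learning rate, and iteration count here are illustrative assumptions, not canonical values:

```python
def gradient(x):
    # Gradient of f(x) = (x - 3)^2
    return 2 * (x - 3)

x = 0.0              # starting point
learning_rate = 0.1  # step size down the slope
for _ in range(100):
    x -= learning_rate * gradient(x)  # move in the direction of the negative gradient

print(x)  # converges towards the minimum at x = 3
```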
Regularisation
Where a model overfits, performing well on the training data but significantly worse on real-world (test) data, we may add a penalty on model complexity to counteract this. Examples include Lasso (L1) and Ridge (L2) regression.
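A brief sketch of both penalties using scikit-learn's Lasso and Ridge estimators; the synthetic data and alpha values are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Synthetic data: two of the five true coefficients are exactly zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty shrinks coefficients towards zero
lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty can drive coefficients to exactly zero

print(ridge.coef_)
print(lasso.coef_)  # note the zeroed-out coefficients
```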
Hyperparameters (types)
Settings that govern a model and its training; the sketch below contrasts the two kinds:
- Hyperparameters: external values set before training begins (e.g. learning rate, epochs, batch size etc.)
- Parameters: values the model learns internally as it trains (e.g. coefficients)
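A short sketch of the distinction, using scikit-learn's SGDRegressor as an illustrative (assumed) choice of model:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Hyperparameters: set externally, before training begins.
model = SGDRegressor(eta0=0.01, max_iter=1000)

# Parameters: learned internally by the model during training.
model.fit(X, y)
print(model.coef_, model.intercept_)
```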
Cross-validation
Technique for estimating how well a model generalises beyond the data it was trained on, rather than relying on a single train/test split.
We often split the training data into ‘k’ segments (folds) and fit the model k times, using a different fold as the validation set on each iteration (i.e. k-fold cross-validation); see the sketch below.
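A minimal k-fold cross-validation sketch with scikit-learn; the choice of k = 5 and of LinearRegression as the model are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Each of the 5 folds takes a turn as the validation set; the rest is the training set.
folds = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=folds)
print(scores.mean())  # average validation score (R^2 by default) across the folds
```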