Lab 1 Flashcards
Notes that can be extracted from the lab material
Why do we add a column of 1s to the X matrix when computing the Least Squares Estimate?
Adding a column of 1s to the design matrix X:
- Incorporates the bias term into the matrix formulation
- Allows the model to learn a non-zero intercept
- Simplifies the matrix operations by combining all parameters (bias and weights) into a single vector (w’)
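The points above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lab's own code: the toy data (y = 2 + 3x) and the variable names are assumptions made for the example.

```python
import numpy as np

# Hypothetical toy data lying exactly on the line y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 3.0 * x

# Prepend a column of 1s so the first entry of w' plays the role of the bias.
X = np.column_stack([np.ones_like(x), x])

# Solve the least squares problem; lstsq avoids forming an explicit inverse.
w_prime, *_ = np.linalg.lstsq(X, y, rcond=None)
bias, weight = w_prime
```

Without the column of 1s, the fitted line would be forced through the origin and could not recover the intercept of 2.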
What are the positives of using Least Squares Estimation?
Simplicity - Straightforward to understand and implement
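Optimality under Certain Conditions - Gives the best linear unbiased estimator (BLUE) when the Gauss-Markov assumptions hold, e.g. a linear relationship, zero-mean uncorrelated errors with constant variance, and no perfect multicollinearity. If the errors are also Gaussian, it coincides with the maximum likelihood estimate
Efficiency - Has a closed-form solution, so it needs no learning rate or iteration and is computationally efficient for small to moderately sized problems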
What are the negatives of Least Squares Estimation?
Sensitive to Outliers - Squaring the residuals amplifies the effect of large errors
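Poor Performance in High Dimensions - When the number of features approaches or exceeds the number of samples, or when features are strongly correlated, X^T X can become singular or nearly singular, leading to numerical instability or overfitting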
No Feature Selection - Uses all available features without prioritising or penalising irrelevant ones, which can degrade performance
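The near-singularity point can be checked numerically through the condition number of X^T X. The sketch below assumes synthetic data and hypothetical variable names; it compares two independent features with two almost identical ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Hypothetical features: x2_indep is unrelated to x1, x2_collinear nearly copies it.
x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)
x2_collinear = x1 + 1e-6 * rng.normal(size=n)

X_good = np.column_stack([np.ones(n), x1, x2_indep])
X_bad = np.column_stack([np.ones(n), x1, x2_collinear])

# A large condition number means X^T X is close to singular, so small
# changes in the data can produce large changes in the fitted weights.
cond_good = np.linalg.cond(X_good.T @ X_good)
cond_bad = np.linalg.cond(X_bad.T @ X_bad)
```

Here cond_bad should come out many orders of magnitude larger than cond_good, which is the numerical instability described above.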
What are the positives of Gradient Descent?
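Scalability for large datasets - Can handle very large datasets efficiently, because the stochastic and mini-batch variants update the weights from a small subset of the data at a time and so don’t require loading the entire dataset into memory.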
Flexibility - Works with a wide variety of loss functions and models, including linear regression, logistic regression and deep learning.
Easy to implement - Relatively simple to code and integrate into most ML pipelines
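As a rough illustration of how little code is involved, here is a minimal sketch of mini-batch gradient descent for linear regression with a mean squared error loss. The function name minibatch_gd, the synthetic data, and the hyperparameters (learning rate, epochs, batch size) are all assumptions chosen for the example.

```python
import numpy as np

data_rng = np.random.default_rng(0)

# Hypothetical synthetic data: y = 1 + 2x plus a little noise.
n = 200
x = data_rng.uniform(-1.0, 1.0, size=n)
y = 1.0 + 2.0 * x + 0.01 * data_rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # column of 1s for the bias


def minibatch_gd(X, y, lr=0.1, epochs=200, batch_size=20, seed=0):
    """Fit linear regression weights by mini-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(y))  # reshuffle the samples every epoch
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            residual = X[idx] @ w - y[idx]
            grad = 2.0 * X[idx].T @ residual / len(idx)  # gradient of batch MSE
            w -= lr * grad
    return w


w_gd = minibatch_gd(X, y)
```

Each update touches only batch_size rows of X, which is why this style of training does not need the whole dataset in memory at once.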
What are the negatives of Gradient Descent?
Sensitive to Feature Scaling - Features with vastly different scales can cause Gradient Descent to converge slowly or behave unpredictably. Feature normalisation is often required
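Can get stuck in Local Minima - On non-convex loss functions, gradient descent can converge to local minima or stall near saddle points. On convex losses such as least squares, every local minimum is also a global minimum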
Lack of interpretability - It optimises the loss function, but doesn’t inherently provide insight into the importance of features or the quality of the model
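The feature scaling issue can be seen in a small experiment. The sketch below is an illustration under assumed conditions: the synthetic data, the helper gd_mse, and both learning rates are hypothetical and hand-picked. The raw features need a tiny learning rate to stay stable, while the standardised features tolerate a much larger one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical features on very different scales: roughly [0, 1] and [0, 1000].
small = rng.uniform(0.0, 1.0, size=n)
large = rng.uniform(0.0, 1000.0, size=n)
y = 3.0 * small + 0.002 * large


def gd_mse(X, y, lr, steps):
    """Run full-batch gradient descent on MSE and return the final loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return np.mean((X @ w - y) ** 2)


X_raw = np.column_stack([small, large])

# Standardise each feature to zero mean and unit variance; centring y
# stands in for an intercept term in the standardised problem.
X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
y_centered = y - y.mean()

# The large-scale feature forces a very small learning rate on the raw data.
loss_raw = gd_mse(X_raw, y, lr=1e-6, steps=500)
loss_std = gd_mse(X_std, y_centered, lr=0.1, steps=500)
```

With the same number of steps, the standardised run should reach a loss near zero, while the raw run should still be far from the optimum because the weight on the small-scale feature has barely moved.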