04 Training Models Flashcards
In the equation y = b1x1 + b2x2 + c, what does ML calculate?
The values x1 and x2 come from the data (the columns); the model calculates the coefficients b1, b2 and the intercept c.
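A minimal sketch of this idea with scikit-learn; the toy data and the true values b1 = 2, b2 = 3, c = 5 are illustrative assumptions, not from the deck:

```python
# Fit y = b1*x1 + b2*x2 + c on noiseless toy data (true b1=2, b2=3, c=5).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.random((100, 2))              # columns x1, x2 come from the data
y = 2 * X[:, 0] + 3 * X[:, 1] + 5    # target built from the assumed true values

model = LinearRegression().fit(X, y)
print(model.coef_)       # learned b1, b2 (close to [2, 3])
print(model.intercept_)  # learned c (close to 5)
```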
Name the order of equations in linear regression
First the linear regression model equation (the prediction), then the MSE cost function, then the Normal Equation that solves for the best parameters.
Why is the Normal Equation used?
To compute, in closed form, the value of theta that minimizes the cost function (MSE), without iterating.
What is the main component in the linear regression model?
The parameter vector theta; we need to find the value of theta for which the RMSE (equivalently, the MSE) is lowest.
Drawback of the Normal Equation
It gets very slow when the number of features is large (it must invert an n x n matrix); it handles a large number of training instances fine.
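The Normal Equation theta = (X^T X)^(-1) X^T y can be sketched directly with NumPy; the toy data below is an illustrative assumption:

```python
# Closed-form solution: theta = (X^T X)^-1 X^T y, on noiseless toy data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 2))
y = 2 * X[:, 0] + 3 * X[:, 1] + 5    # assumed true parameters: c=5, b1=2, b2=3

X_b = np.c_[np.ones((100, 1)), X]    # add x0 = 1 to each instance for the intercept
theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
print(theta)  # close to [5, 2, 3]: intercept first, then b1, b2
```

Inverting the n x n matrix X^T X is what makes this slow as the feature count n grows.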
Gradient Descent methods used to train linear regression on large datasets
- Batch Gradient Descent
- Stochastic Gradient Descent
- Mini-batch Gradient Descent
What is Gradient Descent?
A generic optimization algorithm for finding optimal solutions to complex problems. It measures the local gradient of the error (cost) function with respect to theta and steps in the direction of descending gradient.
What is the relation between the learning rate and theta?
If the learning rate is too low, it takes many iterations to reach the optimal theta; if it is too high, each step can overshoot the optimal theta and the algorithm may diverge.
Difference between generic Gradient Descent and Batch Gradient Descent
Generic Gradient Descent just describes stepping against the gradient of the cost function; Batch Gradient Descent specifies that at every step the gradient is computed over the entire training set, so each parameter update uses all the training data in a single step.
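A minimal Batch Gradient Descent sketch; the learning rate, epoch count, and toy data are arbitrary illustrative choices:

```python
# Batch Gradient Descent for linear regression (MSE cost) on noiseless toy data.
import numpy as np

rng = np.random.default_rng(0)
m = 100
X = rng.random((m, 2))
y = 2 * X[:, 0] + 3 * X[:, 1] + 5    # assumed true parameters: c=5, b1=2, b2=3

X_b = np.c_[np.ones((m, 1)), X]      # add bias column x0 = 1
theta = rng.standard_normal(3)       # random initialization
eta = 0.3                            # learning rate (illustrative choice)
for epoch in range(5000):
    # gradient of the MSE cost, computed over ALL m instances at once
    gradients = 2 / m * X_b.T @ (X_b @ theta - y)
    theta -= eta * gradients         # one step down the gradient
print(theta)  # close to [5, 2, 3]
```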
What is the Stochastic Gradient Descent method?
At each step it computes the gradient using a single, randomly selected instance from the training data, and updates theta based on that one instance.
Advantages of using the Stochastic Gradient Descent method
1. Very fast on large datasets, since only one instance is needed in memory at each step.
2. Its randomness helps it jump out of local minima, so it is effective when the cost function has multiple local minima.
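scikit-learn exposes this method as SGDRegressor; a minimal sketch, with toy data and settings chosen for illustration (tol=None just forces it to run all epochs):

```python
# Stochastic Gradient Descent via scikit-learn's SGDRegressor on toy data.
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.random((1000, 2))
y = 2 * X[:, 0] + 3 * X[:, 1] + 5    # assumed true parameters

sgd = SGDRegressor(max_iter=1000, tol=None, penalty=None, random_state=42)
sgd.fit(X, y)
print(sgd.coef_, sgd.intercept_)  # roughly [2, 3] and [5]
```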
What is each full pass over the training set called?
An epoch
What is a learning schedule?
A function that determines the learning rate at each iteration, typically reducing it gradually so the algorithm can settle at the minimum.
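A minimal learning schedule sketch; the hyperparameters t0 and t1 are arbitrary assumed values:

```python
# A simple learning schedule: the learning rate shrinks as iteration t grows.
def learning_schedule(t, t0=5, t1=50):
    return t0 / (t + t1)

print(learning_schedule(0))    # 0.1: large steps early in training
print(learning_schedule(995))  # about 0.0048: small steps late in training
```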
What is Mini-batch Gradient Descent?
A mixture of Stochastic and Batch Gradient Descent: at each step it computes the gradients on a small random set of instances (a mini-batch) instead of the full training set or a single instance.
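A minimal Mini-batch Gradient Descent sketch; batch size, learning rate, epoch count, and toy data are illustrative assumptions:

```python
# Mini-batch Gradient Descent: gradients computed on small random batches.
import numpy as np

rng = np.random.default_rng(0)
m, batch_size, eta = 200, 20, 0.3
X = rng.random((m, 2))
y = 2 * X[:, 0] + 3 * X[:, 1] + 5    # assumed true parameters: c=5, b1=2, b2=3
X_b = np.c_[np.ones((m, 1)), X]      # add bias column

theta = rng.standard_normal(3)
for epoch in range(500):
    idx = rng.permutation(m)          # reshuffle the instances each epoch
    for start in range(0, m, batch_size):
        batch = idx[start:start + batch_size]
        # gradient of the MSE cost over this mini-batch only
        g = 2 / batch_size * X_b[batch].T @ (X_b[batch] @ theta - y[batch])
        theta -= eta * g
print(theta)  # close to [5, 2, 3]
```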
Do we require scaling for any of the Gradient Descent methods?
Yes. All Gradient Descent variants converge much faster when the features have similar scales (e.g., via StandardScaler); the Normal Equation does not require scaling.
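A minimal sketch of scaling before SGD, using a pipeline; the toy data (with deliberately mismatched feature scales) is an illustrative assumption:

```python
# Scale features with StandardScaler before SGD, via a scikit-learn pipeline.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = np.c_[rng.random(500) * 1000, rng.random(500)]  # wildly different scales
y = 0.002 * X[:, 0] + 3 * X[:, 1] + 5

# StandardScaler gives each column zero mean and unit variance, which keeps
# the SGD steps well-conditioned across both features.
model = make_pipeline(StandardScaler(), SGDRegressor(random_state=42))
model.fit(X, y)
print(model.predict(X[:3]))  # predictions close to the true y[:3]
```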
Performance with a large number of training instances
- Normal Equation: fast
- Batch GD: slow
- SGD: fast
- Mini-batch GD: fast
Performance with a large number of features
- Normal Equation: slow
- Batch GD: fast
- SGD: fast
- Mini-batch GD: fast
Which linear regression model has 0 hyperparameters?
The Normal Equation (scikit-learn's LinearRegression)