04 Training Models Flashcards by charita rallabhandi

In the equation y = b1x1 + b2x2 + c, what does the ML model calculate?

x1 and x2 are feature values that come from the data (columns); the model calculates the coefficients b1 and b2 and the intercept c.

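Name the order of the equations in linear regression

The linear regression model equation (the prediction as a weighted sum of the features) comes first, then its cost function, and then the Normal Equation, which gives the theta that minimizes that cost.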

Why is the Normal Equation used?

To find, directly in closed form, the value of theta that minimizes the cost function.
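
A minimal NumPy sketch of the Normal Equation, theta = (X^T X)^-1 X^T y. The data (a noiseless line with intercept 4 and slope 3) and the helper name normal_equation are assumptions made for illustration only.

```python
import numpy as np

def normal_equation(X, y):
    """Return the theta that minimizes the MSE for features X (m x n) and targets y (m,)."""
    X_b = np.column_stack([np.ones(len(X)), X])  # prepend the bias column x0 = 1
    # Solve (X_b^T X_b) theta = X_b^T y instead of forming the inverse explicitly.
    return np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

rng = np.random.default_rng(42)
X = rng.uniform(0, 2, size=(100, 1))
y = 4.0 + 3.0 * X[:, 0]          # illustrative noiseless data
theta = normal_equation(X, y)    # theta[0] is the intercept, theta[1] the slope
```

No iterations and no learning rate are involved: the solution comes from a single linear solve.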

What is the main component of a linear regression model?

The parameter vector theta. Training the model means finding the value of theta that minimizes the RMSE.

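Drawback of the Normal Equation

It becomes very slow when the number of features is large, because it has to invert X^T X, a matrix whose size grows with the number of features. It also needs the whole training set to fit in memory.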

Methods used to train a linear regression model on a large dataset

Gradient Descent, in three variants:
Batch Gradient Descent
Stochastic Gradient Descent
Mini-batch Gradient Descent

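What is Gradient Descent?

It is a generic optimization algorithm that can find optimal solutions to a wide range of problems.
It measures the local gradient of the error (cost) function with respect to theta and moves theta step by step in the direction of descending gradient until it reaches a minimum.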

What is the relation between the learning rate and theta?

If the learning rate is too low, it takes many iterations to reach the optimal theta value. If the learning rate is too high, the updates overshoot the optimal theta value and may even diverge.
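
A small sketch of this effect, assuming a toy cost function J(theta) = theta^2 (optimum at theta = 0) and three learning rates chosen only for illustration:

```python
def gradient_descent(learning_rate, n_steps=50, theta=10.0):
    """Minimize the toy cost J(theta) = theta**2, whose gradient is 2 * theta."""
    for _ in range(n_steps):
        theta -= learning_rate * 2 * theta
    return theta

too_low = gradient_descent(0.001)    # barely moves away from the start value 10
reasonable = gradient_descent(0.1)   # ends very close to the optimum theta = 0
too_high = gradient_descent(1.1)     # overshoots further on every step and diverges
```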

Difference between Gradient Descent and Batch Gradient Descent

Gradient Descent is the general method: at each step it computes the gradient of the cost function with respect to the parameters and moves theta a small step in the downhill direction. Batch Gradient Descent is the variant that computes that gradient over the entire training set at every step.
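
A sketch of Batch Gradient Descent for linear regression with an MSE cost. The function name, learning rate, epoch count and synthetic data are assumptions for illustration.

```python
import numpy as np

def batch_gradient_descent(X_b, y, eta=0.1, n_epochs=1000):
    """Batch GD for linear regression; X_b already contains the bias column."""
    m, n = X_b.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        # The gradient of the MSE is computed over the entire training set.
        gradients = (2 / m) * X_b.T @ (X_b @ theta - y)
        theta -= eta * gradients
    return theta

rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=100)
X_b = np.column_stack([np.ones(100), x])
y = 4.0 + 3.0 * x                      # illustrative noiseless data
theta_bgd = batch_gradient_descent(X_b, y)
```

Every step touches all 100 rows, which is why this variant gets slow as the number of instances grows.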

What is the Stochastic Gradient Descent method?

At each step it picks one randomly selected instance from the training set and computes the gradient from that single instance to update theta.
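
A sketch of Stochastic Gradient Descent under the same assumptions as a plain linear regression with MSE cost. The constant learning rate, the seed and the noiseless data are illustrative choices, not part of the card.

```python
import numpy as np

def stochastic_gradient_descent(X_b, y, n_epochs=50, eta=0.05, seed=0):
    """SGD for linear regression; X_b already contains the bias column."""
    rng = np.random.default_rng(seed)
    m, n = X_b.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        for _ in range(m):
            i = rng.integers(m)                     # pick one random instance
            xi, yi = X_b[i], y[i]
            gradient = 2 * xi * (xi @ theta - yi)   # gradient from that instance only
            theta -= eta * gradient
    return theta

rng = np.random.default_rng(1)
x = rng.uniform(0, 2, size=100)
X_b = np.column_stack([np.ones(100), x])
y = 4.0 + 3.0 * x                      # illustrative noiseless data
theta_sgd = stochastic_gradient_descent(X_b, y)
```

On noisy data a constant learning rate makes theta keep bouncing around the minimum, which is why SGD is usually paired with a learning schedule.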

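Advantages of using the Stochastic Gradient Descent method

1. Very quick on large datasets, because each step uses only one instance.
2. Effective when the cost function has multiple local minima, because its randomness helps it jump out of them.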

What is each round of training over the training set called?

An epoch

What is a learning schedule?

The function that determines the learning rate at each iteration.
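
One simple form such a function can take is sketched below. The formula t0 / (t + t1) and the values of t0 and t1 are illustrative assumptions; any function that gradually reduces the learning rate plays the same role.

```python
def learning_schedule(t, t0=5.0, t1=50.0):
    """Return the learning rate for iteration t (t0 and t1 are illustrative values)."""
    return t0 / (t + t1)

# The learning rate starts large and shrinks as training progresses.
rates = [learning_schedule(t) for t in (0, 50, 450)]
```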

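What is Mini-batch Gradient Descent?

It is a mix of Stochastic and Batch Gradient Descent: each step computes the gradient on a small random subset of instances (a mini-batch) instead of one instance or the full training set.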
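
A sketch of Mini-batch Gradient Descent for linear regression. The batch size, learning rate, epoch count and data are assumptions chosen for illustration.

```python
import numpy as np

def minibatch_gradient_descent(X_b, y, batch_size=20, n_epochs=200, eta=0.1, seed=0):
    """Mini-batch GD for linear regression; X_b already contains the bias column."""
    rng = np.random.default_rng(seed)
    m, n = X_b.shape
    theta = np.zeros(n)
    for _ in range(n_epochs):
        order = rng.permutation(m)                  # reshuffle the instances every epoch
        for start in range(0, m, batch_size):
            idx = order[start:start + batch_size]
            X_mb, y_mb = X_b[idx], y[idx]
            # Gradient of the MSE computed on the mini-batch only.
            gradients = (2 / len(idx)) * X_mb.T @ (X_mb @ theta - y_mb)
            theta -= eta * gradients
    return theta

rng = np.random.default_rng(2)
x = rng.uniform(0, 2, size=100)
X_b = np.column_stack([np.ones(100), x])
y = 4.0 + 3.0 * x                      # illustrative noiseless data
theta_mbgd = minibatch_gradient_descent(X_b, y)
```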

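Do we require feature scaling for any of the Gradient Descent methods?

Yes, for all of them. Gradient Descent converges much faster when all features have a similar scale.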
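
As a sketch, standardization (subtract the column mean, divide by the column standard deviation) is one common way to bring features to a similar scale. The small array below is made-up data.

```python
import numpy as np

# Two features on very different scales (illustrative values).
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 5000.0]])

# Standardize each column: zero mean and unit standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
```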

Performance of linear regression training algorithms on large datasets (many instances)

Normal Equation - Fast;
BGD - Slow;
SGD - Fast;
Mini-batch GD - Fast

Performance of linear regression training algorithms with many features

Normal Equation - Slow;
BGD - Fast;
SGD - Fast;
Mini-batch GD - Fast

Which linear regression algorithm has 0 hyperparameters?

The Normal Equation (LinearRegression)

How can the Normal Equation (a linear model) be used to fit polynomial data?

Powers of each feature (such as x^2) are added as new features, and a linear model is then trained on this extended set of features.
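
A sketch of the idea with NumPy, assuming made-up quadratic data y = 0.5 x^2 + x + 2. The column x^2 is simply appended to the feature matrix and an ordinary least-squares fit does the rest.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=100)
y = 0.5 * x**2 + x + 2.0            # illustrative quadratic data

# Extended feature set: bias column, x, and the new feature x^2.
X_poly = np.column_stack([np.ones_like(x), x, x**2])

# A plain linear least-squares fit on the extended features.
theta_poly, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
```

The model is still linear in theta; only the features are nonlinear in x.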

What is overfitting?

The model performs well on the training data but poorly on the validation set.

What is underfitting?

The model performs poorly on both the training set and the validation set.

How to deal with an underfitting model?

We need to add more (or better) features or choose a more complex model.

How to deal with an overfitting model?

We need to add more training data, or constrain the model with regularization.

Name the 3 types of model error

1. Bias;
2. Variance;
3. Irreducible error

What is the Bias model error and how can we recognize it?

It is due to wrong assumptions, e.g. assuming the data is linear when it is actually quadratic. A high-bias model is likely to underfit the training data.
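
A sketch of how this shows up in practice, assuming made-up quadratic data with a little noise. A straight line (the wrong assumption) has a high error on both the training and the validation set, while a quadratic fit does not.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=200)
x_val = rng.uniform(-3, 3, size=200)

def target(x):
    return 0.5 * x**2 + x + 2.0          # illustrative quadratic ground truth

y_train = target(x_train) + rng.normal(scale=0.1, size=200)
y_val = target(x_val) + rng.normal(scale=0.1, size=200)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

linear = np.polyfit(x_train, y_train, deg=1)      # wrong assumption: the data is linear
quadratic = np.polyfit(x_train, y_train, deg=2)   # matches the true shape of the data

# (training MSE, validation MSE) for each model.
errors = {
    "linear": (mse(linear, x_train, y_train), mse(linear, x_val, y_val)),
    "quadratic": (mse(quadratic, x_train, y_train), mse(quadratic, x_val, y_val)),
}
```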

What is the Variance model error and how can we recognize it?

It is due to the model being excessively sensitive to small variations in the training data. A high-variance model is likely to overfit the training data.

What is regularization?

It is constraining the model as a way to control overfitting. For a linear model it is typically achieved by constraining (shrinking) the model's weights.

What are the different types of regularization?

1. Ridge Regression;
2. Lasso Regression;
3. Elastic Net

What is Ridge Regression?

A regularization term proportional to the sum of the squared weights (an l2 penalty) is added to the cost function, forcing the algorithm to fit the data while keeping the weights as small as possible.
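
A sketch of Ridge Regression in closed form, assuming the cost "sum of squared errors + alpha * sum of squared weights" with an unregularized bias term. The function name, data and alpha values are illustrative.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Solve (X_b^T X_b + alpha * A) theta = X_b^T y, where A is the identity
    matrix with a 0 in the top-left corner so the bias term is not regularized."""
    X_b = np.column_stack([np.ones(len(X)), X])
    A = np.eye(X_b.shape[1])
    A[0, 0] = 0.0
    return np.linalg.solve(X_b.T @ X_b + alpha * A, X_b.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=30)

w_plain = ridge_closed_form(X, y, alpha=0.0)[1:]    # no regularization
w_ridge = ridge_closed_form(X, y, alpha=10.0)[1:]   # weights are pulled toward 0
```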

Is scaling necessary for Ridge Regression?

Yes, the features should be scaled (e.g. standardized), because Ridge Regression is sensitive to the scale of the input features.

Which hyperparameter of a regularized model controls the amount of regularization?

alpha

Which Scikit-Learn class to use for Ridge Regression?

Ridge()

Which Scikit-Learn class to use for Ridge Regression with SGD?

SGDRegressor(penalty="l2")
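
A short usage sketch with Scikit-Learn. The synthetic data and the values of alpha, max_iter, tol and random_state are assumptions picked for illustration, not recommended settings.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                 # features already on a standard scale
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# penalty="l2" adds the Ridge-style term to the cost; alpha sets its strength.
sgd_ridge = SGDRegressor(penalty="l2", alpha=1e-4, max_iter=100, tol=None, random_state=0)
sgd_ridge.fit(X, y)
```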

What is the full form of Lasso Regression?

Least Absolute Shrinkage and Selection Operator.

How does Lasso Regression work?

It adds an l1 penalty (the sum of the absolute values of the weights) to the cost function, which tends to set the weights of the least important features to exactly 0.

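How is Ridge Regression different from Lasso Regression?

Ridge Regression shrinks all the weights but rarely makes any of them exactly 0; Lasso Regression can set the weights of the least important features to exactly 0, so it also performs feature selection.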
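
A sketch comparing the two with Scikit-Learn, assuming synthetic data in which the third feature is irrelevant. The alpha values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Only the first two features matter; the third one is irrelevant (illustrative setup).
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

ridge_weights = ridge.coef_   # third weight is small but not exactly 0
lasso_weights = lasso.coef_   # third weight is set to exactly 0
```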

Which Scikit-Learn class to use for Lasso Regression?

Lasso()

Which Scikit-Learn class to use for Lasso Regression with SGD?

SGDRegressor(penalty="l1")

What is Elastic Net?

It is a middle ground between Lasso and Ridge Regression: its regularization term is a mix of both penalties.

Which hyperparameter is used to control the ratio of Lasso and Ridge Regression?

l1_ratio: when it is close to 0, Elastic Net behaves like Ridge Regression; when it is close to 1, it behaves like Lasso Regression.
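
A usage sketch with Scikit-Learn's ElasticNet. The data and the alpha value are illustrative assumptions; the comparison at the end checks the l1_ratio=1.0 end of the range against Lasso.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

mixed = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)       # half l1, half l2
lasso_like = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)  # pure l1 penalty
lasso = Lasso(alpha=0.1).fit(X, y)

same_as_lasso = np.allclose(lasso_like.coef_, lasso.coef_)
```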

Which model is better: Linear Regression, Ridge, or Lasso?

Ridge and Lasso are generally better because they are regularized; a little regularization almost always helps.

If the features in my dataset are highly correlated, which model should I use: Lasso or Ridge?

Ridge is the better model to start with on a highly correlated dataset; Lasso may behave erratically when several features are strongly correlated.

What is Early Stopping?

It is another way to regularize an iteratively trained model: training stops as soon as the validation error reaches its minimum.
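
A sketch of the stopping logic only. The list of validation errors is a made-up stand-in for evaluating a model after each epoch, and the patience parameter (how many epochs without improvement to tolerate) is an assumption that is not part of the card.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Scan the validation errors epoch by epoch and stop once the error has
    not improved for `patience` epochs. Return (best_epoch, best_error)."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error   # new minimum: remember it
        elif epoch - best_epoch >= patience:
            break                                   # no recent improvement: stop training
    return best_epoch, best_error

# Hypothetical validation errors: they fall, reach a minimum, then rise again.
val_errors = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.60]
best_epoch, best_error = early_stopping_epoch(val_errors)
```

In a real training loop, the model parameters saved at best_epoch would be the ones kept.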

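Can the Logistic Regression model be used for classification and regression?

It is used for classification, despite its name: it estimates the probability that an instance belongs to a class and then turns that probability into a class prediction.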

What is Logistic Regression?

Like Linear Regression, it computes a weighted sum of the input features (plus a bias term), but it then applies the sigmoid function to that result, which outputs a probability between 0 and 1. Based on this probability, each instance is classified as 0 or 1.
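
A minimal sketch of the prediction step in plain Python. The weights and bias are hypothetical, already-trained values, and the 0.5 threshold is the usual default rather than something stated on the card.

```python
import math

def sigmoid(t):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))

def predict_proba(x, weights, bias):
    """Estimated probability that instance x belongs to class 1."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias   # same as Linear Regression
    return sigmoid(score)

def predict(x, weights, bias, threshold=0.5):
    """Class prediction: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(x, weights, bias) >= threshold else 0

weights, bias = [2.0, -1.0], -0.5                 # hypothetical trained parameters
p = predict_proba([1.0, 0.5], weights, bias)      # score = 2.0 - 0.5 - 0.5 = 1.0
label = predict([1.0, 0.5], weights, bias)
```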

What is a decision boundary?

It is the boundary between class 0 and class 1, i.e. the set of points where the estimated probability is 50%. It is what lets the model tell the two classes apart.
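
For a Logistic Regression model with a single feature, the boundary is the value of x where weight * x + bias = 0. The sketch below uses hypothetical parameter values.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

weight, bias = 4.0, -6.0          # hypothetical trained parameters
boundary = -bias / weight         # x value where the probability is exactly 50%

p_at_boundary = sigmoid(weight * boundary + bias)
p_right = sigmoid(weight * 2.0 + bias)   # x = 2.0 lies on the class 1 side
p_left = sigmoid(weight * 1.0 + bias)    # x = 1.0 lies on the class 0 side
```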

04 Training Models Flashcards

(46 cards)