Regression Flashcards
Includes simple and multiple linear regression, gradient descent algorithm, overfitting and underfitting.
What is regression?
Regression is the process of modeling the relationship between one or more independent variables and a continuous output (dependent) variable.
e.g., house price prediction, used car price prediction, exam mark prediction, etc.
What are the types of regression?
- Simple linear regression
- Multiple linear regression
- Polynomial regression
What is simple linear regression?
In simple linear regression, there is only one independent variable and one dependent variable.
If x is the independent variable and y is the dependent variable, then, the relation between x and y can be modeled by:
y=a+bx
Here a is the intercept and b is the slope of the line.
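A minimal sketch of how a and b can be estimated from data, assuming ordinary least squares as the fitting criterion (the card itself does not name one). The function name fit_simple_linear and the data points are made up for illustration.

```python
def fit_simple_linear(xs, ys):
    """Return (a, b) that minimize the squared error of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Made-up data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = fit_simple_linear(xs, ys)
print(a, b)
```

The sketch assumes the x values are not all identical; otherwise the variance term is zero and the slope is undefined.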
What is Multiple Linear Regression?
Here, we assume that there are n independent variables and one dependent variable, so the value of the dependent variable can be predicted from the n independent variables. It is represented as:
y=θ0+θ1x1+θ2x2+…+θnxn
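The equation above is a weighted sum of the inputs plus an intercept. The sketch below only evaluates it for one observation; the function name predict, the coefficient values, and the feature values are hypothetical and not taken from any fitted model.

```python
def predict(coefficients, features):
    """Evaluate y = c0 + c1*x1 + ... + cn*xn for one observation.

    coefficients: [c0, c1, ..., cn], intercept first.
    features:     [x1, ..., xn]
    """
    if len(coefficients) != len(features) + 1:
        raise ValueError("need one coefficient per feature plus an intercept")
    intercept = coefficients[0]
    weights = coefficients[1:]
    return intercept + sum(w * x for w, x in zip(weights, features))


# Hypothetical model with two features (illustrative numbers only).
coefficients = [50.0, 2.0, 10.0]
features = [100.0, 3.0]
y_hat = predict(coefficients, features)
print(y_hat)  # 50 + 2*100 + 10*3 = 280.0
```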
What is polynomial regression?
Let x be the predictor variable and y be the dependent variable. Then, a polynomial regression of degree n models the relation between them as:
y=a0+a1x+a2x^2+…+anx^n
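Because the model is still linear in the coefficients a0…an, it can be fitted like a linear regression on the features x, x^2, …, x^n. The following is a rough sketch under that assumption, solving the least-squares normal equations with a small Gaussian elimination; the helper names solve_linear_system and fit_polynomial and the data are made up for illustration.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Return [a0, a1, ..., a_degree] minimizing the squared error."""
    # Design matrix: each row is [1, x, x^2, ..., x^degree].
    X = [[x ** k for k in range(degree + 1)] for x in xs]
    m = degree + 1
    # Normal equations: (X^T X) a = X^T y.
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(m)]
    return solve_linear_system(XtX, Xty)


# Made-up data lying exactly on y = 1 + 2x^2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 9.0, 19.0]
coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

The sketch assumes there are more distinct x values than the chosen degree; otherwise the system is singular and the division fails.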
What is the Gradient Descent Algorithm?
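Gradient descent is an optimization algorithm that can be used to minimize the cost function in linear regression. It is a first-order iterative algorithm that finds a local minimum of a differentiable function by repeatedly stepping in the direction opposite to the gradient. (Stepping along the gradient instead, to find a local maximum, is called gradient ascent.)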
What are the steps involved in the Gradient Descent Algorithm?
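- Initialize b0=0, b1=0, and the learning rate l=0.01.
- Find the partial derivatives of the cost function (MSE) w.r.t. b0 and b1; call them Db0 and Db1.
- Update b0=b0-l*Db0 and b1=b1-l*Db1.
- Find the MSE. Repeat the derivative and update steps until the MSE stops decreasing (it approaches 0 only when the data lie almost exactly on a line).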
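The steps above can be sketched as follows for the model y = b0 + b1*x. The gradient formulas are the standard partial derivatives of the MSE; the function name gradient_descent, the stopping tolerance, the iteration cap, and the data are assumptions made for illustration.

```python
def gradient_descent(xs, ys, l=0.01, max_iters=10000, tol=1e-12):
    """Fit y = b0 + b1*x by gradient descent on the mean squared error."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    prev_mse = float("inf")
    mse = prev_mse
    for _ in range(max_iters):
        # Residuals of the current line.
        errors = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / n
        # Stop once the MSE no longer changes noticeably.
        if abs(prev_mse - mse) < tol:
            break
        prev_mse = mse
        # Partial derivatives of the MSE w.r.t. b0 and b1.
        d_b0 = -2.0 / n * sum(errors)
        d_b1 = -2.0 / n * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient, scaled by the learning rate l.
        b0 -= l * d_b0
        b1 -= l * d_b1
    return b0, b1, mse


# Made-up data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1, final_mse = gradient_descent(xs, ys)
print(b0, b1, final_mse)
```

A learning rate that is too large makes the updates diverge, while one that is too small makes convergence slow; 0.01 happens to work for this small data set.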
What is overfitting in regression?
Overfitting occurs when the model fits the training data too closely, so that it captures the noise and random fluctuations in the data along with the underlying patterns.
As a result, the model performs very well on the training data but fails to generalize to new, unseen data.
In the context of regression, overfitting shows up as:
1. Capturing noise.
2. High variance.
3. Poor generalization.
Overfitting in regression can be reduced with the following techniques:
- Regularization
- Cross-validation
- Feature selection
- Increasing the training data size
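As one illustration of regularization, the sketch below uses ridge (L2) regularization, which is only one of several possible choices and is not named in the card. It assumes a one-feature model without an intercept, y = b*x, where the penalized least-squares slope has a simple closed form. The function name fit_ridge_slope, the penalty value, and the data are made up.

```python
def fit_ridge_slope(xs, ys, lam):
    """Slope b minimizing sum((y - b*x)^2) + lam * b^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The penalty lam enlarges the denominator and shrinks b toward 0.
    return sxy / (sxx + lam)


# Made-up data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
ols_slope = fit_ridge_slope(xs, ys, lam=0.0)     # no penalty: 28/14 = 2.0
ridge_slope = fit_ridge_slope(xs, ys, lam=14.0)  # penalized: 28/28 = 1.0
print(ols_slope, ridge_slope)
```

The penalty here is deliberately large to make the shrinkage visible; in practice its strength is usually chosen by cross-validation, since too much shrinkage leads to underfitting.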
What is logistic regression?
It is used when the dependent variable is binary. Logistic regression uses the logistic function, also called the sigmoid function, to map real-valued numbers into values between 0 and 1, which are interpreted as the probability of the positive class. The S-shaped curve formed by the logistic function is called the sigmoid curve.
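A small sketch of the sigmoid function and of how a one-feature logistic model would turn a linear score into a probability. The names sigmoid and predict_probability and the coefficient values are hypothetical, not a fitted model.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(b0, b1, x):
    """Probability of the positive class for the linear score b0 + b1*x."""
    return sigmoid(b0 + b1 * x)


print(sigmoid(0.0))  # 0.5, the midpoint of the S-shaped curve
p = predict_probability(b0=-4.0, b1=1.0, x=6.0)
print(p)
```

A common convention, assumed here rather than stated in the card, is to predict the positive class when the probability is at least 0.5.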