Linear Regression Flashcards

1
Q

What is Linear Regression?

A

Linear Regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation.

2
Q

What is the equation of Simple Linear Regression?

A

The equation is Y = β0 + β1X + ε, where Y is the dependent variable, X is the independent variable, β0 is the intercept, β1 is the slope, and ε is the error term.
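A minimal sketch of fitting this equation to toy data (the data values are assumed, not part of the deck); `np.polyfit` with degree 1 recovers the slope β1 and intercept β0:

```python
import numpy as np

# Toy data generated from Y = 2 + 3X + noise (assumed example values)
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50)
y = 2 + 3 * X + rng.normal(0, 0.5, size=X.shape)

# np.polyfit with degree 1 returns [slope, intercept] = [beta1, beta0]
beta1, beta0 = np.polyfit(X, y, 1)
print(beta0, beta1)  # estimates close to the true values 2 and 3
```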

3
Q

What is the difference between Simple and Multiple Linear Regression?

A

Simple Linear Regression has one independent variable, while Multiple Linear Regression has two or more independent variables.

4
Q

What is the objective of Linear Regression?

A

The objective is to find the best-fitting line that minimizes the error between predicted and actual values, usually using the least squares method.

5
Q

What is the cost function used in Linear Regression?

A

Linear Regression uses the Mean Squared Error (MSE) as the cost function: MSE = (1/n) * Σ (y_i - ŷ_i)^2.
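The formula can be computed directly; a sketch with hypothetical actual and predicted values:

```python
import numpy as np

# Hypothetical actual and predicted values
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])

# MSE = (1/n) * sum of squared residuals
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # (0.25 + 0 + 1) / 3
```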

6
Q

How do you implement Linear Regression using Scikit-Learn?

A
from sklearn.linear_model import LinearRegression
model = LinearRegression()               # create the model
model.fit(X_train, y_train)              # learn coefficients from training data
predictions = model.predict(X_test)      # predict on unseen data
7
Q

What does the fit() method do in Scikit-Learn’s LinearRegression?

A

It trains the model by finding the optimal coefficients (weights) for the linear equation.

8
Q

What does the predict() method do in Scikit-Learn’s LinearRegression?

A

It uses the trained model to predict output values for given input features.

9
Q

What is R-squared (R²) in Linear Regression?

A

R² measures the proportion of variance in the dependent variable that the model explains. A value close to 1 indicates a good fit.

10
Q

What are the assumptions of Linear Regression?

A

Linear Regression assumes: 1) Linearity, 2) Independence, 3) Homoscedasticity, 4) Normality of residuals, and 5) No multicollinearity.

11
Q

What is Multicollinearity in Linear Regression?

A

Multicollinearity occurs when independent variables are highly correlated, making it difficult to determine their individual effects on the dependent variable.

12
Q

How can you detect Multicollinearity?

A

By computing the Variance Inflation Factor (VIF) for each feature. A VIF greater than 10 is a common rule of thumb for high multicollinearity.
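The VIF of a feature is 1 / (1 − R²), where R² comes from regressing that feature on all the others. A sketch using scikit-learn on toy data (the `vif` helper and data values are illustrative, not from the deck):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def vif(X):
    """VIF per column: 1 / (1 - R²) from regressing that column
    on the remaining columns. (Helper name is illustrative.)"""
    X = np.asarray(X, dtype=float)
    vifs = []
    for i in range(X.shape[1]):
        others = np.delete(X, i, axis=1)
        r2 = LinearRegression().fit(others, X[:, i]).score(others, X[:, i])
        vifs.append(1.0 / (1.0 - r2))
    return vifs

# Toy data: x2 is almost a copy of x1, so both get a large VIF
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(0, 0.01, size=100)  # nearly collinear with x1
x3 = rng.normal(size=100)                # independent feature
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # x1 and x2 far above 10; x3 near 1
```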

13
Q

How can you handle Multicollinearity?

A

By removing highly correlated features, combining them with Principal Component Analysis (PCA), or using regularization such as Ridge Regression.

14
Q

What is the difference between OLS and Gradient Descent?

A

Ordinary Least Squares (OLS) computes the optimal coefficients directly in closed form, while Gradient Descent iteratively updates the coefficients to minimize the cost function.
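Both approaches can be sketched in NumPy on toy data (values and learning rate are assumed for illustration); on this small problem they converge to the same coefficients:

```python
import numpy as np

# Toy data from y = 1 + 2x + noise (assumed example values)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(0, 0.1, size=200)
Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column

# OLS: direct closed-form least-squares solution
beta_ols, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Gradient Descent: iterate on the MSE gradient
beta_gd = np.zeros(2)
lr = 0.1
for _ in range(1000):
    grad = 2 / len(y) * Xb.T @ (Xb @ beta_gd - y)
    beta_gd -= lr * grad

print(beta_ols, beta_gd)  # both close to [1, 2]
```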

15
Q

How do you evaluate a Linear Regression model?

A

Using metrics like R², Mean Squared Error (MSE), and Root Mean Squared Error (RMSE).
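A sketch of computing all three with scikit-learn (the y values are hypothetical):

```python
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error

# Hypothetical actual and predicted values
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.3, 8.9]

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                      # RMSE is the square root of MSE
r2 = r2_score(y_true, y_pred)
print(mse, rmse, r2)
```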

16
Q

How do you check for Homoscedasticity?

A

By plotting residuals vs. fitted values. A random pattern indicates homoscedasticity, while a funnel shape suggests heteroscedasticity.

17
Q

What is the purpose of the intercept in Linear Regression?

A

The intercept (β0) represents the expected value of Y when all independent variables are zero.

18
Q

How do you extract model coefficients in Scikit-Learn?

A
print(model.coef_)  # Prints the slope coefficients
print(model.intercept_)  # Prints the intercept
19
Q

How do you split data into training and testing sets in Scikit-Learn?

A
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
20
Q

What is Ridge Regression and how does it help?

A

Ridge Regression is a type of Linear Regression that includes an L2 penalty to reduce overfitting by shrinking coefficients.
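A sketch of the shrinkage effect on toy data with two nearly collinear features (data and `alpha` value are assumed); Ridge keeps the two coefficients close to each other where plain OLS can assign them wildly different values:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Toy data: x2 is nearly a copy of x1, and y depends on x1
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(0, 0.01, size=50)   # nearly collinear feature
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(0, 0.1, size=50)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)       # alpha sets the L2 penalty strength

# Ridge shrinks the unstable coefficients toward each other,
# while their sum still captures the true effect (about 3)
print(ols.coef_, ridge.coef_)
```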