Chapter 5 Flashcards
Backward Elimination
Backward elimination is a quantitative approach to identifying the independent variables to include in a model. It starts with all the independent variables in the model, then deletes the least significant variable one at a time. The process stops when all variables remaining in the model are significant.
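A minimal sketch of backward elimination using p-values from statsmodels; the DataFrame X, the target y, and the 0.05 significance threshold are assumptions for illustration, not part of the card:

```python
import statsmodels.api as sm

def backward_elimination(X, y, threshold=0.05):
    """Repeatedly drop the least significant predictor until all are significant."""
    features = list(X.columns)
    while features:
        model = sm.OLS(y, sm.add_constant(X[features])).fit()
        pvalues = model.pvalues.drop("const")  # ignore the intercept's p-value
        worst = pvalues.idxmax()               # least significant remaining variable
        if pvalues[worst] < threshold:
            break                              # every remaining variable is significant
        features.remove(worst)                 # delete it and refit
    return features
```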
Dependent Variable
The variable being predicted is referred to as the dependent (target) variable (Y).
Dummy Coding
Dummy coding involves creating dichotomous (0/1) indicator variables from a categorical variable.
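For instance, a categorical column can be dummy coded with pandas; a minimal sketch (the "region" column and its values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"region": ["north", "south", "south", "north"]})
# drop_first=True keeps k-1 dummies for k categories, avoiding the dummy-variable trap
dummies = pd.get_dummies(df["region"], prefix="region", drop_first=True)
print(dummies)  # a single 0/1 column: region_south
```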
Feature Selection
Feature selection refers to identifying the optimal subset of features (independent variables) to explain a target variable. Feature selection can be done quantitatively or qualitatively.
Forward Selection
Forward selection is a quantitative approach to identifying the independent variables to include in a model. It starts with no predictors: a separate regression model is created for each candidate variable, the most significant one is added, and the process repeats, adding variables one at a time as long as they improve the model's prediction.
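A minimal sketch mirroring the backward-elimination example above, again assuming a pandas DataFrame X, a target y, and a 0.05 threshold:

```python
import statsmodels.api as sm

def forward_selection(X, y, threshold=0.05):
    """Add the most significant remaining predictor until none qualifies."""
    selected, remaining = [], list(X.columns)
    while remaining:
        # Fit one candidate model per remaining predictor
        pvals = {col: sm.OLS(y, sm.add_constant(X[selected + [col]])).fit().pvalues[col]
                 for col in remaining}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= threshold:
            break                  # no remaining variable is significant
        selected.append(best)
        remaining.remove(best)
    return selected
```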
Independent Variable
The variables used to make the prediction are called independent variables (X) (also referred to as predictors or features).
Linear Regression
Linear regression is a type of modeling that represents the relationship between the independent and dependent variables as a straight line that best fits the data.
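A minimal sketch of fitting a best-fit line with scikit-learn; the toy data points are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: y is roughly 2x + 1 with a little noise
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([3.1, 4.9, 7.2, 9.0, 11.1])

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)  # best-fit line: intercept and slope
print(model.predict([[6.0]]))         # predict y for a new x
```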
Mean Absolute Error
Mean Absolute Error (MAE) measures the average absolute difference between the model's predicted values and the actual values.
Mean Absolute Percentage Error
Mean Absolute Percentage Error (MAPE) measures how far the predictions are, on average, from the actual target values, expressed as a percentage of the actual values.
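Both error metrics are easy to compute directly; a small sketch with invented actual and predicted values:

```python
import numpy as np

actual = np.array([100.0, 150.0, 200.0])
predicted = np.array([110.0, 140.0, 190.0])

mae = np.mean(np.abs(actual - predicted))                    # MAE = mean(|y - y_hat|)
mape = np.mean(np.abs((actual - predicted) / actual)) * 100  # MAPE, in percent
print(mae)   # 10.0
print(mape)  # ~7.22
```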
Multicollinearity
Multicollinearity is a situation where the predictor variables are highly correlated with each other.
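A common diagnostic is the variance inflation factor (VIF); a minimal sketch using statsmodels, where the DataFrame X is hypothetical and the rule of thumb that a VIF above about 10 signals trouble is an assumption, not part of the card:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_table(X):
    """One VIF per predictor; values above ~10 suggest multicollinearity."""
    X_const = sm.add_constant(X)  # include an intercept in the design matrix
    return pd.Series(
        [variance_inflation_factor(X_const.values, i)
         for i in range(1, X_const.shape[1])],  # skip the constant at index 0
        index=X.columns,
    )
```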
Multiple Regression
Multiple regression is used to determine whether two or more independent variables are good predictors of a single dependent variable.
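A minimal sketch with two predictors, using statsmodels on invented data:

```python
import numpy as np
import statsmodels.api as sm

# Toy data: y depends on two predictors plus noise (coefficients are invented)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=50)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.summary())  # coefficients, p-values, and R^2 for both predictors
```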
Numerical Variable
A numerical variable is a variable whose values are quantities measured on a numeric scale (e.g., age or income); it may be continuous or discrete.
Ordinary Least Squares
The ordinary least squares (OLS) regression method, commonly referred to as linear regression, minimizes the sum of squared errors.
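A minimal sketch of the OLS solution, i.e., the coefficients that minimize the sum of squared errors; the toy data are invented, and numpy's least-squares solver stands in for solving the normal equations by hand:

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column = intercept term
y = np.array([2.0, 4.1, 5.9])

# beta minimizes the sum of squared errors ||y - X @ beta||^2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [intercept, slope]
```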
Overfitting
Overfitting occurs when an overly complex model fits the data being used so closely that its results are not generalizable: future relationships cannot be inferred from it, and its results will be inconsistent when applied to other data.
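A small illustration of the idea, assuming an invented linear true relationship with noise: a degree-9 polynomial fits the 10 training points almost exactly but predicts fresh data worse than a straight line:

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(scale=0.2, size=10)
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test + rng.normal(scale=0.2, size=100)

for degree in (1, 9):
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    # degree 9: near-zero train error but a larger test error than the line
    print(degree, train_err, test_err)
```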
R^2
R^2 measures the proportion of the variance in the dependent variable that is explained by the independent variable(s). The R^2 value ranges between 0 and 1, and the closer the value is to 1, the better the prediction by the regression model; when the value is near 0, the regression model is not a good predictor of the dependent variable.
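R^2 can be computed directly from its definition, R^2 = 1 - SS_res / SS_tot; a small sketch with invented values:

```python
import numpy as np

actual = np.array([3.0, 5.0, 7.0, 9.0])
predicted = np.array([2.8, 5.1, 7.3, 8.9])

ss_res = np.sum((actual - predicted) ** 2)      # residual sum of squares
ss_tot = np.sum((actual - actual.mean()) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot                        # R^2 = 1 - SS_res / SS_tot
print(r2)  # ~0.99, close to 1 -> a good fit
```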