Part 2: Regression Analysis Flashcards
Regression analysis
The analysis of the statistical relationship among variables. In the simplest form there are only 2 variables:
- Dependent/response variable (Y)
- Independent/predictor variable (X)
Simple linear regression
Y = a + bX + e, where:
- a = intercept: the model's predicted value of Y when X = 0
- b = slope of the linear equation that specifies the model
- e = error term: the errors associated with the model
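A minimal sketch (in Python with NumPy, on made-up toy data) of how a and b can be estimated with the closed-form least-squares formulas:

```python
import numpy as np

# Toy data: Y roughly follows 2 + 3*X plus noise (numbers invented for illustration).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([5.1, 7.9, 11.2, 13.8, 17.1])

# Closed-form OLS estimates: b = cov(X, Y) / var(X), a = mean(Y) - b * mean(X)
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()

Y_hat = a + b * X   # fitted values
e = Y - Y_hat       # residuals, the realised error term
print(f"a = {a:.3f}, b = {b:.3f}")
```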
R^2
The coefficient indicating goodness of fit (with max = 1): the proportion of variation in Y 'explained' by all X variables in the model. A higher R^2 means a better fit, but it can also mean the model is fitting noise. It is computed as

R^2 = 1 - Σ(y - ŷ)^2 / Σ(y - ȳ)^2

where the second term is the ratio of the model's squared errors to the total variation around the mean ȳ.
- R^2 = 0 when the second term is equal to 1, which means the estimated values are no better than simply using the average ȳ.
- R^2 = 1 when the second term is equal to 0, which means the estimated ŷ always equals y (y - ŷ = 0). In this case the model does not have any error at all.
- Can R^2 be negative? Yes, when the errors are so big that the model fits worse than just predicting the mean (second term > 1).
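A short sketch of this formula in Python, using hypothetical observed values and predictions:

```python
import numpy as np

Y = np.array([5.1, 7.9, 11.2, 13.8, 17.1])      # observed values (toy data)
Y_hat = np.array([5.0, 8.0, 11.0, 14.0, 17.0])  # model predictions (toy data)

ss_res = np.sum((Y - Y_hat) ** 2)     # squared model errors
ss_tot = np.sum((Y - Y.mean()) ** 2)  # total variation around the mean
r2 = 1 - ss_res / ss_tot              # the 'second term' is ss_res / ss_tot

print(f"R^2 = {r2:.4f}")
# R^2 goes negative whenever ss_res > ss_tot, i.e. the model is worse than the mean.
```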
Ordinary least squares (OLS)
Method for finding the model with the best fit. It minimizes the sum of squared errors made when predicting the values for Y. It uses a least-squares criterion because without squaring we would allow positive and negative deviations from the model to cancel each other out.
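A tiny numeric illustration of why the squaring is needed, with invented residuals:

```python
import numpy as np

# Residuals from a hypothetical model: positive and negative deviations.
e = np.array([2.0, -2.0, 1.5, -1.5])

print(np.sum(e))       # 0.0  -> raw deviations cancel out, useless as a fit criterion
print(np.sum(e ** 2))  # 12.5 -> squared deviations cannot cancel; OLS minimizes this sum
```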
What is OLS often used for?
Hedonic price models, e.g. estimating house prices from attributes such as size, age, and location.
Collinearity
When an independent variable depends on (or is highly correlated with) another independent variable in the model.
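One simple way to spot collinearity is to inspect the correlation matrix of the predictors; a sketch on toy data where one variable is nearly a multiple of another:

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2 * x1 + np.array([0.01, -0.02, 0.00, 0.01, -0.01])  # near-duplicate of x1
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0])                  # unrelated predictor

X = np.column_stack([x1, x2, x3])
# Off-diagonal entries close to +/-1 flag collinear pairs (here x1 and x2).
print(np.corrcoef(X, rowvar=False))
```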
Multiple regression model
Y = b0x0 + b1x1 + b2x2 + … + bNxN + e, where x0 = 1 so that b0 acts as the intercept.
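A minimal sketch of fitting such a model with NumPy's least-squares solver, on simulated toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
Y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)  # invented true model

# x0 is a column of ones, so b0 * x0 plays the role of the intercept.
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)  # OLS estimates of b0, b1, b2
print(b)  # roughly [1.0, 2.0, -0.5]
```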
Neural network
A non-linear multiple regression model: linear combinations of the inputs are passed through non-linear activation functions.
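To illustrate the idea, a toy forward pass (hypothetical, untrained weights) showing how a one-hidden-layer network composes linear regressions with a non-linearity:

```python
import numpy as np

def neural_net(x, W1, b1, W2, b2):
    """One hidden layer: multiple regressions wrapped in a non-linearity."""
    h = np.tanh(W1 @ x + b1)  # linear combination + non-linear activation
    return W2 @ h + b2        # output layer: another linear combination

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)  # 2 inputs -> 4 hidden units
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # 4 hidden units -> 1 output

print(neural_net(np.array([0.5, -1.0]), W1, b1, W2, b2))
```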
Adjusted R^2
Compensates for the number of explanatory variables by adding a penalty for each extra variable:

Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), for n observations and k explanatory variables.

Plain R^2 never decreases when a new X variable is added to the model, which may cause overfitting. To guard against overfitting you can use 2 sets: a training set for fitting and a validation set for checking the fit on unseen data.
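A small sketch of the penalty, using the formula above with hypothetical values:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same raw R^2, but more variables -> larger penalty, lower adjusted value.
print(adjusted_r2(0.90, n=50, k=2))   # ~0.896
print(adjusted_r2(0.90, n=50, k=20))  # ~0.831
```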
Model selection
2 ways:
- Forward selection: start with one variable and keep adding variables as long as they improve the model (e.g. raise the adjusted R^2) rather than just fitting noise; a sketch follows below.
- Backward selection: start with a large set of variables and keep deleting variables that harm your model.
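A greedy forward-selection sketch, assuming a helper fit_r2 (defined here) that fits OLS with an intercept and returns R^2; min_gain is a hypothetical stopping threshold, and in practice the gains would be evaluated on a validation set to guard against the overfitting discussed above:

```python
import numpy as np

def fit_r2(X, Y):
    """OLS fit (with intercept) returning R^2."""
    A = np.column_stack([np.ones(len(Y)), X])
    b, *_ = np.linalg.lstsq(A, Y, rcond=None)
    Y_hat = A @ b
    return 1 - np.sum((Y - Y_hat) ** 2) / np.sum((Y - Y.mean()) ** 2)

def forward_select(X, Y, min_gain=0.01):
    """Greedily add the column that improves R^2 most; stop below min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        r2, j = max((fit_r2(X[:, selected + [j]], Y), j) for j in remaining)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected, best_r2
```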