The Linear Model Flashcards

1
Q

What is the residual ($e_i$) of a model?

A

The (vertical) difference between the observed value and the value estimated by the model: $e_i = y_i - \hat{y}_i$

2
Q

What does it mean when the residual ($e_i$) is lower than 0?

A

The model overestimates the outcome for observation i.

3
Q

What does it mean when the residual ($e_i$) is larger than 0?

A

The model underestimates the outcome for observation i.

4
Q

What is RSS?

A

The RSS is the Residual Sum of Squares: $\mathrm{RSS} = \sum_{i=1}^{n} e_i^2$
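
As a minimal sketch of the residual and RSS definitions, the snippet below computes both for a handful of made-up observed and fitted values. The data and the helper names `residuals` and `rss` are illustrative assumptions, not part of the deck.

```python
def residuals(y, y_hat):
    """Vertical differences e_i = y_i - y_hat_i."""
    return [yi - fi for yi, fi in zip(y, y_hat)]


def rss(y, y_hat):
    """Residual sum of squares: the sum of the squared residuals."""
    return sum(e ** 2 for e in residuals(y, y_hat))


# Made-up observed outcomes and fitted values, for illustration only.
y = [3.0, 5.0, 4.0]
y_hat = [2.0, 5.0, 6.0]

e = residuals(y, y_hat)  # [1.0, 0.0, -2.0]
total = rss(y, y_hat)    # 1 + 0 + 4 = 5.0
```

The first observation has a positive residual (the model underestimates it) and the third a negative one (the model overestimates it).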

5
Q

What is the regression called when the parameters are estimated by minimizing the RSS?

A

Ordinary Least Squares regression (OLS)
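
A minimal sketch of OLS for a single predictor, assuming the standard closed-form solution (slope = S_xy / S_xx, intercept = mean of y minus slope times mean of x). The data and the names `fit_ols` and `rss_of_line` are made up for illustration.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the line that minimizes the RSS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss_of_line(x, y, intercept, slope):
    """RSS of an arbitrary line on the data."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.1, 5.9, 8.2]

b0, b1 = fit_ols(x, y)
best = rss_of_line(x, y, b0, b1)

# Nudging either parameter away from the OLS estimate increases the RSS.
worse_intercept = rss_of_line(x, y, b0 + 0.1, b1)
worse_slope = rss_of_line(x, y, b0, b1 - 0.1)
```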

6
Q

What is the expected relationship between the residuals and the fitted values of a linear model?

A

There should be no relationship. If there were one, it would suggest that the true model is not a linear model at all.

7
Q

What behaviour do we expect in a histogram of the residuals for a linear model?

A

The residuals should be symmetric around 0, and large residuals (in absolute value) should be less likely to occur than small ones.

8
Q

What is RMSE, and what does the formula look like?

A

Root mean squared error: $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_i^2} = \sqrt{\mathrm{RSS} / n}$

9
Q

What is RSE?

A

Residual standard error.
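
A small sketch contrasting RMSE and RSE. It assumes the common conventions that RMSE divides the RSS by n, while RSE divides it by the residual degrees of freedom n - p - 1 (p predictors plus an intercept); the course may define them slightly differently. The residuals are made up.

```python
import math


def rmse(residuals):
    """Root mean squared error: sqrt(RSS / n)."""
    n = len(residuals)
    return math.sqrt(sum(e ** 2 for e in residuals) / n)


def rse(residuals, p):
    """Residual standard error: sqrt(RSS / (n - p - 1)), an assumed convention."""
    n = len(residuals)
    return math.sqrt(sum(e ** 2 for e in residuals) / (n - p - 1))


# Made-up residuals from a model with one predictor (p = 1); RSS = 10.
e = [1.0, -1.0, 2.0, -2.0, 0.0]

rmse_value = rmse(e)      # sqrt(10 / 5)
rse_value = rse(e, p=1)   # sqrt(10 / 3)
```

Under these conventions the RSE is never smaller than the RMSE, because it divides the same RSS by a smaller number.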

10
Q

What is the formula for the Coefficient of Determination or $R^2$?

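A

$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$, where $\mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ is the total sum of squares.
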
11
Q

What does the Coefficient of Determination or $R^2$ tell us?

A

It is the proportion of the total observed variability that is accounted for (explained) by the model.
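
A minimal sketch of that interpretation, assuming the usual definition R^2 = 1 - RSS / TSS, where TSS is the sum of squared deviations of the outcome from its mean. The data and the name `r_squared` are illustrative.

```python
def r_squared(y, y_hat):
    """Share of the total variability (TSS) not left over in the residuals (RSS)."""
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - rss / tss


# Made-up outcomes and fitted values: RSS = 1, TSS = 5.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]

r2_model = r_squared(y, y_hat)         # 1 - 1/5
r2_mean_only = r_squared(y, [2.5] * 4) # predicting the mean explains nothing
```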

12
Q

Is the coefficient of determination ($R^2$) the square of Pearson's correlation coefficient r?

A

Only if the model has a single independent variable (IV).
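
The single-predictor case can be checked numerically. The sketch below fits a simple OLS line (with an intercept) on made-up data and compares R^2 = 1 - RSS / TSS with the squared Pearson correlation; all names and values are illustrative assumptions.

```python
import math

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # this is also the TSS

# Simple OLS fit with one predictor and an intercept.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

r2 = 1 - rss / s_yy                      # coefficient of determination
pearson_r = s_xy / math.sqrt(s_xx * s_yy)
pearson_r_squared = pearson_r ** 2       # matches r2 up to rounding
```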

13
Q

Does $R^2 = 1$ imply that we have found the true model, i.e. that all estimated parameters of the model ($\beta_j$) equal their true values?

A

No. $R^2 = 1$ means that the model accounts for all variability within the data set, but this does not mean we found the model that generated the data. For instance, $R^2 = 1$ can always be reached by overfitting, e.g. by giving the model as many parameters as there are observations.
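
A sketch of the overfitting case: a polynomial of degree n - 1 has n coefficients and can pass through any n points with distinct x values, so it reaches R^2 = 1 even on values with no pattern. Lagrange interpolation is used here as a stand-in for fitting that polynomial; the data are arbitrary and the function name is hypothetical.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the degree n-1 polynomial through all (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        weight = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                weight *= (x - xm) / (xj - xm)
        total += yj * weight
    return total


# Arbitrary outcomes with no underlying pattern.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.3, -1.0, 4.7, 0.2, 3.9]

fitted = [lagrange_predict(xs, ys, x) for x in xs]
y_bar = sum(ys) / len(ys)
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
tss = sum((y - y_bar) ** 2 for y in ys)
r2 = 1 - rss / tss  # perfect fit on the training data
```

The perfect fit says nothing about how the data were generated; it only reflects that the model has as many parameters as observations.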

14
Q

What is meant by the two caveats?

A

In linear modelling, the residuals, and hence RSS and RMSE, are calculated vertically. If done horizontally, the estimated model will differ from the vertical one.
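
A sketch of that difference, under the assumption that "horizontal" fitting means regressing x on y and then rewriting the result as a line in the usual y-versus-x form. The data and the name `fit_ols` are illustrative.

```python
def fit_ols(x, y):
    """Return (intercept, slope) minimizing squared distances along the y axis."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

# Vertical residuals: regress y on x.
v_intercept, v_slope = fit_ols(x, y)

# Horizontal residuals: regress x on y, giving x = a0 + a1 * y,
# then rewrite as y = -a0 / a1 + (1 / a1) * x.
a0, a1 = fit_ols(y, x)
h_intercept, h_slope = -a0 / a1, 1 / a1
```

On this data the vertical fit has slope 0.8 and the horizontal fit has slope 1.25, so the two lines differ.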

15
Q

What are the 3 main reasons to add more predictors to a model?

A
  1. It reduces the RSS, so the model fits the data more accurately.
  2. It accounts for factors other than the one of interest; including these other factors controls for their effect on the outcome Y.
  3. If the effect of X on Y depends on a third variable, we need to model that dependence explicitly.
16
Q

When using a linear model with multiple predictors, $R^2$ should be adjusted to account for the model complexity. What is the formula for the adjusted $R^2$ of the multiple linear model?

A

$R^2_{adj} = 1 - (1 - R^2) \frac{n - 1}{n - p - 1}$, where n is the number of training observations and p is the number of predictors used.
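
A minimal sketch, assuming the standard adjusted R^2 definition 1 - (1 - R^2)(n - 1)/(n - p - 1). The numbers are made up to show the penalty for an extra predictor that does not improve the fit.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (standard definition assumed)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Same R^2 of 0.75 on 21 observations, with 4 versus 5 predictors.
adj_4 = adjusted_r_squared(0.75, n=21, p=4)  # 1 - 0.25 * 20 / 16 = 0.6875
adj_5 = adjusted_r_squared(0.75, n=21, p=5)  # lower: the extra predictor is penalized
```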

17
Q

Are linear models based on numerical variables, categorical variables, both, or neither?

A

Typically numerical, but categorical variables can be included using dummy variables.

18
Q

What are dummy variables (z)?

A

They are a way to map categorical predictors onto numerical ones for a multiple linear model (for instance no = 1, yes = 0).

19
Q

How does a dummy variable (z) influence the parameter estimates of a linear model?

A

As it is a 'binary predictor', it only shifts the intercept and does not change the slope: $\beta_0 \Rightarrow \beta_0 + \beta_1$ when $z = 1$.

20
Q

What does the estimated value ($\beta_1$) of a dummy variable (z) represent?

A

It represents the change in intercept relative to the original (baseline) intercept $\beta_0$.
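
A sketch of the intercept shift in the simplest case: a model with only a binary dummy z as predictor. With made-up data, the OLS intercept equals the mean outcome of the baseline group (z = 0) and the dummy coefficient equals the difference between the two group means.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Illustrative data: z = 0 is the baseline group, z = 1 the other group.
z = [0, 0, 0, 1, 1]
y = [10.0, 12.0, 14.0, 20.0, 22.0]

baseline, shift = fit_ols(z, y)

mean_z0 = sum(yi for zi, yi in zip(z, y) if zi == 0) / z.count(0)  # 12.0
mean_z1 = sum(yi for zi, yi in zip(z, y) if zi == 1) / z.count(1)  # 21.0
```

Here `baseline` matches `mean_z0` and `shift` matches `mean_z1 - mean_z0`, up to floating-point rounding.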

21
Q

How many dummy variables are expected for a categorical variable with 5 levels?

A

4. One level is always incorporated into the baseline or reference level (the intercept $\beta_0$).
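
A minimal sketch of this encoding for a hypothetical 5-level colour variable; the level names, the choice of reference level, and the helper `dummy_encode` are all assumptions made for illustration.

```python
def dummy_encode(values, reference):
    """Encode a categorical variable as 0/1 columns, one per non-reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical variable with 5 levels; "black" is absorbed into the intercept.
colours = ["red", "green", "blue", "black", "white", "red"]
levels, rows = dummy_encode(colours, reference="black")

n_dummies = len(levels)  # 4 dummy variables for 5 levels
reference_row = rows[3]  # the "black" observation: all dummies are 0
red_row = rows[0]        # exactly one dummy is active
```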

22
Q

Is it possible for a linear model to have no intercept?

A

Yes, when the intercept is divided over a set of dummy variables of which only one can be active at a time. The parameter of the active dummy then acts as the intercept.

23
Q

Answer this: What effects are displayed, and which linear model explains the behaviour of the coloured slopes?

A

One intercept and two slopes: we study the interaction effect, and the main effect of the numerical variable only.

$Y = \beta_0 + (\beta_1 + \beta_2 x_{\text{insurance = no}}) x_1$

24
Q

Answer this: What effects are displayed, and which linear model explains the behaviour of the coloured slopes?

A

Two intercepts and two slopes: we study the main and interaction effects of the numerical and nominal variable.

$Y = \beta_0 + \beta_1 x_1 + (\beta_2 + \beta_3 x_1) x_{\text{insurance = no}}$
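
A sketch of how the four parameters of this model relate to two separate lines, one per insurance group. It relies on the fact that a model with a binary dummy, its main effect and its interaction is equivalent to fitting each group separately. The data are made up, and "insurance = yes" is assumed to be the reference group.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Illustrative data per group.
x_yes, y_yes = [1.0, 2.0, 3.0], [3.0, 5.0, 7.0]  # insurance = yes (reference)
x_no, y_no = [1.0, 2.0, 3.0], [6.0, 7.0, 8.0]    # insurance = no

beta0, beta1 = fit_ols(x_yes, y_yes)   # intercept and slope of the reference group
int_no, slope_no = fit_ols(x_no, y_no)
beta2 = int_no - beta0                 # main effect of the dummy: intercept shift
beta3 = slope_no - beta1               # interaction effect: slope shift


def predict(x1, insurance_no):
    """Y = b0 + b1*x1 + (b2 + b3*x1) * dummy, with dummy = 1 for insurance = no."""
    return beta0 + beta1 * x1 + (beta2 + beta3 * x1) * insurance_no
```

For example, `predict(2.0, 1)` reproduces the "no" group's line at x1 = 2, and `predict(2.0, 0)` the reference group's line.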

25
Q

Answer this: What effects are displayed, and which linear model explains the behaviour of the coloured slopes?

A

Two intercepts but only one common slope: we study only the main effects of the numerical and nominal variable.

$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_{\text{insurance = no}}$

26
Q

What is the definition of a linear model?

A

A model is linear when the response is a linear combination of the predictor terms, even though the predictor terms themselves may be non-linear (for instance $x^2$).
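
A sketch of that idea with made-up data generated from y = 1 + 2x^2. The curve is not a straight line in x, but after transforming the predictor to z = x^2 the model y = b0 + b1 * z is linear in its parameters, so ordinary OLS applies.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Illustrative data following y = 1 + 2 * x**2 exactly.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 9.0, 19.0]

z = [xi ** 2 for xi in x]  # non-linear transformation of the predictor
b0, b1 = fit_ols(z, y)     # still a linear model in b0 and b1
```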

27
Q

Is this a linear model?

A

No, the relationship between the predictors and the response is non-linear.

28
Q

Is this a linear model?

A

Yes, the relationship between the predictors and the response is linear.