Model Checking Flashcards

1
Q

Var[e_i]=?

A

σ^2 (1-h_ii)

2
Q

Cov[e_i,e_j]=

A

-σ^2 h_ij (for i ≠ j)

3
Q

Standardize residuals

A

r_i = e_i / √(σ^2 (1-h_ii))

4
Q

Studentise residuals

A

Replace σ^2 in the standardized residual with S^2: r_i = e_i / √(S^2 (1-h_ii))
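As a sketch of cards 1-9 (not part of the original deck), the raw, standardized, and studentised residuals can be computed directly with NumPy; the data below are simulated purely for illustration:

```python
# Sketch: raw and studentised residuals for a simple linear regression.
# The data are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = 1.0 + 2.0 * x + rng.normal(0, 1.5, size=x.size)

X = np.column_stack([np.ones_like(x), x])        # design matrix, p = 2
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]  # least squares fit
e = y - X @ beta_hat                             # raw residuals
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)    # leverages h_ii

n, p = X.shape
s2 = (e @ e) / (n - p)                           # S^2, estimate of sigma^2

# Studentised residuals: sigma^2 replaced by S^2 in
# e_i / sqrt(sigma^2 (1 - h_ii)).
r = e / np.sqrt(s2 * (1 - h))

# Rule of thumb: |r_i| > 2 flags a potential outlier.
print(np.where(np.abs(r) > 2)[0])
```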

5
Q

Constant variance?

A

Homoscedasticity

6
Q

To check linearity of model

A

Plot r_i against x_i

7
Q

To check homoscedasticity

A

Plot r_i against ŷ_i (the fitted values)

8
Q

h_ii =

A

h_ii = 1/n + (x_i - x̄)^2 / S_xx

9
Q

Rule of thumb for outlier observations when standardised

A

If the absolute value is > 2, flag as a potential outlier

11
Q

Large leverage?

A

h_ii > 2p/n

12
Q

Very large leverage

A

h_ii > 3p/n

13
Q

Cook’s distance?

A

Statistic to measure influence of an observation

14
Q

Determine if cook’s stat is unusually large?

A

If D_i is bigger than the 50th percentile of the F_{p, n-p} distribution (where p is the number of parameters)

15
Q

Pure error?

A

Variation among replicate observations at the same value of x; error that no model can explain

16
Q

Replications

A

More than one observation for some values of an explanatory variable;
Y_ij for x_i

17
Q
A

When there are multiple observations at a single x_i

18
Q

Sum of squares for residuals (Y_ij for x_i)

A

SS_E = Σ_i Σ_j (Y_ij - Ŷ_i)^2

19
Q

Pure error sum of squares

A

SS_PE = Σ_i Σ_j (Y_ij - Ȳ_i)^2

20
Q

Lack of fit sum of squares

A

SS_LoF = Σ_i n_i (Ȳ_i - Ŷ_i)^2

21
Q

In SLRM SS_E =

A

SS_LoF + SS_PE

22
Q

ANOVA table columns

A

Source of variation, d.f., SS, MS, VR

23
Q

E(SS_PE) =

A

(N-m)σ^2

24
Q

If SLRM is true then E(SS_LoF) =

A

(m-2)σ^2

25
Q

MS_PE and MS_LoF give estimators?

A

Both give unbiased estimators of σ^2,
but the latter only if the SLRM is true

26
Q

F test for lack of fit:
-H_0?

A

SLRM is true

27
Q

F test for lack of fit:
H_1?

A

The SLRM is not true (the mean response is not a linear function of x)

28
Q

F test for lack of fit:
-Test statistic?

A

F = MS_LoF / MS_PE

29
Q

F test for lack of fit:
-F stat under H_0

A

F ~ F_{m-2, N-m}

30
Q

Can only run the F test for LoF if

A

We have replications (not repeated measurements of same sampling unit)
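As a sketch of cards 18-30 (not from the deck itself), the lack-of-fit decomposition and F statistic can be computed with NumPy on simulated data with replications; the x levels and replicate counts below are made up:

```python
# Sketch: F test for lack of fit with replicated observations.
# m distinct x values, 3 replicates each, N total observations.
import numpy as np

rng = np.random.default_rng(1)
x_levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # m = 5 distinct x values
n_rep = 3
x = np.repeat(x_levels, n_rep)                   # N = 15 observations
y = 2.0 + 1.5 * x + rng.normal(0, 0.5, size=x.size)

X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
N, m = x.size, x_levels.size

SS_E = np.sum((y - X @ beta_hat) ** 2)           # residual sum of squares
group_means = np.array([y[x == xv].mean() for xv in x_levels])
# Pure error: replicate variation about the group means.
SS_PE = sum(np.sum((y[x == xv] - gm) ** 2)
            for xv, gm in zip(x_levels, group_means))
# Lack of fit: group means against the fitted line.
fitted_levels = beta_hat[0] + beta_hat[1] * x_levels
SS_LoF = np.sum(n_rep * (group_means - fitted_levels) ** 2)

F = (SS_LoF / (m - 2)) / (SS_PE / (N - m))       # F ~ F_{m-2, N-m} under H_0
print(SS_E, SS_PE + SS_LoF, F)                   # SS_E = SS_PE + SS_LoF
```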

31
Q
A

0 vector

33
Q

h_ii is?

A

The ith diagonal element of the hat matrix H

35
Q

ith mean response? (Matrix)

A

E[Y_i] = x_i^T β

36
Q

Estimator of ith mean response

A

Ŷ_i = x_i^T β̂

37
Q

Variance of estimator of ith mean response

A

Var(Ŷ_i) = σ^2 x_i^T (X^T X)^{-1} x_i = σ^2 h_ii

38
Q

Estimator of variance of estimator of ith mean response

A

S^2 h_ii

41
Q

Variance of estimator of beta zero

A

σ^2 [(X^T X)^{-1}]_{11} (the first diagonal element of Var(β̂) = σ^2 (X^T X)^{-1})

44
Q

Vector of residuals

A

e = Y - Ŷ = (I - H)Y

45
Q

Multiple regression model written in vectors

A

Y = Xβ + ε

46
Q

Vector of fitted values

A

Ŷ = Xβ̂ = HY

47
Q

ith fitted value? (Vector)

A

Ŷ_i = x_i^T β̂

48
Q

h_ii indicates?

A

The leverage of observation i: how far x_i lies from the centroid of the explanatory variables, and how strongly Y_i influences its own fitted value

49
Q

Properties of h_ii:
As var(e_i) = σ^2 (1-h_ii)

A

A large h_ii makes var(e_i) close to 0, so the fitted value is pulled close to Y_i

50
Q

Properties of h_ii:
h_ii is usually small/large when?

A

Small when x_i is close to the centroid; large when x_i is far from it

51
Q

Centroid?

A

The vector of means of each feature across all data points

52
Q

Properties of h_ii:
When p=2?

A

SLRM: h_ii = 1/n + (x_i - x̄)^2 / S_xx

53
Q

Properties of h_ii:
Range of value for h_ii

A

1/n < h_ii < 1

54
Q

Properties of h_ii:
Sum of h_ii

A

Σ_i h_ii = trace(H) = p

55
Q

Average leverage

A

p/n

56
Q

High leverage

A

h_ii > 2p/n

57
Q

Very high leverage

A

h_ii > 3p/n
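As a sketch of cards 54-57 (not from the deck), the leverage rules of thumb can be checked with NumPy; the data are made up, with one x value placed far from the centroid to create a high-leverage point:

```python
# Sketch: leverages h_ii and the 2p/n and 3p/n rules of thumb.
import numpy as np

x = np.concatenate([np.linspace(1.0, 10.0, 19), [50.0]])  # last point far out
X = np.column_stack([np.ones_like(x), x])                 # p = 2 (SLRM)
n, p = X.shape

h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)             # leverages h_ii

high = h > 2 * p / n                                      # high leverage
very_high = h > 3 * p / n                                 # very high leverage
print(h.sum(), np.where(very_high)[0])                    # trace(H) = p
```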

58
Q

Cooks distance in vectors

A

D_i = (β̂ - β̂_(i))^T X^T X (β̂ - β̂_(i)) / (p S^2), where β̂_(i) is the estimate with observation i deleted

60
Q

Cooks distance in vectors (reduced)

A

D_i = r_i^2 h_ii / (p (1-h_ii)), where r_i is the studentised residual

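As a sketch of cards 58-60 (not from the deck), the reduced form of Cook's distance can be checked against the delete-one definition with NumPy; the data are simulated, with one response perturbed to create an influential point:

```python
# Sketch: Cook's distance via the reduced form r_i^2 h_ii / (p (1-h_ii))
# and via deleting each observation and refitting.
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 15)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, size=x.size)
y[7] += 6.0                                       # perturb one point

X = np.column_stack([np.ones_like(x), x])
n, p = X.shape
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat
h = np.diag(X @ XtX_inv @ X.T)
s2 = (e @ e) / (n - p)

r = e / np.sqrt(s2 * (1 - h))                     # studentised residuals
D_reduced = r**2 * h / (p * (1 - h))              # reduced form

# Definition: D_i = (b - b_(i))' X'X (b - b_(i)) / (p S^2),
# where b_(i) is the fit with observation i deleted.
D_def = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    b_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    d = beta_hat - b_i
    D_def[i] = d @ (X.T @ X) @ d / (p * s2)

print(np.allclose(D_reduced, D_def))              # the two forms agree
```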
61
Q

PRESS residuals

A

PRESS = PRediction Error Sum of Squares; the PRESS residual is e_(i) = Y_i - Ŷ_(i), the error in predicting Y_i from the model fitted without observation i

62
Q

PRESS

A

Sum of squares of the PRESS residuals

63
Q

PRESS residuals simplified

A

e_(i) = e_i / (1-h_ii)

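As a sketch of cards 61-63 (not from the deck), the simplified PRESS residual e_i / (1 - h_ii) can be verified against the direct leave-one-out prediction error with NumPy on made-up data:

```python
# Sketch: simplified PRESS residuals vs direct leave-one-out errors.
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 5, 12)
y = 0.5 + 1.2 * x + rng.normal(0, 0.4, size=x.size)

X = np.column_stack([np.ones_like(x), x])
n = x.size
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_hat
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

press_simple = e / (1 - h)               # simplified PRESS residuals

press_direct = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    b_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    press_direct[i] = y[i] - X[i] @ b_i  # Y_i - Yhat_(i)

PRESS = np.sum(press_simple ** 2)        # the PRESS statistic
print(np.allclose(press_simple, press_direct))
```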
64
Q

What does PRESS assess?

A

The model’s predictive ability
-used for calculating predicted R^2

65
Q

Predicted R^2 defined?

A

R^2(pred) = 1 - PRESS / SS_T

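As a sketch of cards 64-68 (not from the deck), predicted R^2 can be computed alongside the ordinary R^2 with NumPy; the data are simulated for illustration:

```python
# Sketch: predicted R^2 = 1 - PRESS / SS_T vs R^2 = 1 - SS_E / SS_T.
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 8, 16)
y = 3.0 + 0.8 * x + rng.normal(0, 0.6, size=x.size)

X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta_hat
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

SS_T = np.sum((y - y.mean()) ** 2)       # total sum of squares
SS_E = np.sum(e ** 2)                    # residual sum of squares
PRESS = np.sum((e / (1 - h)) ** 2)       # PRESS via the simplified residuals

r2 = 1 - SS_E / SS_T
r2_pred = 1 - PRESS / SS_T               # never exceeds R^2
print(r2, r2_pred)
```

A large gap between the two values would suggest overfitting; here both should be high and close.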
66
Q

When is Predicted R^2 used?

A

In MLRM to indicate how well the model predicts responses to new observations

67
Q

A good model would have R^2

A

and R^2(pred) both high and close to each other

68
Q

Large discrepancy in R^2 and R^2(pred)

A

Means the model may be overfitted

69
Q

If X^T X is singular, then?

A

No unique least squares estimators exist

70
Q

Singularity of X^T X is caused by

A

Linear dependence among explanatory variables

71
Q

Problems of Multicollinearity

A

-some or all estimators will have large variances
-very different models may fit equally well, so variable selection may be difficult
-some parameters may have the wrong sign

72
Q

What is the use of the variance inflation factor?

A

Used to indicate when multicollinearity may be a problem

73
Q

VIF_j=?

A

For a regression with p-1 predictors, regress X_j on the remaining p-2 explanatory variables; with R^2_j the resulting coefficient of determination (as a proportion, not a %), VIF_j = 1/(1 - R^2_j)
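As a sketch of cards 72-73 (not from the deck), VIF_j = 1/(1 - R^2_j) can be computed from scratch with NumPy; the data are simulated, with x2 built to be nearly collinear with x1:

```python
# Sketch: variance inflation factors via auxiliary regressions.
import numpy as np

rng = np.random.default_rng(5)
n = 50
x1 = rng.normal(0, 1, n)
x2 = 0.95 * x1 + rng.normal(0, 0.2, n)    # nearly collinear with x1
x3 = rng.normal(0, 1, n)                  # independent of the others
predictors = [x1, x2, x3]

def vif(j, cols):
    """Regress column j on the rest; return 1 / (1 - R^2_j)."""
    yj = cols[j]
    others = [c for k, c in enumerate(cols) if k != j]
    X = np.column_stack([np.ones(n)] + others)
    resid = yj - X @ np.linalg.lstsq(X, yj, rcond=None)[0]
    r2_j = 1 - np.sum(resid ** 2) / np.sum((yj - yj.mean()) ** 2)
    return 1 / (1 - r2_j)

vifs = [vif(j, predictors) for j in range(3)]
print(vifs)                               # VIFs for x1 and x2 will be large
```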

74
Q

The scatterplot of the residuals vs fitted values is useful to check if

A

-the variance of the errors is constant
-there is any trend in the residuals (which would suggest a term is missing from the regression model)
-there are any outliers