Model Checking Flashcards by tyrion lannister

Q

Var[e_i]=?

A

σ^2 (1 - h_ii)

Q

Cov[e_i,e_j]=

A

Q

Standardize residuals

A

Q

Studentise residuals

A

Replace σ^2 in the standardized residual with its estimate S^2
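
As a rough illustration of this card, the sketch below computes studentised residuals for a simple linear regression model (SLRM). It assumes Var[e_i] = σ^2 (1 - h_ii), the SLRM leverage formula h_ii = 1/n + (x_i - x̄)^2 / S_xx, and S^2 = SS_E / (n - p) with p = 2. The function names and the data are made up for the example.

```python
import math

def slr_residuals_and_leverages(x, y):
    """Fit y = b0 + b1*x by least squares; return (residuals, leverages)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # SLRM leverage: h_ii = 1/n + (x_i - xbar)^2 / S_xx
    h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return e, h

def studentised_residuals(x, y, p=2):
    """r_i = e_i / sqrt(S^2 (1 - h_ii)), with S^2 = SS_E / (n - p)."""
    e, h = slr_residuals_and_leverages(x, y)
    s2 = sum(ei ** 2 for ei in e) / (len(x) - p)
    return [ei / math.sqrt(s2 * (1 - hi)) for ei, hi in zip(e, h)]

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
r = studentised_residuals(x, y)
# Rule of thumb from the cards: |r_i| > 2 suggests a possible outlier
possible_outliers = [i for i, ri in enumerate(r) if abs(ri) > 2]
```

Replacing S^2 with a known σ^2 in the last function would give the standardized residuals instead.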

Q

Constant variance?

A

Homoscedasticity

Q

To check linearity of model

A

Plot r_i against x_i

Q

To check homoscedasticity

A

Plot r_i against ŷ_i (fitted values)

Q

h_ii =

A

Q

Rule of thumb for outlier observations when standardised

A

If the absolute value is > 2, treat the observation as a possible outlier

Q

A

Q

Large leverage?

A

Q

Very large leverage

A

Q

Cook’s distance?

A

Statistic to measure influence of an observation
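
A minimal sketch of how Cook's distance could be computed in the SLRM case. It assumes the common form D_i = e_i^2 h_ii / (p S^2 (1 - h_ii)^2), which is not spelled out on the cards, together with the SLRM leverage h_ii = 1/n + (x_i - x̄)^2 / S_xx. The function name and data are hypothetical.

```python
def cooks_distances(x, y, p=2):
    """Cook's distance for each observation in a simple linear regression."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    s2 = sum(ei ** 2 for ei in e) / (n - p)  # S^2 = SS_E / (n - p)
    # Assumed form: D_i = e_i^2 * h_ii / (p * S^2 * (1 - h_ii)^2)
    return [ei ** 2 * hi / (p * s2 * (1 - hi) ** 2) for ei, hi in zip(e, h)]

# Hypothetical data: the last response sits well above the trend
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
d = cooks_distances(x, y)
most_influential = max(range(len(d)), key=lambda i: d[i])
```

An observation gets a large D_i when it combines a large residual with a large leverage, which is why the last point dominates here.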

Q

How to determine if Cook's statistic is unusually large?

A

If D_i is bigger than the 50th percentile of (where p is the number of parameters):

Q

Pure error?

A

Q

Replications

A

More than one observation for some values of an explanatory variable;
Y_ij for x_i

Q

A

When there are multiple observations at a single x_i

Q

Sum of squares for residuals (Y_ij for x_i)

A

Q

Pure error sum of squares

A

Q

Lack of fit sum of squares

A

Q

In SLRM SS_E =

A

SS_LoF + SS_PE

Q

ANOVA table columns

A

Source of variation, d.f., SS, MS, VR

Q

E(SS_PE) =

A

(N-m)σ^2

Q

If SLRM is true then E(SS_LoF) =

A

MS_PE and MS_LoF give estimators?

Both give unbiased estimators of σ^2, but the latter only if the SLRM is true

F test for lack of fit: H_0?

SLRM is true

F test for lack of fit: H_1?

F test for lack of fit: test statistic?

F test for lack of fit: F statistic under H_0

Can only do the F test for LoF if

We have replications (not repeated measurements of same sampling unit)
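
The lack-of-fit cards can be tied together with a small numeric sketch. It assumes the usual definitions SS_PE = Σ_i Σ_j (y_ij - ȳ_i)^2 and SS_LoF = Σ_i n_i (ȳ_i - ŷ_i)^2, with N - m and m - 2 degrees of freedom in the SLRM (m distinct x values, N observations). These definitions, the function name and the data are assumptions for the example rather than content taken from the cards.

```python
from collections import defaultdict

def lack_of_fit(x, y):
    """Return (SS_E, SS_PE, SS_LoF, F) for a simple linear regression with replications."""
    N = len(x)
    xbar = sum(x) / N
    ybar = sum(y) / N
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    # Group the responses by distinct x value (the replications)
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    m = len(groups)

    ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_pe = 0.0
    ss_lof = 0.0
    for xi, ys in groups.items():
        group_mean = sum(ys) / len(ys)
        ss_pe += sum((yij - group_mean) ** 2 for yij in ys)
        ss_lof += len(ys) * (group_mean - (b0 + b1 * xi)) ** 2

    # F = MS_LoF / MS_PE, compared with F(m - 2, N - m) under H_0
    f_stat = (ss_lof / (m - 2)) / (ss_pe / (N - m))
    return ss_e, ss_pe, ss_lof, f_stat

# Hypothetical data: two observations at each of four x values
x = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
y = [1.2, 0.8, 2.1, 2.5, 2.9, 3.4, 3.6, 4.4]
ss_e, ss_pe, ss_lof, f_stat = lack_of_fit(x, y)
```

The returned values satisfy SS_E = SS_LoF + SS_PE up to rounding, and a large F suggests the straight-line model does not fit.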

0 vector

h_ii is?

ith diagonal element of the hat matrix

ith mean response? (Matrix)

Estimator of ith mean response

Variance of estimator of ith mean response

Estimator of variance of estimator of ith mean response

Variance of estimator of beta zero

Vector of residuals

Multiple regression model written in vectors

Vector of fitted values

ith fitted value? (Vector)

h_ii indicates?

Properties of h_ii: As var(e_i) = σ^2 (1-h_ii)

Properties of h_ii: h_ii is usually small/large when?

Centroid?

The vector of means of each feature across all data points

Properties of h_ii: When p=2?

SLRM,

Properties of h_ii: Range of value for h_ii

1/n ≤ h_ii ≤ 1

Properties of h_ii: Sum of h_ii

Average leverage

p/n

High leverage

h_ii > 2p/n

Very high leverage

h_ii > 3p/n
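
The leverage rules of thumb above can be checked numerically. This sketch uses the SLRM case (p = 2) with h_ii = 1/n + (x_i - x̄)^2 / S_xx and flags points using the 2p/n and 3p/n cut-offs from the cards. The function name and the x values are invented for illustration.

```python
def slr_leverages(x):
    """Leverages h_ii for a simple linear regression with an intercept."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1 / n + (xi - xbar) ** 2 / sxx for xi in x]

# Hypothetical design: nine evenly spaced points plus one far from the centroid
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
n, p = len(x), 2
h = slr_leverages(x)

average_leverage = sum(h) / n            # equals p/n because sum(h) = p
high = [i for i, hi in enumerate(h) if hi > 2 * p / n]
very_high = [i for i, hi in enumerate(h) if hi > 3 * p / n]
```

Only the point at x = 30 is flagged, because leverage depends on the distance of x_i from the centroid and not on the response.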

Cook's distance in vectors

Cook's distance in vectors (reduced)

PRESS residuals

PRediction Error Sum of Squares

PRESS

Sum of squares of press residuals

PRESS residuals simplified

What does PRESS assess?

The model's predictive ability; used for calculating predicted R^2

Predicted R^2 defined?

When is Predicted R^2 used?

In MLRM to indicate how well the model predicts responses to new observations

A good model would have R^2

and R^2(pred) both high and close to each other

Large discrepancy in R^2 and R^2(pred)

Means the model may be overfitted
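
To make the PRESS cards concrete, here is a sketch for the SLRM case. It assumes the usual shortcut e_i / (1 - h_ii) for the PRESS residual and the definition R^2(pred) = 1 - PRESS / SS_T, neither of which is written out on the cards. The shortcut is compared with an explicit leave-one-out refit. Function names and data are hypothetical.

```python
def slr_coefficients(x, y):
    """Least squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def press_shortcut(x, y):
    """PRESS as the sum of squared e_i / (1 - h_ii), using a single fit."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = slr_coefficients(x, y)
    total = 0.0
    for xi, yi in zip(x, y):
        e_i = yi - (b0 + b1 * xi)
        h_ii = 1 / n + (xi - xbar) ** 2 / sxx
        total += (e_i / (1 - h_ii)) ** 2
    return total

def press_leave_one_out(x, y):
    """PRESS by refitting without observation i and predicting y_i."""
    total = 0.0
    for i in range(len(x)):
        b0, b1 = slr_coefficients(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]
b0, b1 = slr_coefficients(x, y)
ybar = sum(y) / len(y)
ss_t = sum((yi - ybar) ** 2 for yi in y)
ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r2 = 1 - ss_e / ss_t
r2_pred = 1 - press_shortcut(x, y) / ss_t
```

Since PRESS is never smaller than SS_E, R^2(pred) cannot exceed R^2, and a wide gap between the two is the overfitting warning sign described above.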

If (below) is singular, then?

No unique least squares estimators exist

Singularity of below caused by

Linear dependence among explanatory variables

Problems of Multicollinearity

- some or all estimators will have large variances
- very different models may fit equally well, so variable selection may be difficult
- some parameters may have the wrong sign

What is the use of the variance inflation factor?

Used to indicate when multicollinearity may be a problem

VIF_j=?

For a regression with p-1 predictors, fit a model with X_j as a function of the remaining p-2 explanatory variables; R^2_j is the coefficient of determination of that model (as a proportion, not a percentage)
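
A small sketch of the idea, assuming the standard definition VIF_j = 1 / (1 - R^2_j), which the card itself leaves blank. To stay self-contained it handles only two predictors, where R^2_j is the squared sample correlation between them. The function names and data are made up.

```python
def r_squared_simple(u, v):
    """R^2 from regressing v on u with an intercept (squared sample correlation)."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    suu = sum((ui - ubar) ** 2 for ui in u)
    svv = sum((vi - vbar) ** 2 for vi in v)
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    return suv ** 2 / (suu * svv)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    # Assumed definition: VIF_j = 1 / (1 - R^2_j)
    return 1 / (1 - r_squared_simple(x1, x2))

# Hypothetical nearly collinear predictors
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
vif_collinear = vif_two_predictors(x1, x2)

# Hypothetical uncorrelated predictors
z1 = [-1.0, 0.0, 1.0]
z2 = [1.0, -2.0, 1.0]
vif_uncorrelated = vif_two_predictors(z1, z2)
```

Uncorrelated predictors give a VIF of 1, and the value grows without bound as R^2_j approaches 1.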

The scatterplot of the residuals vs fitted values is useful to check if

- the variance of the error is constant
- there is any trend in the residuals (which would suggest some term is missing from the regression model)
- there is any outlier