Stats 4 - Model Selection Flashcards

Question 1

Q

When creating a model, are all the model terms equally important?

Answer

A

No!

Model terms are NOT equally important!

For example, removing an explanatory term always reduces the explanatory power of the model to some degree, but the reduction may not make the model significantly worse

Question 2

Q

What is the goal of model selection?

Answer

A

To make the model as simple as possible without sacrificing too much explanatory power

“Everything should be made as simple as possible, but no simpler.”

Question 3

Q

Outline the general process of Model simplification.

Answer

A

1. Start with the maximal model –> the model that contains everything that might be important –> include all the terms & interactions that seem biologically relevant
2. Simplify the maximal model towards the null model –> the model which states that nothing is important, i.e. nothing explains the response variable
3. On the journey to the null model, we reach a point somewhere in between where we can’t remove any further terms without making the model significantly worse: this is called the minimum adequate model.
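
A minimal sketch of the two end points in R, assuming simulated data. The names y, a and b are placeholders and are not part of the original card.

```r
# Simulated data: y depends on a; b is a two-level factor with no real effect.
set.seed(1)
d <- data.frame(a = rnorm(40), b = gl(2, 20, labels = c("no", "yes")))
d$y <- 2 + 1.5 * d$a + rnorm(40)

maximal <- lm(y ~ a * b, data = d)  # main effects plus their interaction
null    <- lm(y ~ 1, data = d)      # intercept only: nothing explains y

# The null model's single coefficient is simply the mean of the response.
coef(null)
mean(d$y)
```

The minimum adequate model lies somewhere between these two fits.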

Question 4

Q

Explain the different parts of the attached flow diagram.

Answer

A

Model selection is an iterative process

1. Start with the current model
2. Make a list of valid terms to drop (you can’t just drop any term)
3. Remove the least significant term –> the term with the lowest explanatory power (Sum Sq)
4. This creates a new model
5. Compare the current model to the new model using anova(Model1, Model2)
6. Is the new model statistically worse/different?

Use an F-test for linear models and AIC for non-linear models

a) No statistical significance –> the new model becomes the new current model
b) Statistical significance (not good: a significant reduction in explanatory power) –> keep the term and remove it from the list of terms that can be dropped

7. Are there any more valid terms to drop?

Yes –> Continue simplification

No –> You have reached the minimum adequate model
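
The iterative process above can be sketched as a loop in R. This is only an illustration for linear models, assuming simulated data with placeholder names (y, a, b, c) and a 0.05 cut-off. It is not the exact code behind the flow diagram.

```r
# Simulated data: only a and b truly affect y.
set.seed(42)
n <- 120
d <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
d$y <- 1 + 2 * d$a + 1.5 * d$b + rnorm(n)

current <- lm(y ~ a * b * c, data = d)  # maximal model
blocked <- character(0)                 # terms we tried to drop but must keep

repeat {
  # Valid terms to drop, excluding those already shown to matter.
  candidates <- setdiff(drop.scope(current), blocked)
  if (length(candidates) == 0) break    # minimum adequate model reached

  # Least significant candidate: smallest "Sum of Sq" in the drop1() table.
  tab  <- drop1(current, scope = candidates)
  term <- candidates[which.min(tab[candidates, "Sum of Sq"])]

  # Create the new model without that term and compare with an F-test.
  new <- update(current, as.formula(paste(". ~ . -", term)))
  p   <- anova(new, current)[2, "Pr(>F)"]

  if (p > 0.05) {
    current <- new                      # not significantly worse: accept
  } else {
    blocked <- c(blocked, term)         # significantly worse: keep the term
  }
}

formula(current)
```

Each pass either removes a term or marks it as one to keep, so the loop always terminates.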

Question 5

Q

How do you construct the maximal model?

Answer

A

Make a model that includes all the explanatory variables together with all their possible interactions

Example:

Explanatory variables:

GroundDwelling (Categorical)

Trophic Level (Categorical)

Litter size (offspring produced at one birth) (Continuous)

Body Mass (Continuous)

Response variable:

Genome size
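
As a rough sketch, the maximal model for this example could be written with the * operator, which expands to every main effect and every interaction. The column names below are assumed spellings of the variables listed above, not names taken from a real data set.

```r
# Hypothetical column names standing in for the variables on this card.
maximal_formula <- GenomeSize ~ GroundDwelling * TrophicLevel * LitterSize * BodyMass

# List every term the formula expands to (no data needed for this step).
term_labels <- attr(terms(maximal_formula), "term.labels")
term_labels

# 4 main effects + 6 two-way + 4 three-way + 1 four-way interaction = 15 terms
length(term_labels)
```

Fitting it would then be a call such as lm(maximal_formula, data = ...) on a data frame that actually contains these columns.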

Question 6

Q

What is a quick way to construct a model that only includes pairwise interactions?

Answer

A

To include the main effects plus only the pairwise interactions, use the following formula

y ~ (a + b + c)^2
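
To check what this shorthand expands to, you can ask R for the term labels of the formula. A small sketch, with y, a, b and c as placeholder names:

```r
# Expand the ^2 shorthand: main effects plus all two-way interactions.
pairwise_terms <- attr(terms(y ~ (a + b + c)^2), "term.labels")
pairwise_terms
# "a"   "b"   "c"   "a:b" "a:c" "b:c"

# For comparison, a * b * c also includes the three-way interaction a:b:c.
full_terms <- attr(terms(y ~ a * b * c), "term.labels")
setdiff(full_terms, pairwise_terms)
# "a:b:c"
```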

Question 7

Q

When performing model simplification, what is the rule for the terms that you are allowed to drop?

Answer

A

Obviously, you only want to remove the non-significant terms

Rule –> You cannot remove a main effect or an interaction while it is still part of a more complex interaction in the model.

So if we have a complex interaction which we want to keep, we cannot simply remove its constituent main effects/interactions

Rule of Thumb –> Start by removing terms from the bottom of summary output

Question 8

Q

How do you know how much you can simplify your model?

Answer

A

Each time you drop a term –> the model gets worse, since the sum of squares that term accounted for is no longer explained (ESS explains less of the observed variation) –> the remaining variables may partly compensate for the loss of explanatory power

Takeaway message –> Model gets a little worse? It’s okay! –> the tiny amount explained by the removed term is not worth it –> keeping it makes the model unnecessarily complicated

But wait a minute?!?!? How do we know how much a tiny amount is???

Use the F-Test (anova) –> If the F-Test shows significance (P<=0.05) –> there has been a significant reduction in explanatory power –> you should NOT remove the term

But…

If the F-Test shows NO statistical significance (P>0.05) –> then you can proceed to remove the term.
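
A minimal sketch of this comparison in R, assuming simulated data where the term being dropped (b) has no real effect. The names y, a and b are placeholders.

```r
# Simulated data: y depends on a only.
set.seed(7)
d <- data.frame(a = rnorm(60), b = rnorm(60))
d$y <- 3 + 2 * d$a + rnorm(60)

bigger  <- lm(y ~ a + b, data = d)  # model before dropping b
smaller <- lm(y ~ a, data = d)      # model after dropping b

# F-test: is the smaller model significantly worse than the bigger one?
comparison <- anova(smaller, bigger)
comparison

# The residual sum of squares never decreases when a term is dropped.
comparison$RSS

# Apply the 0.05 rule from this card.
p_value <- comparison[2, "Pr(>F)"]
if (p_value > 0.05) "drop the term" else "keep the term"
```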

Question 9

Q

What does the drop.scope function allow you to do?

Answer

A

The drop.scope function is a function in R that tells you which terms you can validly drop from your model
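
A short sketch of how this looks in practice, using R's drop.scope from the stats package on a formula with placeholder names (y, a, b, c):

```r
# Which terms may be dropped from y ~ a + b + c + a:b ?
droppable <- drop.scope(y ~ a * b + c)
droppable
# c and a:b are listed; a and b are not, because they are
# still contained in the a:b interaction.
```

It also accepts a fitted model, e.g. drop.scope(model).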

Question 10

Q

Without having to completely re-write your model, what is a shortcut you can use to update your model?

Answer

A

Use the update function as a shortcut for removing a term from your model.

It’s like telling R what to change in the model formula on either side of the ~ symbol. The dots in the code (. ~ .) mean ‘use whatever is currently in the response or explanatory variables’

Remember to give your updated model a new name

Newname <- update(…..)

Otherwise you will overwrite your old model and you will NOT be able to compare the two
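
As an illustration, here is a small sketch of update on simulated data. The names model1, model2, y, a and b are placeholders chosen for this example.

```r
# Simulated data for a small two-variable model.
set.seed(3)
d <- data.frame(a = rnorm(50), b = rnorm(50))
d$y <- 1 + d$a + rnorm(50)

model1 <- lm(y ~ a * b, data = d)

# ". ~ ." keeps the current response and terms; "- a:b" removes the interaction.
model2 <- update(model1, . ~ . - a:b)

formula(model1)  # y ~ a * b
formula(model2)  # y ~ a + b

# Both models still exist, so they can be compared.
anova(model2, model1)
```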