Stats 4 - Model Selection Flashcards

1
Q

When creating a model, are all the model terms equally important?

A

No!

Model terms are NOT equally important!

For example, removing an explanatory term always reduces the explanatory power of the model somewhat, but the reduced model may not be significantly worse

2
Q

What is the goal of model selection?

A

To make the model as simple as possible without sacrificing too much explanatory power

“Everything should be made as simple as possible, but no simpler.”

3
Q

Outline the general process of model simplification.

A
  1. Start with the maximal model –> the model that contains everything that might be important –> include all the terms and interactions that seem biologically relevant
  2. Simplify the maximal model towards the null model –> the model stating that nothing is important, i.e. nothing explains the response variable
  3. On the way to the null model, you reach a point somewhere in between where you cannot remove any further terms without making the model significantly worse: this is called the minimum adequate model.
4
Q

Explain the different parts of the attached flow diagram.

A

Model selection is an iterative process

  1. Current model
  2. Make a list of valid terms to drop (you can’t just drop any term)
  3. Remove the least significant term –> the term with the lowest explanatory power (Sum Sq)
  4. This creates a new model
  5. Compare the current model to the new model using anova(Model1, Model2)
  6. Is the new model statistically worse/different? Use an F-test for a linear model, or AIC for a non-linear model

a) No statistical significance –> the new model becomes the current model
b) Statistical significance (not good: a significant reduction in explanatory power) –> keep the term, and remove it from the list of terms that can be dropped

  7. Are there any more valid terms to drop?

Yes –> Continue simplification

No –> Minimum adequate model
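
The loop above can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the variable names (a, b, c, y) and the helper name `protected` are made up for the example, and the 0.05 cut-off follows the rule used elsewhere in these cards. Note that the Sum Sq values from anova() on a single model are sequential, so they can depend on the order of the terms.

```r
# Simulated data (hypothetical): y depends on a and b only; c is pure noise.
set.seed(1)
n <- 200
d <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
d$y <- 2 * d$a - 3 * d$b + rnorm(n)

current   <- lm(y ~ (a + b + c)^2, data = d)  # step 1: current (starting) model
protected <- character(0)                      # terms whose removal was significant

repeat {
  # step 2: valid terms to drop, minus those already shown to matter
  candidates <- setdiff(drop.scope(current), protected)
  if (length(candidates) == 0) break           # step 7: nothing left to drop

  # step 3: candidate with the lowest Sum Sq in the ANOVA table
  ss   <- anova(current)[candidates, "Sum Sq"]
  term <- candidates[which.min(ss)]

  # step 4: new model without that term
  new <- update(current, as.formula(paste(". ~ . -", term)))

  # steps 5-6: F-test comparing the two nested models
  p <- anova(new, current)[2, "Pr(>F)"]
  if (p > 0.05) {
    current <- new                             # 6a: not significantly worse
  } else {
    protected <- c(protected, term)            # 6b: keep the term
  }
}

formula(current)  # the minimum adequate model for this simulated data set
```

In practice you would usually work through these steps by hand, one term at a time, so that you can inspect each comparison.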

5
Q

How do you construct the maximal model?

A

Make a model that includes all the explanatory variables, with all the possible interactions between them

Example:

  • Explanatory variables:

GroundDwelling (Categorical)

Trophic Level (Categorical)

Litter size (offspring produced at one birth) (Continuous)

Body Mass (Continuous)

  • Response variable:

Genome size
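
As a rough sketch of what such a maximal model could look like in R: the data frame below is simulated and its column names (GroundDwelling, TrophicLevel, LitterSize, BodyMass, GenomeSize) and factor levels are assumptions made for illustration, not the real data set. The `*` operator expands to all main effects plus every possible interaction.

```r
# Simulated stand-in for the example data (column names and levels are hypothetical).
set.seed(42)
n <- 120
mammals <- data.frame(
  GroundDwelling = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  TrophicLevel   = factor(sample(c("Carnivore", "Herbivore", "Omnivore"), n, replace = TRUE)),
  LitterSize     = runif(n, 1, 10),
  BodyMass       = rlnorm(n, meanlog = 5)
)
mammals$GenomeSize <- rnorm(n, mean = 3, sd = 0.5)

# Maximal model: all four main effects and all of their interactions.
maximal <- lm(GenomeSize ~ GroundDwelling * TrophicLevel * LitterSize * BodyMass,
              data = mammals)

# List the terms: 4 main effects, 6 two-way, 4 three-way and 1 four-way interaction.
term_labels <- attr(terms(maximal), "term.labels")
length(term_labels)
```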

6
Q

What is a quick way to construct a model that includes only pairwise interactions?

A

To include only the main effects and pairwise interactions, use the following formula

y ~ (a + b + c)^2
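
To see what this formula expands to, you can ask R for its term labels. A small sketch, with y, a, b and c as placeholder names (no data are needed just to inspect a formula):

```r
# Placeholder formulas: y, a, b, c do not need to exist to inspect the terms.
pairwise <- y ~ (a + b + c)^2
full     <- y ~ a * b * c

# Terms in the pairwise model: main effects and two-way interactions only.
pairwise_terms <- attr(terms(pairwise), "term.labels")
pairwise_terms

# The only term the full model has in addition is the three-way interaction.
extra_terms <- setdiff(attr(terms(full), "term.labels"), pairwise_terms)
extra_terms
```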

7
Q

When performing model simplification, what is the rule for the terms that you are allowed to drop?

A

You only want to remove non-significant terms

Rule –> You cannot remove a main effect or an interaction while it is still part of a more complex interaction in the model.

So if we want to keep a complex interaction, we cannot simply remove its constituent main effects or lower-order interactions

Rule of thumb –> Start by removing terms from the bottom of the summary output (the highest-order interactions are listed last)

8
Q

How do you know how much you can simplify your model?

A

Each time you drop a term –> the model gets worse, since the sum of squares that term explained is no longer explained (the explained sum of squares, ESS, accounts for less of the observed variation) –> the remaining variables may compensate for some of the loss of explanatory power

Takeaway message –> The model gets a little worse? It’s okay! –> the tiny amount explained by the removed term is not worth keeping –> it makes the model unnecessarily complicated

But wait a minute! How do we know how small a tiny amount is?

Use the F-test (anova) –> If the F-test shows significance (P <= 0.05) –> there has been a significant reduction in explanatory power –> you should NOT remove the term

But…

If the F-Test shows NO statistical significance (P>0.05) –> then you can proceed to remove the term.
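
A minimal sketch of this comparison, assuming simulated data in which x2 has no real effect on y (all names here are made up for the example):

```r
# Simulated data (hypothetical): only x1 affects y.
set.seed(7)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 + rnorm(n)

bigger  <- lm(y ~ x1 + x2, data = d)   # current model
smaller <- lm(y ~ x1, data = d)        # same model with x2 dropped

# F-test between the two nested models; row 2 holds the test.
cmp <- anova(smaller, bigger)
cmp$RSS                       # residual SS: never lower for the smaller model
p <- cmp[2, "Pr(>F)"]

# Apply the rule from the card: P > 0.05 means the term can be removed.
decision <- if (p > 0.05) "drop x2" else "keep x2"
decision
```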

9
Q

What does the drop.scope function allow you to do?

A

drop.scope is a function in R that tells you which terms you can drop from your model
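
A short sketch of how this might look, using placeholder formulas (y, a, b, c are made-up names; a fitted model can be passed in the same way):

```r
# With all interactions present, only the highest-order term can go.
scope_full <- drop.scope(y ~ a * b * c)
scope_full

# With pairwise interactions only, the three two-way terms are droppable,
# but the main effects are not, because each is still part of an interaction.
scope_pairwise <- drop.scope(y ~ (a + b + c)^2)
scope_pairwise

# Here c is not part of any interaction, so it can be dropped alongside a:b.
scope_mixed <- drop.scope(y ~ a + b + c + a:b)
scope_mixed
```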

10
Q

Without having to completely re-write your model, what is a shortcut you can use to update your model?

A

A shortcut for removing a term from a model is the update function.

It’s like telling R what to change in the model formula on either side of the ~ symbol. The dots in the code (. ~ .) mean ‘use whatever is currently in the response or explanatory variables’

Remember to give your updated model a new name

Newname <- update(…..)

Otherwise you will overwrite your old model and you will NOT be able to compare the two
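
For instance, a sketch with simulated data and made-up names (model1, model2, a, b, y), dropping the a:b interaction:

```r
# Simulated data (hypothetical names).
set.seed(3)
d <- data.frame(a = rnorm(50), b = rnorm(50))
d$y <- d$a + d$b + rnorm(50)

model1 <- lm(y ~ a * b, data = d)

# ". ~ ." keeps the response and explanatory terms as they are;
# "- a:b" then removes the interaction. The result gets a NEW name.
model2 <- update(model1, . ~ . - a:b)

attr(terms(model1), "term.labels")  # still includes a:b
attr(terms(model2), "term.labels")  # main effects only
```

Because model1 still exists, the two models can be compared afterwards.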

11
Q

What should you always do after removing a term from your current model?

A

Run an ANOVA

To compare the Current and New model!

12
Q

Break down the following ANOVA output that compares Model 1 and Model 2.

A
13
Q

What are the two things you should consider when dropping a variable?

What term would you drop from the attached ANOVA output?

A
  1. Can you even drop that variable –> is the interaction/main effect present in any more complex interaction?

Check using the drop.scope function, or use the rule of thumb: start at the bottom of the table

  2. Examine the Sum Sq column of the ANOVA (model) output –> which main effect/interaction has the lowest Sum Sq (least explanatory power)?

What term would you drop?

logBM:Trophiclevel –> lowest Sum Sq
