Stats 3 - Multiple Explanatory Variables Flashcards

1
Q

What are you doing when you are fitting a linear model with multiple explanatory variables?

A

Modelling a single response variable as a function of two or more explanatory (predictor) variables at once.

Example

Looking at the influence of trophic level and ground dwelling on genome size
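
A minimal sketch of such a fit in R. The genome size data from the card is not available here, so the built-in warpbreaks data set is used as a stand-in (an assumption): breaks is the response, and wool and tension are two explanatory factors.

```r
# Stand-in data: warpbreaks ships with R and is NOT the data set from the cards.
# breaks is the response; wool (2 levels) and tension (3 levels) are both factors.
model <- lm(breaks ~ wool + tension, data = warpbreaks)

# One intercept plus one coefficient per non-reference factor level.
print(coef(model))
```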

2
Q

After fitting your linear model, what is good practice?

A

Run diagnostic plots to check the linear model, which consist of…
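
As a rough sketch, calling plot() on a fitted lm object draws R's default set of four diagnostic panels. The model below uses the built-in warpbreaks data as a stand-in for the course data (an assumption).

```r
# Stand-in model on built-in data; replace with your own fitted model.
model <- lm(breaks ~ wool + tension, data = warpbreaks)

par(mfrow = c(2, 2))  # 2 x 2 grid so the four default panels share one page
plot(model)           # default diagnostic plots for an lm object
```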

3
Q

Given that the diagnostic plots look okay, what can you proceed to do?

A

ANOVA –> To examine how much variance is explained by each term.
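
A short sketch of this step, again assuming the built-in warpbreaks data as a stand-in: anova() returns a table with one row per model term plus a Residuals row.

```r
# Stand-in model on built-in data (not the course data).
model <- lm(breaks ~ wool + tension, data = warpbreaks)

tab <- anova(model)  # columns include Df, Sum Sq, Mean Sq, F value
print(tab)
```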

4
Q

From the ANOVA table, how can you calculate R2?

A

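Calculate ESS (explained sum of squares) = sum of the Sum Sq values for all the model terms

Calculate TSS (total sum of squares) = sum of the Sum Sq values for the terms and the residuals (ESS + RSS)

R2 = ESS / TSS = ESS / (ESS + RSS)

Note - using this same logic you can calculate the R2 for each term of the model (that term's Sum Sq / TSS) –> to examine which one explains the most variation.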
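
The arithmetic can be checked in R. This is a sketch on the built-in warpbreaks data (a stand-in, not the course data); the variable names ess, rss and tss are chosen here for illustration.

```r
# Stand-in model on built-in data.
model <- lm(breaks ~ wool + tension, data = warpbreaks)
tab <- anova(model)

ss  <- tab[["Sum Sq"]]
rss <- ss[length(ss)]   # the Residuals row is the last row
ess <- sum(ss) - rss    # Sum Sq of all model terms
tss <- ess + rss

r2 <- ess / tss

# Share of the total variation attributed to each term.
r2_per_term <- ss[-length(ss)] / tss
names(r2_per_term) <- rownames(tab)[-nrow(tab)]

print(r2)
print(r2_per_term)
```

The value of r2 should agree with the Multiple R-squared reported by summary(model).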

5
Q

How do you calculate the overall F-value of the model just from the ANOVA output?

A

Add up the Sum Sq values for all the terms and divide by the sum of their Df to get the model mean square, then divide that by the residual mean square (the Mean Sq of the Residuals row).
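
A sketch of that calculation, assuming the built-in warpbreaks data as a stand-in; the result should match the F-statistic at the bottom of summary(model).

```r
# Stand-in model on built-in data.
model <- lm(breaks ~ wool + tension, data = warpbreaks)
tab <- anova(model)
k <- nrow(tab)                          # index of the Residuals row

model_ss <- sum(tab[["Sum Sq"]][-k])    # combined Sum Sq of all terms
model_df <- sum(tab[["Df"]][-k])        # combined Df of all terms
resid_ms <- tab[["Mean Sq"]][k]         # residual mean square

f_overall <- (model_ss / model_df) / resid_ms
print(f_overall)
```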

6
Q

Break down the output produced from a summary of a model.

A
7
Q

Outline how you can produce predicted values from the summary output by hand?

A
  1. When working with purely categorical data –> we can find the fitted ‘mean’ for a group from the summary output by adding the intercept to the estimates for the corresponding levels.
  2. When working with any explanatory variable that is continuous, you have to build a line equation (intercept + slope × value).
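
Both cases can be sketched in R. The data sets are built-in stand-ins (warpbreaks for the categorical case, iris for the mixed case), and the chosen levels and the value 4 are arbitrary illustrations.

```r
# Case 1: categorical only (warpbreaks stand-in).
cat_model <- lm(breaks ~ wool + tension, data = warpbreaks)
b <- coef(cat_model)
# Fitted mean for wool B at tension H = intercept + woolB + tensionH.
mean_B_H <- b[["(Intercept)"]] + b[["woolB"]] + b[["tensionH"]]

# Case 2: one factor plus one continuous variable (iris stand-in).
mix_model <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
m <- coef(mix_model)
# Line for versicolor: (intercept + versicolor offset) + slope * Petal.Length.
pred_versicolor_4 <- m[["(Intercept)"]] + m[["Speciesversicolor"]] +
  m[["Petal.Length"]] * 4

print(c(mean_B_H, pred_versicolor_4))
```

Each hand calculation should agree with predict() called on a one-row data frame holding the same values.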
8
Q

What is the syntax associated with explanatory variable interactions?

A

To include interactions, use * instead of + between the explanatory variables when writing the model formula.
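
A brief sketch of the two formulas side by side, using the built-in warpbreaks data as a stand-in. In R's formula syntax, a * b is shorthand for a + b + a:b.

```r
# Additive model: main effects only.
additive <- lm(breaks ~ wool + tension, data = warpbreaks)

# Interaction model: main effects plus the wool:tension interaction.
interact <- lm(breaks ~ wool * tension, data = warpbreaks)

# The same interaction model written out in full.
expanded <- lm(breaks ~ wool + tension + wool:tension, data = warpbreaks)

print(names(coef(interact)))
```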

9
Q

Difference between main effects and interactions?

A

Main effect –> the influence of a single explanatory variable, on its own, on the response variable

Interaction –> the effect of one explanatory variable depends on the level of another –> e.g. does the effect of ground dwelling (or not) on genome size differ between the trophic levels?

10
Q

What should you do if you want to calculate R2 from an ANOVA table with an interaction?

A

Same procedure as before…

R2 = sum of the Sum Sq values for all terms (main effects and interaction) / (that same sum + RSS)

11
Q

How does the model summary() output change when interactions are included?

A

Same as before but this time the estimates for the interactions are included.

Hence, when calculating slopes or group means by hand you have to add the relevant interaction estimates into the equation.

12
Q

How do you calculate the number of coefficients for an interaction between two factors?

A
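
For two factors with a and b levels, the standard count is (a - 1) × (b - 1) interaction coefficients. A quick check in R, using the built-in warpbreaks data as a stand-in (an assumption; it has a 2-level and a 3-level factor):

```r
a <- nlevels(warpbreaks$wool)      # levels of the first factor
b <- nlevels(warpbreaks$tension)   # levels of the second factor

expected <- (a - 1) * (b - 1)

# Count the interaction coefficients actually fitted (their names contain ":").
model <- lm(breaks ~ wool * tension, data = warpbreaks)
fitted_count <- sum(grepl(":", names(coef(model)), fixed = TRUE))

print(c(expected = expected, fitted = fitted_count))
```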
13
Q

How do you use the predict function in R to predict the response variable from a fitted model (categorical data)?

A
  1. Set up a new dataframe by creating vectors with the different factor combinations and combine the vectors using the data.frame command

Note - Make sure the names of the columns in the dataframe match the variable names used in the model, otherwise predict will not work!

predVals

  2. Use the predict function to predict values for the new data using our model –> using the following code

predVals$predict
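
The two steps might look like the sketch below. The original code on this card is incomplete, so this is an illustration on the built-in warpbreaks data (a stand-in), reusing only the predVals name from the card; the model and factor names are assumptions.

```r
# Stand-in interaction model on built-in data.
model <- lm(breaks ~ wool * tension, data = warpbreaks)

# Step 1: vectors covering every factor combination, combined with data.frame().
# Column names must match the variable names in the model formula.
wool    <- rep(c("A", "B"), times = 3)
tension <- rep(c("L", "M", "H"), each = 2)
predVals <- data.frame(wool = wool, tension = tension)

# Step 2: predict the response for each row of the new data.
predVals$predict <- predict(model, newdata = predVals)
print(predVals)
```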

14
Q

What is an ANCOVA?

A

Specific type of linear model –> combines one categorical variable with one or more continuous variables.

Hence, ANCOVA is a linear model that blends ANOVA and regression

Why is it specific –> it looks at the influence of the categorical variable on the response while controlling for the continuous variable

How do we do this?

When setting up our model - we place the categorical explanatory variable first (Order matters!)

odonModel <- lm(logBW ~ Suborder * logGS, data = odonata)

15
Q

Break down this summary output of an interaction model

A
16
Q

Break down this ANOVA output from the following model.

odonModel

A
17
Q

How do you use the predict function in R to predict the response variable from a fitted model (continuous data)?

A
  1. Set up vectors for a new dataframe with all the corresponding categorical values and continuous values
  2. Combine all the vectors into a new dataframe called predVals –> NOTE make sure the names of the columns match the names used in the model
  3. Apply the predict function –> note that exp() is wrapped around predict() to convert the logged predictions back to the original scale
  4. print(predVals) –> to see the answers
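
The steps above can be sketched as follows. The odonata data is not available here, so the built-in iris data stands in (an assumption), with logged columns created to mirror a log-log model; the column names logSL and logPL and the three petal lengths are made up for illustration.

```r
# Stand-in data: log both variables so the model is fitted on the log scale.
iris2 <- transform(iris, logSL = log(Sepal.Length), logPL = log(Petal.Length))
model <- lm(logSL ~ Species * logPL, data = iris2)

# Steps 1-2: vectors for the factor and the (logged) continuous variable,
# combined into predVals with column names matching the model.
Species <- c("setosa", "versicolor", "virginica")
logPL   <- log(c(1.5, 4, 5.5))
predVals <- data.frame(Species = Species, logPL = logPL)

# Step 3: predict on the log scale, then exp() to return to the original scale.
predVals$predict <- exp(predict(model, newdata = predVals))

# Step 4: inspect the result.
print(predVals)
```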
18
Q

Difference between additive model and interaction model?

A
19
Q

When constructing an additive model, what do we assume?

Context –> Categorical and Continuous explanatory variable

A
20
Q

When constructing an interaction model, what do we assume?

Context –> Categorical and Continuous explanatory variable

A
21
Q

When interpreting coefficients from a model summary with continuous/categorical data what should you remember?

A

You have a reference intercept and a reference slope

All the other intercept and slope values are given relative to the baseline values.

So in this case we have mass and three different climate levels –> so we can set up three different equations, one per climate level, each giving metabolism as a function of mass
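
A sketch of how the per-level equations are assembled from the coefficients. The metabolism data is not available here, so the built-in iris data stands in (an assumption): Species plays the role of climate and Petal.Length the role of mass.

```r
# Stand-in interaction model: one factor, one continuous variable.
model <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
b <- coef(model)

# Reference level (setosa): intercept and slope are read off directly.
int_setosa   <- b[["(Intercept)"]]
slope_setosa <- b[["Petal.Length"]]

# Another level (versicolor): add its differences to the reference values.
int_versicolor   <- b[["(Intercept)"]] + b[["Speciesversicolor"]]
slope_versicolor <- b[["Petal.Length"]] + b[["Speciesversicolor:Petal.Length"]]

print(c(intercept = int_versicolor, slope = slope_versicolor))
```

With a full interaction, each level's line should match a separate regression fitted to that level alone, which gives a handy check on the arithmetic.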

22
Q

Should you report the multiple R2 or the adjusted R2?

A

Adjusted R2, because it penalises the model for the degrees of freedom it uses, whereas multiple R2 never decreases when a term is added.
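
The penalty can be written as adjusted R2 = 1 - (1 - R2) × (n - 1) / (n - p - 1), where n is the sample size and p the number of coefficients excluding the intercept. A quick check on the built-in warpbreaks data (a stand-in):

```r
# Stand-in model on built-in data.
model <- lm(breaks ~ wool + tension, data = warpbreaks)
s <- summary(model)

n <- nrow(warpbreaks)          # sample size
p <- length(coef(model)) - 1   # coefficients, not counting the intercept

adj_r2 <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)
print(c(multiple = s$r.squared, adjusted = adj_r2))
```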

23
Q

Difference between ANCOVA and Multiple linear regression?

A

Very similar but differences arise in the details

Which one you choose depends on which explanatory variable you would like to focus on!

ANCOVA –> you are primarily interested in the effect of the categorical variable –> it can be used for both additive and interaction models

Multiple Linear regression –> you are interested in the effect of the continuous variable

How can we distinguish between these two in R?

The ORDER we put them in when we set up the linear model

Order matters in R –> the ANOVA table uses sequential sums of squares, so the first variable is given all the variation it can explain and the later variable only gets what is left (the coefficients in the summary are the same either way) –> In the ANOVA tables for the two orderings, the Climate sum of squares changes depending on whether it is first or last
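
The effect of ordering can be seen in a small sketch. The climate data is not available here, so the built-in iris data stands in (an assumption): Species is the factor and Petal.Length the continuous variable.

```r
# Same two explanatory variables, entered in opposite orders.
factor_first     <- anova(lm(Sepal.Length ~ Species + Petal.Length, data = iris))
continuous_first <- anova(lm(Sepal.Length ~ Petal.Length + Species, data = iris))

# Sum Sq credited to Species when it is entered first versus last.
ss_first <- factor_first["Species", "Sum Sq"]
ss_last  <- continuous_first["Species", "Sum Sq"]

print(c(first = ss_first, last = ss_last))
```

The residual sum of squares is identical in both tables; only the split between the two terms changes.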

24
Q

How do you calculate n (sample size) from Df?

A

Add up the Df for all the terms and the residual Df, then add 1 (for the intercept).
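
A quick check of this rule on the built-in warpbreaks data (a stand-in for the course data):

```r
# Stand-in model on built-in data.
model <- lm(breaks ~ wool + tension, data = warpbreaks)
tab <- anova(model)

# Term Df + residual Df + 1 for the intercept.
n <- sum(tab[["Df"]]) + 1
print(n)
```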