Stats 3 - Multiple Explanatory Variables Flashcards
What are you doing when you are fitting a linear model with multiple explanatory variables?
Fitting a linear model to data when you have multiple explanatory (predictor) variables.
Example
Looking at the influence of trophic level and ground dwelling on genome size.
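A minimal sketch of such a fit in R. The data are simulated, and the column names (logGS, TrophicLevel, GroundDwelling), level names and effect sizes are hypothetical stand-ins rather than the real dataset.

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)

# Two explanatory variables joined with + (main effects only)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)
coef(model)
```

The fitted model has one intercept (the reference levels of both factors) plus one coefficient for every non-reference level.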

After fitting your linear model, what is good practice to do?
Run diagnostic plots to examine the linear model, which consist of…

Given that the diagnostics plots look okay, what can you proceed to do?
ANOVA –> To examine how much variance is explained by each term.

From the ANOVA table, how can you calculate R2?
Calculate ESS (explained sum of squares) = sum of the Sum Sq values for the model terms
Calculate TSS (total sum of squares) = sum of the Sum Sq values for the terms and the residuals = ESS + RSS
R2 = ESS / (ESS + RSS) = ESS / TSS
Note - using the same logic you can calculate the R2 for each term of the model (term Sum Sq / TSS) –> to examine which one explains the most variation.
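The calculation can be sketched in R by pulling the Sum Sq column out of the ANOVA table. The data are simulated and all variable names are hypothetical.

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)

tab  <- anova(model)
ss   <- tab[["Sum Sq"]]
last <- nrow(tab)            # the last row of the table is Residuals

ESS <- sum(ss[-last])        # explained: all model terms
RSS <- ss[last]              # residual
R2  <- ESS / (ESS + RSS)

# Share of the total variation explained by each term
term_R2 <- setNames(ss[-last] / (ESS + RSS), rownames(tab)[-last])
R2
term_R2
```

The value of R2 should agree with the Multiple R-squared reported by summary(model).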

How to calculate the overall F-value of the model just from the ANOVA output?
Add up the Sum Sq values of all the model terms and divide by the sum of their Df values to get the model mean square, then divide this by the residual mean square (RMsq).
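A short sketch of that arithmetic in R, using simulated data with hypothetical variable names:

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)

tab  <- anova(model)
last <- nrow(tab)                              # Residuals row

# Model mean square: pooled Sum Sq over pooled Df of all terms
model_ms <- sum(tab[["Sum Sq"]][-last]) / sum(tab[["Df"]][-last])
resid_ms <- tab[["Mean Sq"]][last]             # residual mean square

F_model <- model_ms / resid_ms
F_model
```

The result should match the F-statistic printed on the last line of summary(model).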

Break down the output produced from a summary of a model.

Outline how you can produce predicted values from the summary output by hand?
- When working with purely categorical data –> we can simply find the ‘mean’ from the summary output by adding the coefficients of the corresponding levels to the intercept.
- When working with any explanatory variable that is continuous, you have to produce a line equation (intercept + slope * value).
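Both points can be sketched with one small additive model that has a categorical and a continuous explanatory variable. The data are simulated, and the names Suborder, logGS and logBW are used only as hypothetical stand-ins.

```r
set.seed(2)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  Suborder = factor(rep(c("Anisoptera", "Zygoptera"), each = 25)),
  logGS    = runif(50, -0.5, 1)
)
dat$logBW <- -2.5 - 3 * (dat$Suborder == "Zygoptera") +
  0.4 * dat$logGS + rnorm(50, sd = 0.3)

m <- lm(logBW ~ Suborder + logGS, data = dat)
b <- coef(m)

# Prediction for Zygoptera at logGS = 0.5, by hand:
# reference intercept + level difference + slope * value
by_hand <- b[["(Intercept)"]] + b[["SuborderZygoptera"]] + b[["logGS"]] * 0.5

# The same prediction from predict(), as a check
new_row <- data.frame(
  Suborder = factor("Zygoptera", levels = levels(dat$Suborder)),
  logGS    = 0.5
)
check <- predict(m, newdata = new_row)
c(by_hand = by_hand, predict = unname(check))
```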

What is the syntax associated with explanatory variable interactions?
When we want to include interactions we use * instead of + when writing the model: A * B fits both main effects and their interaction (shorthand for A + B + A:B).
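A quick sketch showing that the two ways of writing the formula give the same model. The data are simulated and the variable names are hypothetical.

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)

# * expands to main effects plus the interaction
m_star <- lm(logGS ~ TrophicLevel * GroundDwelling, data = dat)
m_long <- lm(logGS ~ TrophicLevel + GroundDwelling + TrophicLevel:GroundDwelling,
             data = dat)

all.equal(coef(m_star), coef(m_long))
names(coef(m_star))   # interaction coefficients contain a colon
```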

Difference between main effects and interactions?
Main Effect –> Main explanatory variable and its influence on the response variable
Interaction –> interactions between the different explanatory variables –> e.g. What influence does ground dwelling (or not) have on the genome size of organisms in the different trophic levels

What should you do if you want to calculate R2 from an ANOVA table with an interaction?
Same procedure as before…
R2 = sum of the Sum Sq for all terms (including the interaction) / (that sum + RSS)

How does the model summary() output change when interactions are included?
Same as before but this time the estimates for the interactions are included.
Hence, when calculating the slopes or levels for categorical data you would have to factor them into the equation.

How to calculate the number of coefficients there are for interactions between two factors?
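The answer to this card is not preserved here. As a sketch of the standard result under R's default treatment contrasts: two factors with a and b levels give (a - 1) * (b - 1) interaction coefficients, provided every combination of levels occurs in the data. The check below uses simulated data with hypothetical variable names.

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)

n_a <- nlevels(dat$TrophicLevel)     # 3 levels
n_b <- nlevels(dat$GroundDwelling)   # 2 levels
expected <- (n_a - 1) * (n_b - 1)

m <- lm(logGS ~ TrophicLevel * GroundDwelling, data = dat)
# Interaction coefficients are the ones with a colon in their name
observed <- sum(grepl(":", names(coef(m)), fixed = TRUE))
c(expected = expected, observed = observed)
```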

How to use the predict function in R to predict response variables using our generated model (categorical data)?
- Set up a new data frame by creating vectors with the different factor combinations and combine the vectors using the data.frame command
Note - Make sure the names of the columns in the data frame match the names used in the model, otherwise it won’t work!
predVals
- Use the predict function to predict newdata using our model –> using the following code
predVals$predict
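A minimal sketch of both steps, assuming an interaction model fitted to simulated data. The model, the data and the column names are hypothetical; only the object name predVals is taken from the card.

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)
model <- lm(logGS ~ TrophicLevel * GroundDwelling, data = dat)

# Step 1: vectors holding every factor combination, combined with data.frame()
# The column names must match the variable names used in the model formula
TrophicLevel   <- factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 2))
GroundDwelling <- factor(rep(c("No", "Yes"), times = 3))
predVals <- data.frame(TrophicLevel, GroundDwelling)

# Step 2: predict the response for each combination
predVals$predict <- predict(model, newdata = predVals)
print(predVals)
```

For a model with only categorical variables and their interaction, each prediction is the mean of the corresponding group.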

What is an ANCOVA?
Specific type of linear model –> combines one categorical variable with one or more continuous variables.
Hence, ANCOVA is a linear model that blends ANOVA and regression.
Why is it specific –> we look at the influence of the categorical variable on the response while controlling for the continuous variable.
How do we do this?
When setting up our model, we place the explanatory variable we are primarily interested in (here the categorical one, Suborder) first (Order matters!)
odonModel <- lm(logBW ~ Suborder * logGS, data = odonata)
Break down this summary output of an interaction model.


Break down this ANOVA output from the following model.
odonModel


How to use the predict function in R to predict response variables using our generated model (continuous data)?
- Set up vectors for a new data frame with all the corresponding categorical values and continuous values
- Combine all the vectors into a new data frame called predVals –> NOTE make sure the names of the columns match the names used in the model
- Apply the predict function –> Note that exp() is put in front of the predict function to back-transform the logged predictions to the original scale (this assumes natural logs)
- print(predVals) –> to see the answers
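The steps above can be sketched as follows. The data frame odonata_sim is simulated and is only a hypothetical stand-in for the real odonata data; the chosen logGS values are arbitrary.

```r
set.seed(42)
# Simulated stand-in for the odonata data (hypothetical values)
odonata_sim <- data.frame(
  Suborder = factor(rep(c("Anisoptera", "Zygoptera"), each = 30)),
  logGS    = runif(60, -0.5, 1)
)
odonata_sim$logBW <- ifelse(odonata_sim$Suborder == "Anisoptera", -2.4, -5.8) +
  ifelse(odonata_sim$Suborder == "Anisoptera", 1.0, -0.3) * odonata_sim$logGS +
  rnorm(60, sd = 0.4)

odonModel <- lm(logBW ~ Suborder * logGS, data = odonata_sim)

# Vectors for the new data frame: each suborder at two genome sizes
Suborder <- factor(rep(c("Anisoptera", "Zygoptera"), each = 2))
logGS    <- rep(c(0, 0.5), times = 2)
predVals <- data.frame(Suborder, logGS)

# exp() converts the predicted natural-log body weights back to the original scale
predVals$predict <- exp(predict(odonModel, newdata = predVals))
print(predVals)
```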

Difference between additive model and interaction model?

When constructing an additive model what do we assume?
Context –> Categorical and continuous explanatory variable

When constructing an interaction model what do we assume?
Context –> Categorical and continuous explanatory variable
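The answers to these cards are not preserved here. As a sketch of the usual interpretation: with one categorical and one continuous variable, an additive model assumes the same slope for every level (parallel lines with different intercepts), while an interaction model lets the slope differ between levels. The data below are simulated and the variable names are hypothetical.

```r
set.seed(5)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  Suborder = factor(rep(c("Anisoptera", "Zygoptera"), each = 30)),
  logGS    = runif(60, -0.5, 1)
)
dat$logBW <- -2.5 - 3 * (dat$Suborder == "Zygoptera") +
  ifelse(dat$Suborder == "Anisoptera", 1.0, -0.3) * dat$logGS +
  rnorm(60, sd = 0.4)

# Additive: one shared slope, a separate intercept per level
add_mod <- lm(logBW ~ Suborder + logGS, data = dat)
# Interaction: separate intercept AND separate slope per level
int_mod <- lm(logBW ~ Suborder * logGS, data = dat)

coef(add_mod)   # intercept, level difference, one slope
coef(int_mod)   # as above plus a slope difference (Suborder:logGS)

# F-test of whether the extra slope term improves the fit
anova(add_mod, int_mod)
```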

When interpreting coefficients from a model summary with continuous/categorical data what should you remember?
You have a reference intercept and a reference slope.
All the other intercept values and slope values are given as differences from these baseline values.
So in this case we have mass and three different climate levels –> so we can set up three different equations (one intercept and one slope per climate level) to measure the impact of each factor level and mass on metabolism
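A sketch of how the three equations come out of the coefficients, assuming an interaction model. The data are simulated, and the level names Cold, Temperate and Tropical are hypothetical.

```r
set.seed(3)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  Climate = factor(rep(c("Cold", "Temperate", "Tropical"), each = 20)),
  Mass    = runif(60, 1, 10)
)
dat$Metabolism <- 2 + 0.5 * dat$Mass +
  1.0 * (dat$Climate == "Temperate") +
  0.3 * dat$Mass * (dat$Climate == "Tropical") + rnorm(60, sd = 0.5)

m <- lm(Metabolism ~ Climate * Mass, data = dat)
b <- coef(m)

# Reference level (Cold): baseline intercept and slope as printed
# Other levels: baseline plus that level's difference
intercepts <- c(
  Cold      = b[["(Intercept)"]],
  Temperate = b[["(Intercept)"]] + b[["ClimateTemperate"]],
  Tropical  = b[["(Intercept)"]] + b[["ClimateTropical"]]
)
slopes <- c(
  Cold      = b[["Mass"]],
  Temperate = b[["Mass"]] + b[["ClimateTemperate:Mass"]],
  Tropical  = b[["Mass"]] + b[["ClimateTropical:Mass"]]
)

# One line per climate level: Metabolism = intercept + slope * Mass
rbind(intercepts, slopes)
```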

Should you report the multiple R2 or the adjusted R2?
Adjusted, because it includes a penalty for the number of terms (degrees of freedom) used by the model.

Difference between ANCOVA and multiple linear regression?
Very similar but differences arise in the details
Which one you choose depends on which explanatory variable you would like to focus on!
ANCOVA –> you are primarily interested in the effect of the categorical variable –> can be used for additive and interaction models
Multiple linear regression –> you are primarily interested in the effect of the continuous variable
How can we distinguish between these two in R?
The ORDER we put the variables in when we set up the linear model
Order in R matters –> anova() uses sequential sums of squares, so the first variable is assigned all the variation it can explain and later variables only get what is left over –> This is shown by the ANOVA tables below –> the Climate sum of squares changes depending on whether it is first or last
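The effect of term order can be sketched with simulated data in which Mass and Climate are deliberately correlated (with perfectly balanced, uncorrelated variables the order makes no difference). All names and values are hypothetical.

```r
set.seed(7)
# Simulated stand-in data: unbalanced groups, Mass depends on Climate
Climate    <- factor(rep(c("Cold", "Temperate", "Tropical"), times = c(10, 20, 30)))
Mass       <- rnorm(60, mean = 2 * as.numeric(Climate), sd = 1)
Metabolism <- 2 + 0.5 * Mass + 0.3 * as.numeric(Climate) + rnorm(60, sd = 0.5)
dat <- data.frame(Climate, Mass, Metabolism)

# Same two terms, different order
climate_first <- anova(lm(Metabolism ~ Climate + Mass, data = dat))
climate_last  <- anova(lm(Metabolism ~ Mass + Climate, data = dat))

ss_first <- climate_first["Climate", "Sum Sq"]
ss_last  <- climate_last["Climate", "Sum Sq"]
c(Climate_first = ss_first, Climate_last = ss_last)
```

The total of the Sum Sq column is the same in both tables; only the way it is split between the terms changes.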

How to calculate n (sample size) from Df?
Add up the Df of all the terms and the residual Df, then add 1 (for the intercept).
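A quick check of this rule in R, using simulated data with hypothetical variable names:

```r
set.seed(1)
# Simulated stand-in data (hypothetical names and values)
dat <- data.frame(
  TrophicLevel   = factor(rep(c("Carnivore", "Herbivore", "Omnivore"), each = 20)),
  GroundDwelling = factor(rep(c("No", "Yes"), times = 30))
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.5 * (dat$GroundDwelling == "Yes") + rnorm(60, sd = 0.4)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)

tab <- anova(model)
# Term Df + residual Df, plus 1 for the intercept
n_from_df <- sum(tab[["Df"]]) + 1
c(n_from_df = n_from_df, n_rows = nrow(dat))
```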