Stats 3 - Multiple Explanatory Variables Flashcards
What are you doing when you are fitting a linear model with multiple explanatory variables?
Modelling one response variable as a function of two or more explanatory (predictor) variables in a single linear model.
Example
Looking at the influence of trophic level and ground dwelling on genome size
After fitting your linear model, what is good practice to do?
Run diagnostic plots to examine the linear model, which consist of…
Given that the diagnostic plots look okay, what can you proceed to do?
ANOVA –> To examine how much variance is explained by each term.
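The fit, check, then ANOVA workflow can be sketched as below. This is only an illustration on simulated data: the data frame `dat`, the column names `TrophicLevel`, `GroundDwelling` and `logGS`, and the effect sizes are all assumptions, not the course dataset.

```r
# Simulated stand-in data (hypothetical names and values, not the real dataset)
set.seed(1)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + rnorm(nrow(dat), sd = 0.3)

# Fit a linear model with two explanatory variables (main effects only)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)

# Diagnostic plots: plot() on an lm object draws four panels by default
par(mfrow = c(2, 2))
plot(model)

# ANOVA table: one row per term plus a Residuals row
tab <- anova(model)
print(tab)
```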
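From the ANOVA table, how can you calculate R2?
Calculate ESS (explained sum of squares) = sum of the Sum Sq values for the model terms, excluding the residuals
Calculate TSS (total sum of squares) = ESS + RSS, i.e. the Sum Sq for all terms plus the residual Sum Sq
R2 = ESS / (ESS + RSS) = ESS / TSS
Note - using the same logic you can calculate the R2 for each term of the model (term Sum Sq / TSS) –> to examine which one explains the most variation.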
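A minimal sketch of this calculation, assuming simulated data with hypothetical column names (`TrophicLevel`, `GroundDwelling`, `logGS`). It pulls the Sum Sq column out of the ANOVA table and compares the hand calculation with the R2 that `summary()` reports.

```r
# Simulated stand-in data (hypothetical names and values)
set.seed(2)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + rnorm(nrow(dat), sd = 0.3)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)

tab     <- anova(model)
ss      <- tab[["Sum Sq"]]
n_terms <- nrow(tab) - 1               # the last row is Residuals

ESS <- sum(ss[seq_len(n_terms)])       # explained sum of squares (all terms)
RSS <- ss[nrow(tab)]                   # residual sum of squares
R2  <- ESS / (ESS + RSS)

# Share of the total variation attributed to each term
term_R2 <- setNames(ss[seq_len(n_terms)] / (ESS + RSS),
                    rownames(tab)[seq_len(n_terms)])

print(R2)
print(term_R2)
print(summary(model)$r.squared)        # should agree with R2
```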
How to calculate the overall F-value of the model just from the ANOVA output?
Add up the Sum Sq values of all the model terms and add up their Df's; dividing the first total by the second gives the model mean square. Divide this by the residual mean square (RMsq) to get F.
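The same arithmetic as a short sketch, again on simulated data with hypothetical column names. The result is compared with the F-statistic stored in the `summary()` object.

```r
# Simulated stand-in data (hypothetical names and values)
set.seed(3)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + rnorm(nrow(dat), sd = 0.3)
model <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)

tab <- anova(model)
k   <- nrow(tab) - 1                           # number of model terms

model_SS <- sum(tab[["Sum Sq"]][seq_len(k)])   # combined Sum Sq of the terms
model_Df <- sum(tab[["Df"]][seq_len(k)])       # combined Df of the terms
model_MS <- model_SS / model_Df                # model mean square
resid_MS <- tab[["Mean Sq"]][k + 1]            # residual mean square

F_value <- model_MS / resid_MS
print(F_value)
print(summary(model)$fstatistic[["value"]])    # should agree with F_value
```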
Break down the output produced from a summary of a model.
Outline how you can produce predicted values from the summary output by hand.
- When working with purely categorical data –> we can simply find the ‘mean’ for a group from the summary output by adding the coefficients of the corresponding levels to the intercept.
- When working with any explanatory variable that is continuous, you have to produce a line equation (intercept + slope × value).
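Both cases can be sketched as follows. The data are simulated, and the continuous column `covariate` is a purely hypothetical stand-in for whatever continuous predictor a model contains. Each hand calculation is checked against `predict()`.

```r
# Simulated stand-in data (hypothetical names and values)
set.seed(4)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$covariate <- rnorm(nrow(dat), mean = 5)    # hypothetical continuous predictor
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + 0.1 * dat$covariate +
  rnorm(nrow(dat), sd = 0.3)

# Case 1: purely categorical model -> add the level coefficients to the intercept
catModel <- lm(logGS ~ TrophicLevel + GroundDwelling, data = dat)
b <- coef(catModel)
mean_by_hand <- b[["(Intercept)"]] + b[["TrophicLevelHerbivore"]] +
  b[["GroundDwellingYes"]]
mean_predict <- predict(catModel,
  newdata = data.frame(TrophicLevel = "Herbivore", GroundDwelling = "Yes"))

# Case 2: a continuous variable is present -> build a line equation
contModel <- lm(logGS ~ GroundDwelling + covariate, data = dat)
b2 <- coef(contModel)
intercept_yes <- b2[["(Intercept)"]] + b2[["GroundDwellingYes"]]
slope         <- b2[["covariate"]]
line_by_hand  <- intercept_yes + slope * 5     # prediction at covariate = 5
line_predict  <- predict(contModel,
  newdata = data.frame(GroundDwelling = "Yes", covariate = 5))

print(c(mean_by_hand, unname(mean_predict)))
print(c(line_by_hand, unname(line_predict)))
```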
What is the syntax associated with explanatory variable interactions?
When we want to include interactions, we use * instead of + when writing the model formula.
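In an R formula, `A * B` is shorthand for `A + B + A:B`, i.e. both main effects plus their interaction. A small sketch on simulated data (hypothetical column names) shows that the two spellings fit the same model.

```r
# Simulated stand-in data (hypothetical names and values)
set.seed(5)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + rnorm(nrow(dat), sd = 0.3)

# Shorthand: main effects and interaction in one go
m_star <- lm(logGS ~ TrophicLevel * GroundDwelling, data = dat)

# Longhand: the same terms written out explicitly
m_long <- lm(logGS ~ TrophicLevel + GroundDwelling + TrophicLevel:GroundDwelling,
             data = dat)

print(all.equal(coef(m_star), coef(m_long)))
```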
Difference between main effects and interactions?
Main Effect –> A single explanatory variable and its influence on the response variable
Interaction –> How the effect of one explanatory variable depends on another –> e.g. What difference does ground dwelling (or not) make to the genome size of organisms in the different trophic levels
What should you do if you want to calculate R2 from an ANOVA table with an interaction?
Same procedure as before…
Add up the ESS values of all terms, including the interaction, and divide by (total ESS + RSS)
How does the model summary() output change when interactions are included?
Same as before but this time the estimates for the interactions are included.
Hence, when calculating slopes or group means by hand, you have to add the relevant interaction estimates into the equation.
How to calculate the number of coefficients there are for an interaction between two factors?
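With R's default treatment contrasts, an interaction between a factor with a levels and a factor with b levels adds (a - 1) × (b - 1) coefficients, provided every factor combination occurs in the data. The sketch below checks this on simulated data with hypothetical column names, counting the coefficients whose names contain `:`.

```r
# Simulated stand-in data (hypothetical names and values)
set.seed(6)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + rnorm(nrow(dat), sd = 0.3)

intModel <- lm(logGS ~ TrophicLevel * GroundDwelling, data = dat)

# Interaction coefficients are the ones with ":" in their name
n_interaction <- sum(grepl(":", names(coef(intModel)), fixed = TRUE))

# (levels of factor 1 - 1) * (levels of factor 2 - 1)
n_expected <- (nlevels(dat$TrophicLevel) - 1) * (nlevels(dat$GroundDwelling) - 1)

print(c(counted = n_interaction, expected = n_expected))
```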
How to use the predict function in R to predict response variables using our generated model (categorical data)?
- Set up a new dataframe by creating vectors with the different factor combinations and combine the vectors using the data.frame command
Note - Make sure the names of the columns in the dataframe match the model otherwise it won’t work!
predVals
- Use the predict function, passing the new dataframe as newdata, to get predictions from our model –> using the following code
predVals$predict
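The two steps might look like the sketch below. The names `predVals` and `predVals$predict` follow the card, but the model object `model`, the column names and the simulated data are assumptions made for illustration only.

```r
# Simulated stand-in data and model (hypothetical names and values)
set.seed(8)
dat <- expand.grid(
  TrophicLevel   = c("Carnivore", "Herbivore", "Omnivore"),
  GroundDwelling = c("No", "Yes"),
  replicate      = 1:20,
  stringsAsFactors = TRUE
)
dat$logGS <- 1 + 0.3 * (dat$TrophicLevel == "Herbivore") +
  0.2 * (dat$GroundDwelling == "Yes") + rnorm(nrow(dat), sd = 0.3)
model <- lm(logGS ~ TrophicLevel * GroundDwelling, data = dat)

# Step 1: vectors holding every factor combination, combined with data.frame().
# The column names must match the variable names used in the model formula.
trophic <- rep(levels(dat$TrophicLevel), times = 2)
ground  <- rep(levels(dat$GroundDwelling), each = 3)
predVals <- data.frame(TrophicLevel = trophic, GroundDwelling = ground)

# Step 2: predict the response for each combination
predVals$predict <- predict(model, newdata = predVals)
print(predVals)
```

For a model that contains only these two factors and their interaction, each predicted value is the observed mean of the response in that factor combination.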
What is an ANCOVA?
Specific type of linear model –> combines one categorical variable with one or more continuous variables.
Hence, ANCOVA is a linear model that blends ANOVA and regression.
Why is it specific –> it looks at the influence of the categorical variable on the response while controlling for the continuous variable
How do we do this?
When setting up our model, the order in which the explanatory variables are entered matters, because anova() reports sequential sums of squares. Here Suborder is entered first:
odonModel <- lm(logBW ~ Suborder * logGS, data = odonata)
Break down the summary output of an interaction model
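A sketch of how the four coefficients of such a model can be read, assuming default treatment contrasts. The `odonata` data frame here is simulated with made-up values and only borrows the column names from the model formula above, so the numbers it prints are not results from the real dataset.

```r
# Simulated stand-in for the odonata data (all values are invented)
set.seed(7)
n <- 40
odonata <- data.frame(
  Suborder = factor(rep(c("Anisoptera", "Zygoptera"), each = n)),
  logGS    = rnorm(2 * n, mean = 0, sd = 0.4)
)
odonata$logBW <- ifelse(odonata$Suborder == "Anisoptera",
                        -2.5 + 1.0 * odonata$logGS,
                        -5.5 - 1.0 * odonata$logGS) + rnorm(2 * n, sd = 0.3)

odonModel <- lm(logBW ~ Suborder * logGS, data = odonata)
b <- coef(odonModel)

# (Intercept) and logGS: intercept and slope of the reference level (Anisoptera)
aniso_line <- c(intercept = b[["(Intercept)"]], slope = b[["logGS"]])

# SuborderZygoptera: difference in intercept from the reference level
# SuborderZygoptera:logGS: difference in slope from the reference level
zygo_line <- c(
  intercept = b[["(Intercept)"]] + b[["SuborderZygoptera"]],
  slope     = b[["logGS"]] + b[["SuborderZygoptera:logGS"]]
)

print(rbind(Anisoptera = aniso_line, Zygoptera = zygo_line))
```

Because the model includes the interaction, each suborder gets its own intercept and slope, so these two lines match what separate regressions on each suborder would give.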