Lecture 9: Statistical Tests V: ANCOVA, Complex Models & Simplification Flashcards
is anova parametric or nonparametric?
anova is a parametric statistical test
what assumptions are identical to those of ANOVA?
the assumptions in ANOVA and linear regression are identical
why not have many models with just one explanatory variable each? (why ANCOVA and not just many ANOVAs?)
(1) explanatory variables might influence each other
(2) multiple tests on the same data set should be avoided (each extra test raises the chance of a false positive)
ancova meaning:
analysis of covariance
what does ancova combine?
ancova combines elements of linear regression and anova
type of response and explanatory variable in ancova:
response variable in ANCOVA testing is continuous
the explanatory variables include at least one categorical and at least one continuous variable
what is the ANCOVA process:
(1) number of factor levels (or categories) = number of linear regressions
(2) for each of those factor-levels (categories) we estimate the slope and intercept [similar to linear regression]
(3) model simplification (using the principle of parsimony - simpler is better)
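steps (1) and (2) above can be sketched in R as follows. this is only a rough illustration: the data frame d, its columns (age, sex, weight) and the numbers in it are made-up assumptions, not lecture data.

```r
# hypothetical example data (an assumption, not from the lecture)
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0)
d <- data.frame(
  age = rep(1:10, times = 2),
  sex = factor(rep(c("female", "male"), each = 10))
)
d$weight <- ifelse(d$sex == "male", 5 + 2 * d$age, 3 + 1 * d$age) +
  rep(noise, times = 2)

# one linear regression per factor level: each gives an intercept and a slope
per_level <- lapply(split(d, d$sex), function(g) coef(lm(weight ~ age, data = g)))
per_level
```

two factor levels (female, male) give two regressions, so two intercepts and two slopes to estimate.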
principle of parsimony in statistical terms:
a model should be simplified until it is minimally adequate, i.e. it has as few parameters as possible while still explaining the data adequately
first line for model simplification command in R when you have interactive effects between groups (2 slopes, 2 intercepts):
the asterisk between the explanatory variables fits the interaction between them as well as their individual effects (e.g. of age and sex)
(1) > m1 <- aov(response ~ explanatory1 * explanatory2)
you could instead write out the same model with (+) and (:), i.e. explanatory1 + explanatory2 + explanatory1:explanatory2, but the asterisk covers it with less code
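a quick check that the asterisk form and the written-out (+) and (:) form fit the same model. the data frame d and its columns are hypothetical, made up only for this sketch.

```r
# hypothetical example data (an assumption, not from the lecture)
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0)
d <- data.frame(
  age = rep(1:10, times = 2),
  sex = factor(rep(c("female", "male"), each = 10))
)
d$weight <- ifelse(d$sex == "male", 5 + 2 * d$age, 3 + 1 * d$age) +
  rep(noise, times = 2)

# short form: main effects plus interaction
m_star <- aov(weight ~ age * sex, data = d)
# long form: the same terms written out one by one
m_long <- aov(weight ~ age + sex + age:sex, data = d)

# TRUE if both formulas give the same coefficient estimates
same_fit <- isTRUE(all.equal(coef(m_star), coef(m_long)))
same_fit
```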
first line for model simplification command in R when you have additive effects between groups (1 slope [parallel lines], 2 intercepts):
> m2 <- aov(response ~ explanatory1 + explanatory2)
when there is only 1 slope, the groups share the same slope. how can we tell this from the graph?
the fitted lines for the groups are parallel to one another - [parallel lines = just one slope]
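a small sketch of what the additive (plus) model estimates: one shared slope and one intercept per group. the data are hypothetical and deliberately built so that the male line sits exactly 3 units above the female line.

```r
# hypothetical example data (an assumption): same slope, different intercepts
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0)
d <- data.frame(
  age = rep(1:10, times = 2),
  sex = factor(rep(c("female", "male"), each = 10))
)
d$weight <- 3 + 1.5 * d$age + ifelse(d$sex == "male", 3, 0) +
  rep(noise, times = 2)

m2 <- aov(weight ~ age + sex, data = d)
est <- coef(m2)

shared_slope     <- est[["age"]]                             # one slope for both groups
intercept_female <- est[["(Intercept)"]]                     # baseline level
intercept_male   <- est[["(Intercept)"]] + est[["sexmale"]]  # baseline + offset
c(shared_slope, intercept_female, intercept_male)
```

the additive model has three coefficients (baseline intercept, slope, group offset), which is why its fitted lines are always parallel.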
what symbol do we use to connect interactive explanatory variables?
asterisk (*)
e.g. explanatory1 * explanatory2
what symbol do we use to connect additive explanatory variables?
plus (+)
e.g. explanatory1 + explanatory2
how do we simplify our model if we have [1 slope, 1 intercept] or [2 intercepts, 0 slope]?
in this case only one explanatory variable has a significant effect, so you fit a model with that variable alone (a linear regression if it is the continuous variable, a one-way ANOVA if it is the categorical one; the R code is the same)
> m3 <- aov(response ~ explanatory)
what do we do if the graph shows that neither of our explanatory variables has an effect on our response variable?
we simply fit the null model, which contains only an intercept (the overall mean):
> m4 <- aov(weight ~ 1)
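a tiny sketch of what the "~ 1" model estimates. the weight values are made up for illustration.

```r
# hypothetical weights (an assumption, not lecture data)
weight <- c(4.1, 5.3, 4.8, 5.0, 4.6)

# null model: no explanatory variables, only an intercept
m4 <- aov(weight ~ 1)

# the single coefficient is the overall mean of the response
null_intercept <- unname(coef(m4))
c(null_intercept, mean(weight))
```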
because we want the “minimal adequate model”, how can we test whether we can simplify our model, given we start with the more complicated > m1 <- aov(y ~ x1 * x2) model?
you create two models, m1 & m2, where m1 is the more complicated asterisk model and m2 is the less complicated plus model
you fit both of these models and then pass aov models 1 & 2 to the following command
> anova(m1,m2)
if the p-value is bigger than 0.05 we CAN simplify to the plus model, as its explanatory power is not significantly less than that of the asterisk model
if the p-value is smaller than 0.05 we CANNOT simplify the asterisk model to the plus model, as the explanatory power of the simpler model is significantly less
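a sketch of the comparison and of reading the p-value out of the anova() table. the data are hypothetical and built with clearly different slopes per sex, so this is an example of the "cannot simplify" case.

```r
# hypothetical example data (an assumption): slopes differ between the sexes
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0)
d <- data.frame(
  age = rep(1:10, times = 2),
  sex = factor(rep(c("female", "male"), each = 10))
)
d$weight <- ifelse(d$sex == "male", 5 + 2 * d$age, 3 + 1 * d$age) +
  rep(noise, times = 2)

m1 <- aov(weight ~ age * sex, data = d)  # asterisk model: 2 slopes, 2 intercepts
m2 <- aov(weight ~ age + sex, data = d)  # plus model: 1 slope, 2 intercepts

cmp <- anova(m1, m2)            # F-test comparing the two nested models
p_value <- cmp[["Pr(>F)"]][2]   # row 2 holds the test of m2 against m1

# apply the 0.05 rule from the cards above
decision <- if (p_value > 0.05) "simplify to m2" else "keep m1"
decision
```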
SIMPLE model simplification procedure:
(1) fit the maximal model [through graphical slope & intercept interpretation]
(2) start model simplification
(3) stop model simplification when minimal adequate model is reached [do this via > anova(m1,m2) and looking for p > 0.05]
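the three steps above could be chained roughly like this. it is only a sketch under assumptions: keep_simpler() is a hypothetical helper written for this example (not a built-in R function), the data are made up so that both sexes share a slope but differ in intercept, and dropping sex in the second step is just one possible choice of term to test.

```r
# hypothetical example data (an assumption): same slope, different intercepts
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0)
d <- data.frame(
  age = rep(1:10, times = 2),
  sex = factor(rep(c("female", "male"), each = 10))
)
d$weight <- 3 + 1.5 * d$age + ifelse(d$sex == "male", 3, 0) +
  rep(noise, times = 2)

# hypothetical helper: TRUE when the simpler model is not significantly worse
keep_simpler <- function(complex, simple, alpha = 0.05) {
  p <- anova(complex, simple)[["Pr(>F)"]][2]
  p > alpha
}

m1 <- aov(weight ~ age * sex, data = d)  # (1) maximal model
m2 <- aov(weight ~ age + sex, data = d)  # drop the interaction
m3 <- aov(weight ~ age, data = d)        # drop sex as well

# (2) simplify step by step, (3) stop at the first significant loss
final <- if (!keep_simpler(m1, m2)) {
  "m1"
} else if (!keep_simpler(m2, m3)) {
  "m2"
} else {
  "m3"
}
final
```

with this constructed data the interaction contributes nothing and sex clearly matters, so the procedure should stop at m2.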
anova and linear regression are identical except for (X) and the R-code is (Y)
(X) the type of explanatory variable
(Y) identical for both of these tests
what statistical tests are all somewhat the same in their r-code?
one-way-anova, factorial anova, ancova & linear regression
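a short check of the "same R code" point: the same formula can be given to aov() or lm() and the fitted coefficients agree. the data frame d is hypothetical, made up only for this sketch.

```r
# hypothetical example data (an assumption, not from the lecture)
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0)
d <- data.frame(
  age = rep(1:10, times = 2),
  sex = factor(rep(c("female", "male"), each = 10))
)
d$weight <- ifelse(d$sex == "male", 5 + 2 * d$age, 3 + 1 * d$age) +
  rep(noise, times = 2)

fit_aov <- aov(weight ~ age * sex, data = d)  # ANOVA-style call
fit_lm  <- lm(weight ~ age * sex, data = d)   # regression-style call

# both calls fit the same linear model, so the coefficients match
same_coefs <- isTRUE(all.equal(coef(fit_aov), coef(fit_lm)))
same_coefs
```

only the type of explanatory variable in the formula changes between one-way ANOVA (categorical), linear regression (continuous) and ANCOVA (both).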