Statistics Weeks 5-11 Flashcards
what are the 3 stages to interpreting SPSS output from a two-way factorial ANOVA
- the ANOVA itself - the Tests of Between-Subjects Effects table
- if a main effect is significant AND that IV has more than 2 levels, then check the post hoc results
- if the interaction is significant, THEN follow up with profile plots, interpreting the main effect of each IV's levels and their interaction (parallel lines indicate no interaction); a sketch of the analysis follows this card
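A minimal sketch of the same two-way factorial ANOVA outside SPSS, assuming Python with pandas and statsmodels (the data and column names are made up):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Made-up 2x2 design with 3 scores per cell.
df = pd.DataFrame({
    "score": [4, 5, 6, 7, 3, 4, 8, 9, 5, 6, 7, 8],
    "ivA":   ["a1"] * 6 + ["a2"] * 6,
    "ivB":   (["b1"] * 3 + ["b2"] * 3) * 2,
})

# 'C(ivA) * C(ivB)' expands to both main effects plus their interaction.
model = ols("score ~ C(ivA) * C(ivB)", data=df).fit()
# SPSS defaults to Type III sums of squares; Type II is shown for simplicity.
print(sm.stats.anova_lm(model, typ=2))
```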
Assumptions of two-way independent ANOVAs
- normality
- Homogeneity of variance (variance in DV should be equivalent across conditions) (tested with Levene's test; no correction available - sketch after this list)
- Independence of observations
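A minimal sketch of Levene's test, assuming SciPy (made-up scores):

```python
from scipy import stats

group1 = [4, 5, 6, 7, 5]
group2 = [3, 4, 8, 9, 6]

# p > .05 -> variances are equivalent -> homogeneity assumption holds.
stat, p = stats.levene(group1, group2)
print(stat, p)
```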
non-parametric equivalent for factorial ANOVAs
there isn’t one.
BUT factorial ANOVAs are really robust, so only serious violations would be a problem
diff between partial eta squared and eta squared,
and why partial eta squared is used in factorial two-way ANOVA
eta squared is SSM/SST, which in one-way ANOVAs is the same as SSM/(SSM+SSR)
But in two-way ANOVAs this is not true, because SST (the total sum of squares) includes the variance from all effects (both IVs and their interaction), whereas partial eta squared (SSeffect/(SSeffect + SSR)) involves only one effect
i.e. because there are multiple effects in a factorial design, a separate measure for each individual effect is necessary (worked example below)
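A worked example with made-up sums of squares, showing why the two measures differ (plain Python arithmetic):

```python
# Hypothetical SS values from a two-way ANOVA output.
ss_a, ss_b, ss_axb, ss_resid = 30.0, 20.0, 10.0, 40.0
ss_total = ss_a + ss_b + ss_axb + ss_resid  # 100.0

eta_sq_a = ss_a / ss_total                   # 0.30 - diluted by B and AxB
partial_eta_sq_a = ss_a / (ss_a + ss_resid)  # ~0.43 - ignores other effects
print(eta_sq_a, partial_eta_sq_a)
```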
post hoc tests are relevant when
main effect of IV is significant and IV has more than 2 levels.
difference in assumptions for repeated measures compared to independent ANOVA,
and how is this assessed
sphericity (variances of the difference scores between conditions should be equal)
assessed via Mauchly's test and corrected via the Greenhouse-Geisser correction
only when the IV has more than 2 levels (sketch below)
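A minimal sketch of checking sphericity in Python, assuming the pingouin package provides sphericity (Mauchly's test) and epsilon (Greenhouse-Geisser) helpers as named here (data are made up):

```python
import pandas as pd
import pingouin as pg  # assumed dependency

# Long-format data: 5 subjects measured under 3 conditions.
df = pd.DataFrame({
    "subject":   list(range(1, 6)) * 3,
    "condition": ["c1"] * 5 + ["c2"] * 5 + ["c3"] * 5,
    "score":     [4, 5, 6, 7, 5, 6, 7, 8, 9, 7, 5, 6, 9, 8, 7],
})

# Mauchly's test: p < .05 means the sphericity assumption is violated.
print(pg.sphericity(df, dv="score", subject="subject", within="condition"))
# Greenhouse-Geisser epsilon, used to scale the ANOVA's degrees of freedom.
print(pg.epsilon(df, dv="score", subject="subject", within="condition",
                 correction="gg"))
```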
The range within which 95% of scores in a normally distributed population fall (formula)
95% of population values fall within:
μ ± 1.96 × SD
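Worked example (made-up numbers): for IQ-style scores with μ = 100 and SD = 15, 95% of the population falls within 100 ± 1.96 × 15, i.e. roughly 70.6 to 129.4.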
t formula
t = x̄D / ESE
(mean of the difference scores divided by the estimated standard error)
df for paired t-test
df = (n − 1)
To calculate degrees of freedom for an independent t-test
df = n_total − 2
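A minimal sketch of both t-tests, assuming SciPy (made-up scores):

```python
from scipy import stats

# Paired: the same 5 pps measured twice, so df = n - 1 = 4.
pre  = [10, 12, 9, 11, 13]
post = [12, 14, 9, 13, 15]
print(stats.ttest_rel(pre, post))

# Independent: 5 pps per group, so df = n_total - 2 = 8.
group1 = [10, 12, 9, 11, 13]
group2 = [14, 15, 13, 16, 12]
print(stats.ttest_ind(group1, group2))
```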
theory behind how F is calculated
e.g. written out variance formula
F = variance between IV levels / (variance within IV levels − variance due to individual diffs)
(the individual-differences term is subtracted out only in repeated-measures designs)
Components of the F calculation for ANOVAs, as provided in SPSS output
SSM + SSR = SST
SSM / dfM = MSM (mean square of model)
SSR / dfR = MSR (mean square residual)
F = MSM / MSR
(worked example below)
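A worked example building F from the sums of squares by hand, matching the SPSS components above (made-up scores, assuming NumPy):

```python
import numpy as np

groups = [np.array([4, 5, 6]), np.array([7, 8, 9]), np.array([5, 6, 7])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# SSM: each group mean's squared distance from the grand mean, x group n.
ssm = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSR: each score's squared distance from its own group mean.
ssr = sum(((g - g.mean()) ** 2).sum() for g in groups)

sst = ((all_scores - grand_mean) ** 2).sum()
print(np.isclose(ssm + ssr, sst))      # True: SSM + SSR = SST

df_m = len(groups) - 1                 # k - 1
df_r = len(all_scores) - len(groups)   # N - k
print((ssm / df_m) / (ssr / df_r))     # F = MSM / MSR
```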
To calculate degrees of freedom for a bivariate correlation
df = N − 2
R^2 formula
(measure of effect size): the variance in the outcome variable that is explained by the regression model, expressed as a proportion of total variance
R^2 = SSM / SST
SSR =
sum of squares residual.
take the difference between each individual pp's score and their group mean, square them, and add them up (within-groups differences)
SSM =
sum of squares model.
take the difference between each group mean and the grand mean, square, weight by the group's n, and add them up (between-groups model)
MSm =
mean square model.
= SSm / dfm
MSr
mean square residual.
= SSr / dfr
diff between repeated measures and independent groups factorial ANOVA
no error variance due to individual differences (it is removed, so the within-groups error term is smaller)
Marginal means =
mean score for a single IV level, averaged across the levels of the other IV
what does a significant interaction suggest
the effect of IVA on the DV is dependent on the level of IVB
strength of bivariate linear correlations
.1-.3 = weak
.4 - .6 = moderate
.7-.9 = strong
what do inferential statistics measure
they infer the probability that we would have observed a relationship of this magnitude if in fact the H0 were true.
e.g. accept 5% risk of type 1 error / false positive
Parametric assumptions of Bivariate linear relationship
- Both variables must be continuous (if both are ordinal (categorical) then use non-parametric); can be used for Likert scales if they have 6 or 7 points
- Related pairs (each pp has both an x and a y score)
- Absence of outliers
- Linearity (scatterplot shows a straight, not curved, line)
non-parametric equivalent of Bivariate linear correlation
Spearman’s Rho
what is covariance
variance shared between x and y variable
what does Pearson's r value represent
the ratio of the covariance to the separate variability of x and y (covariance divided by the product of their standard deviations)
when talking about relative strength of a relationship, you must report
R^2
if R^2 is .45, what does this mean (bivariate correlation)
45% of the variance is shared by the x and y variables
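A minimal sketch of r, rho, and R^2, assuming SciPy (made-up data):

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]

r, p = stats.pearsonr(x, y)
rho, p_rho = stats.spearmanr(x, y)  # non-parametric equivalent
print(r, r ** 2)  # r^2 = proportion of variance shared by x and y
```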
partial correlation purpose
allows for examination of relationship without the influence of a 3rd variable
in partial correlation:
look at the difference between the correlation before and after z is partialed out.
if the correlation has decreased but remains significant, this suggests the relationship between x and y was partially explained by z
if the correlation has not decreased, this suggests it is not influenced by z, BUT it may still be influenced by another variable (sketch below)
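A minimal sketch of a first-order partial correlation, computed from the three pairwise Pearson correlations (made-up data, assuming SciPy):

```python
import math
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]
z = [1, 1, 2, 3, 3, 5]

r_xy = stats.pearsonr(x, y)[0]
r_xz = stats.pearsonr(x, z)[0]
r_yz = stats.pearsonr(y, z)[0]

# Correlation of x and y with z partialed out of both.
r_xy_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(r_xy, r_xy_z)  # compare before vs after partialling z out
```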
Regression model purpose
the relationship between x and y, allowing an estimate of how much y will change as a result of a change in x.
regression model
y =
x =
y = outcome variable. or dependent/criterion variable
x = predictor variable or independent/explanatory variable
why use a regression model
- quantifies the strength of the relationship between x and y
- can predict value of y if know x
Assumption when interpreting a regression model result
- assume y is dependent on x (does not infer causality)
what is the F ratio of the regression model comparing
compares the simplest model (the average score as a line of best fit (SST)) Vs the best model (the regression line (SSR))
the difference between the two reflects improvement in prediction
The larger the SSM, the bigger the improvement (in the prediction model)
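A minimal sketch of a simple regression, assuming statsmodels (made-up data); it reports the R^2 = SSM/SST and F values described above:

```python
import numpy as np
import statsmodels.api as sm

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2, 1, 4, 3, 7, 8], dtype=float)

model = sm.OLS(y, sm.add_constant(x)).fit()
print(model.params)                  # intercept and slope (b values)
print(model.rsquared)                # R^2 = SSM / SST
print(model.fvalue, model.f_pvalue)  # F ratio vs the mean-only model
```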
assumptions of multiple regression
- Sample size
- Linearity
- outliers
- Multicollinearity (predictors can't be highly correlated with one another)
- Normal P-P plot of regression residuals (to check normality of residuals)
- Scatterplot of regression residuals rectangularly distributed = homoscedasticity
what does hierarchical regression ask
Does adding new predictor variables allow you to explain additional variance in the outcome variable?
Examines influence of predictor variables on outcome variable after “partialing out” influence of other variables.
in a regression model:
- beta represents
standardized slope (the change in y, in SD units, per SD change in x)
what are the different models in hierarchical regression?
model 1 (predictor to be controlled)
model 2 (all predictors)
change statistics for model 1 of hierarchical regression
Compares simplest model (b=0) with model 1.
(same job as standard regression)
change statistics for model 2 of hierarchical regression
compares model 1 to model 2
Tells about explanatory power of x, after effects of z are controlled for.
ΔR^2 = how much overall variance in y is explained by x, after effects of z are controlled for.
ΔF = provides a measure of how much the model has improved the prediction of y, relative to the level of inaccuracy of the model.
Δp if < .05 indicates that x explains a significant proportion of the variance in y after z is partialed out (sketch below)
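A minimal sketch of the two-model comparison, assuming statsmodels (simulated data; ΔR^2 and ΔF are computed by hand and then checked against the built-in comparison):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
z = rng.normal(size=50)                      # predictor to be controlled
x = rng.normal(size=50)                      # predictor of interest
y = 0.5 * z + 0.8 * x + rng.normal(size=50)  # outcome

m1 = sm.OLS(y, sm.add_constant(np.column_stack([z]))).fit()     # model 1
m2 = sm.OLS(y, sm.add_constant(np.column_stack([z, x]))).fit()  # model 2

delta_r2 = m2.rsquared - m1.rsquared  # extra variance explained by x
df_num = m2.df_model - m1.df_model    # number of added predictors
delta_f = (delta_r2 / df_num) / ((1 - m2.rsquared) / m2.df_resid)
print(delta_r2, delta_f)
print(m2.compare_f_test(m1))          # (delta-F, delta-p, df difference)
```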
higher risk of what error in non-parametric statistics
type 2
e.g. (false negative) the risk of failing to reject the null when it is false
e.g. saying it’s not significant when actually it is
Independent t-test non-parametric equivalent
Mann-Whitney U test
remember, man is independent of whitney
paired t-test non-parametric equivalent
Wilcoxon signed-rank T test
1-way independent ANOVA non-parametric equivalent
Kruskal-Wallis test
1 way RM ANOVA non-parametric equivalent
Friedman test
remember Layla Friedman is repeated measured
Factorial design non-parametric equivalent
non-existent
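A minimal sketch of the four non-parametric equivalents above, assuming SciPy (made-up scores):

```python
from scipy import stats

g1 = [10, 12, 9, 11, 14, 8]
g2 = [12, 14, 10, 13, 15, 11]
g3 = [11, 13, 9, 12, 16, 10]

print(stats.mannwhitneyu(g1, g2))           # independent t-test equivalent
print(stats.wilcoxon(g1, g2))               # paired t-test equivalent
print(stats.kruskal(g1, g2, g3))            # 1-way independent ANOVA equiv.
print(stats.friedmanchisquare(g1, g2, g3))  # 1-way RM ANOVA equivalent
```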
how are Repeated measures designs shown to be normally distributed
the DV difference scores should be normally distributed between each paired level of the IV
Normality assumption can be assessed with what test
Shapiro-Wilk test.
used to decide between parametric and non-parametric tests
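A minimal sketch, assuming SciPy (made-up scores):

```python
from scipy import stats

scores = [10, 12, 9, 11, 14, 8, 13, 10, 12, 11]
stat, p = stats.shapiro(scores)
print(stat, p)  # p < .05 -> deviates from normality -> go non-parametric
```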
Pearson’s correlation coefficient non-parametric equivalent
Spearman’s Rho: used when N > 20
Kendall’s Tau: used when N < 20
parametric or non-parametric when thinking about scale
if the variable is measured on an ordinal scale, use non-parametric
e.g. if the intervals between values are not constant
partial correlation and regression non-parametric equivalent
non-existent
what tests analyse categorical data
One-variable Chi-Square (goodness of fit)
Chi-Square Test of Independence (two variables)
no parametric equivalents
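A minimal sketch of both chi-square tests, assuming SciPy (made-up counts):

```python
from scipy import stats

# One-variable (goodness of fit): observed vs equal expected counts.
print(stats.chisquare([18, 22, 20]))

# Test of independence: 2x2 table of observed counts for two variables.
table = [[20, 10],
         [12, 18]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(chi2, p, dof)
```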