Stats Flashcards
what are factorial designs
one dependent variable
two or more independent variables
an example of a two-way factorial design
1 DV, 2 IVs, e.g. DV: time taken to get to work; IV1: time of day; IV2: mode of transport
an example of a three-way factorial design
1 DV, 3 IVs, e.g. DV: proportion recognised; IV1: diagnosis; IV2: season; IV3: stimuli
why are more complex designs with 3 or more factors unusual
complicated to interpret
require large n (between-subjects)
take too long per participant (within-subjects)
when are factorial designs needed
when more than one IV contributes to the DV
what do factorial designs tell us
they allow us to explore complicated relationships between IVs and DVs:
- main effects (how IVs individually affect the DV)
- interactions (how IVs combine to affect the DV)
interpreting factorial design results - main effects
most straightforward result
summarise the data at the level of individual IVs
marginal means
problem with main effects
can be misleading
main effects may point to what looks like the optimal level of each IV, but those two levels combined may not be the optimal condition
interpreting factorial design results - interactions
we look at them in line charts:
no interaction = parallel lines
interaction = non-parallel (crooked) lines
special case = crossover interaction: the effect of one IV on the DV reverses depending on the level of the other IV
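A minimal matplotlib sketch of such a line chart, using made-up cell means for the commute example (all numbers are hypothetical):

```python
import matplotlib.pyplot as plt

time_of_day = ["morning", "evening"]  # IV1 on the x-axis
mean_car = [40, 25]                   # hypothetical mean commute times, car
mean_bike = [32, 30]                  # hypothetical mean commute times, bike

# One line per level of IV2: parallel lines = no interaction,
# non-parallel lines = interaction, crossing lines = crossover interaction.
plt.plot(time_of_day, mean_car, marker="o", label="car")
plt.plot(time_of_day, mean_bike, marker="o", label="bike")
plt.xlabel("time of day")
plt.ylabel("time to get to work (min)")
plt.legend(title="mode of transport")
plt.show()
```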
what are the three types of factorial anova
between-subjects factorial anova
within-subjects factorial anova
mixed factorial anova (covered in PS2002)
assumptions in factorial anova
interval/ratio (scale in SPSS)
normally distributed - examine with histogram
homogeneity of variance (for between-subjects) - eyeball SDs, Levene’s test
sphericity of covariance (for within-subjects) - Mauchly's test
what tests for homogeneity of variance
Levene's test
what tests for sphericity of covariance
Mauchly's test
what happens if assumptions are violated for factorial anova
they can withstand some violation
so proceed with caution: report which assumptions have been violated and, where possible, report corrected anova results
F-values
how many of these values can there be
one-way factorial anova = 1 F-value
two-way factorial anova = 3 F-values (main effect a, main effect b, interaction axb)
three-way factorial anova = 7 F-values (main effect a, b, c, interactions axb, axc, bxc, axbxc)
how to report multiple F-values
F(between-groups df, within-groups/error df) = F-value, p = probability
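A hedged sketch of where these multiple F-values come from, using statsmodels on hypothetical 2x2 data (the column names dv, a, b are made up for illustration):

```python
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical 2x2 data: commute time by time of day (a) and transport (b)
data = pd.DataFrame({
    "dv": [25, 27, 41, 39, 31, 29, 33, 34],
    "a":  ["am", "am", "pm", "pm", "am", "am", "pm", "pm"],
    "b":  ["car"] * 4 + ["bike"] * 4,
})

model = ols("dv ~ C(a) * C(b)", data=data).fit()  # main effects + interaction
print(anova_lm(model, typ=2))  # three F-values: C(a), C(b), C(a):C(b)
```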
central tendency
a single score that represents the data, e.g. the mean
dispersion / spread
a measure of variability in the data
standard deviation: s = sqrt(sum((x - mean)^2) / (N - 1))
using means and standard deviations
we can compare a range of measurements using z (standard) scores
z = (score - mean)/SD
we can express how many SD units a point in the normal curve is from the mean using z scores
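A minimal numpy sketch of the SD and z-score formulas above, on made-up scores:

```python
import numpy as np

scores = np.array([12.0, 15.0, 9.0, 14.0, 10.0])  # hypothetical measurements

mean = scores.mean()
sd = scores.std(ddof=1)   # sample SD: divides by N - 1, as in the formula
z = (scores - mean) / sd  # each score expressed in SD units from the mean
print(mean, sd, z)
```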
why do we use sampling
we cannot test everyone
we make assumptions about how our sample relates to the population based on what we know about sampling theory
what is a population
every single possible observation
fortunately we know populations tend to be normally distributed
explain the central limit theorem
if samples are representative of the population:
1 the distribution of all the sample means will approach a normal distribution
2 whilst individual sample means may deviate from the population mean, the mean of all sample means will equal the population mean
3 as the sample size increases, the standard deviation of the sampling distribution decreases
as a sample size increases…
we can say with more certainty what the population mean is
standard error of the mean or standard error
SE=SD/sqrt(N)
represents the SD of the sampling distribution; this indicates how confident we can be that our sample mean represents the population mean
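A one-line numpy sketch of SE = SD/sqrt(N), with a hypothetical sample:

```python
import numpy as np

sample = np.array([12.0, 15.0, 9.0, 14.0, 10.0])  # hypothetical sample
se = sample.std(ddof=1) / np.sqrt(len(sample))    # SE = SD / sqrt(N)
print(se)
```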
what are inferential statistics
stats that allow us to make inferences about the population from which our samples are drawn
type 1 and 2 errors
boy who cried wolf
the boy commits a type 1 error followed by a type 2 error
(cried wolf when there was no wolf: saying there is an effect when in fact there is not)
(then did not cry wolf when the wolf was really there: saying there is no effect when in fact there is)
to use a t-test what are the parametric assumptions we make about the data
interval/ratio (scale in SPSS)
normal distribution - examine histogram
homogeneity of variance - Levene's test
3 types of t-tests
single-samples t-test (whether a sample is drawn from a population whose mean we know)
independent samples t-test (two sets of measurements are drawn from the same population in different groups)
paired samples t-test (two sets of measurements are drawn from the same population before and after an intervention)
t=…
the general idea
difference between means / variability in means
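A sketch of this "difference between means / variability in means" idea on hypothetical data, computing an independent-samples t by hand and checking it against scipy (with equal group sizes the hand formula matches scipy's default):

```python
import numpy as np
from scipy import stats

g1 = np.array([5.1, 6.0, 5.5, 6.2, 5.8])  # hypothetical group 1
g2 = np.array([4.2, 4.9, 5.0, 4.4, 4.6])  # hypothetical group 2

# difference between means / variability in means, by hand
diff = g1.mean() - g2.mean()
se_diff = np.sqrt(g1.var(ddof=1) / len(g1) + g2.var(ddof=1) / len(g2))
print(diff / se_diff)

# scipy equivalents for the three t-test types:
print(stats.ttest_ind(g1, g2))           # independent samples
print(stats.ttest_rel(g1, g2))           # paired samples (equal-length arrays)
print(stats.ttest_1samp(g1, popmean=5))  # single sample vs a known pop mean
```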
non-parametric alternatives to single-samples t-test
one-sample Wilcoxon signed-ranks test
single sample t-test df
n-1
non parametric alternatives to independent-samples t-test
Mann-Whitney U test
independent samples t-test df
n1+n2-2
non-parametric alternatives to paired samples t-test
Wilcoxon signed-ranks test
df for paired samples t-test
df=n-1
how to report t-values
t(df) = t-value, p = probability
multiple comparisons problem
computationally inefficient
increases the overall probability of type 1 error = familywise error
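A quick sketch of how familywise error grows with the number of tests, assuming independent tests at alpha = .05:

```python
# Chance of at least one type 1 error across m independent tests
alpha = 0.05
for m in (1, 3, 10):
    print(m, round(1 - (1 - alpha) ** m, 3))  # 0.05, 0.143, 0.401
```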
between groups variance
deviation of the group means from the grand (overall) mean
the greater the group means differ from the grand mean, the bigger this will be
should equal treatment effect + measurement error in an ideal world
within-groups (error) variance
deviation of scores within each group from that group's mean
the more scores within each group vary from each other, the bigger this will be
should only equal measurement error in an ideal world
F-value = …
between-groups variance / within-groups variance
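A sketch computing F by hand from between- and within-groups variance on hypothetical data, checked against scipy.stats.f_oneway:

```python
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]  # hypothetical scores, 3 groups
allscores = np.concatenate(groups)
grand = allscores.mean()

k = len(groups)
n_total = len(allscores)
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_between = ss_between / (k - 1)      # between-groups df = k - 1
ms_within = ss_within / (n_total - k)  # within-groups (error) df = N - k
print(ms_between / ms_within)          # F by hand

print(stats.f_oneway(*groups))         # same F, plus the p-value
```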
F distribution depends on how many df
the degrees of freedom shape the distribution
2:
between-groups df
within-groups (error) df
pairwise comparisons
significant anova results tell us there is a significant difference somewhere
to tell where, we follow up a significant anova result
- how to follow up depends on what we hypothesised about the pairwise differences
- planned = a priori comparisons
- unplanned = post hoc comparisons
a priori comparisons
if you have specific hypotheses about expected differences amongst conditions
compare only those cases you have a specific hypothesis about
would use specific t-tests
but remember these are still subject to familywise error, which can be corrected with a Bonferroni correction (see the sketch below)
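A sketch of a Bonferroni correction via statsmodels; the p-values here are made up for illustration:

```python
from statsmodels.stats.multitest import multipletests

pvals = [0.012, 0.030, 0.200]  # hypothetical uncorrected p-values
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
print(reject, p_adj)  # Bonferroni multiplies each p by the number of tests
```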
post hoc comparisons
if there are no hypotheses about expected differences amongst conditions, compare all cases using an accepted post hoc test, e.g. Tukey's HSD
all accepted post hoc tests control for familywise error
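A sketch of Tukey's HSD via statsmodels on hypothetical data from three conditions:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([4, 5, 6, 7, 8, 9, 5, 6, 7], dtype=float)  # made-up scores
groups = np.array(["a"] * 3 + ["b"] * 3 + ["c"] * 3)          # condition labels

print(pairwise_tukeyhsd(scores, groups, alpha=0.05))  # all pairwise contrasts
```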
Levene’s test - what to do with it
If the significance of the Levene's statistic is greater than .05, the independent samples have equal variances and you should use the t-statistic, df and p-value from the top row (equal variances assumed) of the Independent Samples Test table.
If the significance of the Levene's statistic is less than .05, the independent samples have unequal variances and you should use the t-statistic, adjusted df and p-value from the bottom row (equal variances not assumed) of the Independent Samples Test table.
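A sketch of the same decision rule outside SPSS, using scipy on hypothetical data: run Levene's test, then pick the equal- or unequal-variances t-test accordingly:

```python
from scipy import stats

g1 = [5.1, 6.0, 5.5, 6.2, 5.8]  # hypothetical groups
g2 = [4.2, 4.9, 5.0, 4.4, 4.6]

_, p_levene = stats.levene(g1, g2)
equal = p_levene > 0.05  # > .05: treat variances as equal ("top row" in SPSS)
t, p = stats.ttest_ind(g1, g2, equal_var=equal)  # Welch's t-test if unequal
print(equal, t, p)
```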