Stats Flashcards
between-subjects experimental designs
different participants in each condition
so differences are between groups
within-subjects designs
the same participants in each condition
so differences are between treatments
similarities in how between- and within-subjects experiments are run
non-experimental conditions held constant
dependent variable measured identically
different formulae are used in the statistical tests for these designs
what are factorial designs
designs with
- one DV
- two or more IVs (unlike t-tests and one-way ANOVA)
when are factorial designs needed
we suspect more than one IV is contributing to a DV
ignoring an IV detracts from the explanatory power of our experiments
what do factorial designs tell us
allow us to explore complicated relationships between IVs and DVs
what is a main effect
how IVs (factors) individually affect the DV
what is an interaction
how IVs combine to affect the DV
limitations of between subjects design
participant variables
lots of participants required
limitations of within subjects design
practice effects (loss of naivety); longer testing sessions
assumptions in mixed factorial ANOVA
mix of between- and within-subjects assumptions:
- interval/ratio data ("scale" in SPSS)
- normal distribution
- homogeneity of variance
- sphericity of covariance
how to test for normal distribution assumption
examine a histogram
conduct a formal test of normality, e.g. the Kolmogorov-Smirnov test
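Outside SPSS, this check can be sketched in Python with scipy; the data below are simulated for illustration:

```python
# Sketch (not SPSS): Kolmogorov-Smirnov normality check via scipy.
# Data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
scores = rng.normal(loc=50, scale=10, size=100)  # a roughly normal sample

# Compare the sample against a normal distribution with the sample's
# own mean and SD (estimating these from the sample makes p approximate).
stat, p = stats.kstest(scores, 'norm', args=(scores.mean(), scores.std(ddof=1)))
print(f"D = {stat:.3f}, p = {p:.3f}")  # a small p would suggest non-normality
```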
how to test for homogeneity of variance assumption
eyeball SDs
Levene’s test
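A hedged sketch of Levene's test in Python's scipy, on simulated groups:

```python
# Sketch: Levene's test for homogeneity of variance across two groups.
# Group data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(10, 2, size=30)
group_b = rng.normal(12, 2, size=30)  # similar spread, different mean

stat, p = stats.levene(group_a, group_b)
print(f"W = {stat:.3f}, p = {p:.3f}")  # p > .05: no evidence variances differ
```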
how to test for sphericity of covariance
Mauchly’s test
rules of mixed factorial ANOVA
identify the between- and within-subjects IVs straight away
use between subject formulae for between-subject effects and within for within-subject effects
if there is a conflict (eg interactions) use the within
how to report mixed factorial ANOVA
F(between-groups df, within/error df) = F-value, p = p-value
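As a rough sketch, that reporting convention can be produced with a small (hypothetical) formatting helper; the numbers below are placeholders, not real results:

```python
# Sketch: formatting an F-ratio in the "F(df1, df2) = F, p = ..." convention.
def report_f(df_between, df_error, f_value, p_value):
    # APA-style p: "< .001" for tiny values, otherwise no leading zero.
    p_str = "< .001" if p_value < .001 else f"= {p_value:.3f}".replace("0.", ".")
    return f"F({df_between}, {df_error}) = {f_value:.2f}, p {p_str}"

print(report_f(1, 28, 5.32, 0.029))  # F(1, 28) = 5.32, p = .029
```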
what are tests of association
tests the relationships between variables
usually performed on continuous variables
examples of tests of association (parametric, non-parametric, etc.)
- Pearson's correlation (parametric)
- Spearman's correlation (non-parametric)
- point-biserial correlation
- simple linear regression
- multiple regression
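A sketch of what most of these look like in Python's scipy, on simulated data (the variable names are illustrative only):

```python
# Sketch: the listed tests of association via scipy, on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(scale=0.5, size=50)  # continuous, linearly related
group = rng.integers(0, 2, size=50)           # a dichotomous variable

r, p = stats.pearsonr(x, y)                  # parametric correlation
rho, p_s = stats.spearmanr(x, y)             # non-parametric (rank-based)
r_pb, p_pb = stats.pointbiserialr(group, y)  # dichotomous vs continuous
fit = stats.linregress(x, y)                 # simple linear regression
print(f"r = {r:.2f}, rho = {rho:.2f}, slope = {fit.slope:.2f}")
```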
what do tests of association tell us
they tell us whether variables covary with other variables
what limits us in tests of association (methodological)
without experimental manipulation, we cannot infer causation
what do scatterplots do
typically show relationships between pairs of variables
- data from each variable are plotted on a separate axis
- each point represents one pair of observations
what does the direction of the cloud of points in a scatterplot tell us
an indication of the direction of the relationship
what is the spread and what does it tell us in a scatter plot
how close the points are to forming a line
gives an indication of the strength of a relationship
assumptions when running Pearson's correlation
-we should be looking for a linear relationship between variables
check the scatterplot, if it shows a clear non-linear relationship, do not run a pearson’s correlation
- parametric tests assume interval/ratio data
-normal distribution
-data should be free of statistical outliers
why do data have to be normally distributed to run Pearson's
how to check
calculating r involves means and SDs
these are only appropriate if data are normally distributed
plot and inspect a frequency distribution of scores for each variable
it can tolerate some skew
why must outliers be excluded from analysis in Pearson's
outliers have a disproportionate influence on the correlation coefficient, r
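A quick simulated demonstration of that influence:

```python
# Sketch: one outlier can swing Pearson's r substantially (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(size=20)
y = x + rng.normal(scale=0.3, size=20)  # strong positive relationship

r_clean, _ = stats.pearsonr(x, y)

# Add a single extreme point that goes against the trend.
x_out = np.append(x, 4.0)
y_out = np.append(y, -4.0)
r_outlier, _ = stats.pearsonr(x_out, y_out)

print(f"r without outlier = {r_clean:.2f}, with outlier = {r_outlier:.2f}")
```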
facts about correlation coefficients
range from -1 to 1
no units
the same for x and y as for y and x
a positive value indicates that as one variable increases, so does the other
a negative value indicates that as one variable increases, the other decreases
how close a value is to -1 or +1 indicates how close the two variables are to being perfectly linearly related
how to estimate r values
split scatterplot with means for each variable
count number of points in each quadrant
positive correlation will populate the positive quadrants more than the negative ones, and vice versa
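The quadrant-counting heuristic can be sketched on simulated data (the means split the scatterplot into four quadrants):

```python
# Sketch: eyeballing the sign of r by counting points per quadrant.
# Data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(11)
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(scale=0.6, size=100)  # positive relationship

above_x = x > x.mean()
above_y = y > y.mean()
positive = np.sum(above_x == above_y)  # upper-right + lower-left quadrants
negative = np.sum(above_x != above_y)  # upper-left + lower-right quadrants
print(f"positive quadrants: {positive}, negative quadrants: {negative}")
```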
how to set up to calculate r values
plot the raw values against one another
scaling problems - different means and SDs
we don’t care about means, SDs, units, only relationships
- plot z-transformed (standardised) x and y values
no scaling or unit problems
r=… in words
the adjusted average of the product of each standardised x-y coordinate pair
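In symbols that wording corresponds to r = Σ(z_x · z_y) / (n − 1); a sketch on simulated data, checked against numpy:

```python
# Sketch: r as the "adjusted average" of standardised products,
# i.e. r = sum(z_x * z_y) / (n - 1), verified against numpy's corrcoef.
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=40)
y = 0.5 * x + rng.normal(scale=0.8, size=40)

z_x = (x - x.mean()) / x.std(ddof=1)  # standardise with the sample SD
z_y = (y - y.mean()) / y.std(ddof=1)
r = np.sum(z_x * z_y) / (len(x) - 1)  # adjusted average of the products

print(f"r by hand = {r:.4f}, numpy = {np.corrcoef(x, y)[0, 1]:.4f}")
```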
how to report correlation
r(df) = r-value, p = p-value
limitations of correlation
it is not the same as causation
link may be coincidental or there may be a third variable involved
what is regression
a family of inferential statistical tests
tests of association
make prediction about data
used when causal relationships are likely
why can't we just use correlation instead of regression
if interested in a causal relationship, you may be interested in how much to intervene
correlation does not give you that information
what does regression show us
unstandardised relationship between outcome (Y) and predictor (X) variables using calculations of the intercept (a) and gradient (b), expressed in the form Y = a + bX
if your predictor value in regression is 0, you can expect your outcome variable to equal..
a
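A simulated sketch showing that the prediction at X = 0 is exactly the intercept, a:

```python
# Sketch: in Y = a + bX, the intercept a is the predicted outcome at X = 0.
# Data are simulated for illustration (true a = 3, b = 2).
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
x = rng.uniform(0, 10, size=50)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=50)

fit = stats.linregress(x, y)
predicted_at_zero = fit.intercept + fit.slope * 0
print(f"a = {fit.intercept:.2f}, b = {fit.slope:.2f}")
print(f"prediction at x = 0: {predicted_at_zero:.2f}")  # equals the intercept
```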
assumptions in regression
- linearity
- interval/ratio data
- normal distribution
- free of outliers
- homoscedasticity: residuals need to have the same degree of variation across all predictor variable scores
what are residuals
the difference between the actual outcome score and the predicted outcome score
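A minimal sketch computing residuals from a fitted line (simulated data):

```python
# Sketch: residuals = actual outcome scores minus predicted outcome scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = 1.0 + 0.5 * x + rng.normal(scale=0.4, size=30)

fit = stats.linregress(x, y)
predicted = fit.intercept + fit.slope * x
residuals = y - predicted
print(f"mean residual = {residuals.mean():.2e}")  # OLS residuals average ~0
```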
opposite of homoscedasticity
heteroscedasticity
problems with predictors when carrying out regression analysis
predictor variables which are highly correlated with one another (show multicollinearity) are problematic
be cautious when interpreting multiple regression where predictor variable correlations are > .80 (or < −.80)
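A sketch of that screening step on simulated predictors (numpy; the .80 cut-off comes from the card above):

```python
# Sketch: screening predictor pairs for multicollinearity (|r| > .80).
import numpy as np

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + rng.normal(scale=0.2, size=100)  # nearly redundant with x1
x3 = rng.normal(size=100)                         # independent predictor

predictors = np.column_stack([x1, x2, x3])
corr = np.corrcoef(predictors, rowvar=False)  # predictor correlation matrix
print(np.round(corr, 2))
flagged = np.abs(corr[0, 1]) > 0.80           # x1-x2 pair is problematic
print(f"x1-x2 pair flags multicollinearity: {flagged}")
```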
talk through the three graphical tests of homoscedasticity
histogram - bars should approximately fit the curve
scatterplot-points should follow along the diagonal
regression (standardised) scatterplot - points should form a nondescript cloud
how to report regression
check descriptives and correlations
check that predictor and outcome variables show a linear relationship
check that the homoscedasticity assumption is not violated
report the R^2 (proportion variance explained) in the text
report coefficients in a table
multiple regression
predicting one outcome variable from more than one predictor variable
Y = a + b1X1 + b2X2 + …
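A sketch fitting that equation by least squares (numpy, simulated data; the names a, b1, b2 mirror the formula):

```python
# Sketch: multiple regression Y = a + b1*X1 + b2*X2 via least squares.
# Data are simulated (true a = 1, b1 = 2, b2 = -0.5).
import numpy as np

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix: a column of ones for the intercept a, then the predictors.
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coefs
print(f"a = {a:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
```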
three ways we could carry out multiple regression
order predictors are entered in
- simultaneous = all predictors entered at the same time
- hierarchical = predictors are entered in a pre-defined order. used when regressions are informed by well-defined theory
- stepwise = predictors are entered in an order driven by how well they correlate with the outcome. not used as it is a relatively unstable method