Stats Test Flashcards
Sharon's Flashcards
What is an ANOVA
Analysis of Variance
When to use an ANOVA
When we are testing experiments that have 3 or more levels of an independent variable (e.g., comparing a control group vs caffeine in the morning vs caffeine at night)
Why don’t we use multiple t-tests
Type 1 error will increase
What does ANOVA produce
F-ratio
What is an F-ratio
Compares systematic variance to unsystematic variance
What can / can’t an ANOVA tell us?
It can tell us there was an effect but it cannot tell us what the effect was
How do we find out what the effect was when doing ANOVA
Planned comparisons or post-hoc tests
What is the bonferroni correction
A way to control type 1 error by dividing the alpha (0.05) by the number of tests
This then sets the new p-value for a test to be significant
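The division the card describes can be sketched in Python; the alpha and number of tests below are hypothetical values for illustration:

```python
# Bonferroni correction: divide the family-wise alpha by the number of tests
alpha = 0.05
n_tests = 3  # e.g. three pairwise comparisons after an ANOVA (hypothetical)

# Each individual test must now reach this stricter threshold to be significant
corrected_alpha = alpha / n_tests
```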
What are planned comparisons
A set of comparisons between group means that are constructed before any data is collected
this is theory led
and there is more power to these than post hoc tests
What assumptions need to be met when doing ANOVA
- Normal distribution
- Homogeneity of variances
- Sphericity
Tests of homogeneity of variances for independent ANOVAs
Levene’s test
significant Levene’s = assumption of homogeneity of variance has been violated
Test of Sphericity for dependent ANOVAs
Mauchly’s test
Significant Mauchly’s = assumption of sphericity has been violated
Define homogeneity of variance
Assumption that the variance of one variable is similar at all levels of another variable
Define Sphericity
The differences taken from the same participant / entity are similar
What is a one-way ANOVA
One independent variable will be manipulated
What is one-way independent ANOVA
Experiments with 3+ levels of the independent variable and different participants in each group
How to run a one-way ANOVA on SPSS
- Check Levene’s test - if significant then assumption of homogeneity of variances has been violated
- Between-group effects = SSm (variation due to the model aka experimental effect). To find the total experiment effect look at between-group sum of squares
- Within-group effects = SSr (unsystematic variation)
- To be able to compare between groups and within groups we look at the mean squares.
- Look at the F-ratio, if significant do post-hoc tests
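The Levene's-then-F-ratio sequence above has an analogue outside SPSS; here is a hedged SciPy sketch using made-up scores for the caffeine example:

```python
from scipy import stats

# Hypothetical data: three independent groups (control, caffeine AM, caffeine PM)
control = [4, 5, 6, 5, 4]
caffeine_am = [7, 8, 6, 7, 8]
caffeine_pm = [6, 7, 7, 8, 6]

# Levene's test: a significant result (p < .05) means the homogeneity of
# variances assumption has been violated
lev_stat, lev_p = stats.levene(control, caffeine_am, caffeine_pm)

# One-way independent ANOVA: the F-ratio compares systematic (between-group)
# variance to unsystematic (within-group) variance
f_stat, p_value = stats.f_oneway(control, caffeine_am, caffeine_pm)
```

If `p_value` is significant, the next step (as on the card) is post-hoc tests.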
What post-hoc tests do you run after a significant ANOVA (you want to control for type 1 error)
Bonferroni correction
What post-hoc tests to run after a significant ANOVA (you have very different sample sizes)?
Hochberg’s GT2
What post-hoc tests to run after a significant ANOVA (you have slightly different sample sizes)
Gabriel’s procedure
What post-hoc tests to run after a significant ANOVA (you have doubts about variance)
Games-Howell procedure (this one is a safe bet)
What is effect size
the magnitude of an effect (commonly expressed as r)
How to calculate effect size
R squared = SSm / SSt
Square root this to get effect size (r)
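A minimal sketch of the calculation, assuming hypothetical sums of squares read off an ANOVA output:

```python
import math

# Hypothetical sums of squares from an ANOVA output table
ss_model = 16.5   # SSm: variation explained by the model
ss_total = 24.9   # SSt: total variation in the data

r_squared = ss_model / ss_total   # proportion of variance explained
r = math.sqrt(r_squared)          # effect size r
```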
What is SSt
Total sum of squares
total amount of variation within our data
What is SSm
Model sum of squares
variation explained by our model
What is SSr
Residual sum of squares
variation not accounted for in our model
What is a two-way ANOVA
Two independent variables will be manipulated
How to run a two-way ANOVA on SPSS
- Check Levene’s test - if significant then the assumption of homogeneity of variances has been violated. If violated, transform your data, use a non-parametric test, or report that the F value is inaccurate
- Summary table will include an effect for each independent variable (aka main effects) and the combined effect of the independent variables (aka interaction effects)
- Bold items are the SSm, Error = SSr
- Look at the F-ratio, if significant then complete post hoc tests
What is a repeated measures ANOVA
Three or more experimental groups with the same participants
How to run a repeated measures ANOVA on SPSS
- Check sphericity (equal variances between treatment levels). If Mauchly’s test is significant then the assumption of sphericity has been violated.
- If sphericity has been violated, we can look at either the Greenhouse-Geisser estimate, the Huynh-Feldt estimate, or the lowest possible estimate of sphericity (aka lower bound).
- Use Greenhouse-Geisser when the sphericity estimate (epsilon) is LESS than 0.75, use Huynh-Feldt when it is MORE than 0.75
- If the effect is significant, we need to look at ‘pairwise comparisons’ to see where the effect lies.
- Look for significant values i.e. less than 0.05
- Calculate effect size - use benchmarks of .10 / .30 / .50
When to use Greenhouse-Geisser and when to use Huynh-Feldt
- Use Greenhouse-Geisser when the sphericity estimate (epsilon) is LESS than 0.75, use Huynh-Feldt when it is MORE than 0.75
What happens when we violate sphericity
violating sphericity = less power = increases type 2 error
What is a mixed ANOVA
Independent variables are measured using both independent and repeated measures groups
How to run mixed ANOVA on SPSS
- As mixed ANOVA uses both independent and repeated designs, we need to check if the assumptions of homogeneity of variances AND sphericity have been violated.
- Look at both output tables and find the main effects (one for each IV) and one interaction term (words in CAPITALS are your IVs - you need to look at these).
- Look at the F-ratios in both tables.
- If the effect is significant then we can run t-tests to see where the effect lies; make sure to use the Bonferroni method (divide alpha 0.05 by the number of tests you will run)
- Look at both ‘paired samples test’ tables → this is known as a SIMPLE EFFECTS ANALYSIS.
- Calculate effect size - use benchmarks of .10 / .30 / .50
What is ANCOVA
Sometimes when we conduct research we know (from previous research) that some factors have an influence on our DVs (e.g., age and memory)
These factors are called covariates and we can include them in our ANOVA
Why do we use ANCOVA
to reduce the error variance (increase how much variance we can explain)
eliminate confounds (by including the covariates we remove the bias of these variables)
How to run ANCOVA on SPSS
- Check Levene’s test of homogeneity of variances.
- If significant, transform the data or complete a non-parametric test.
- The output will look the same, it will just include the covariates.
- Look at the F-ratio for all the main effects and for the covariates.
- If the covariate is significant, this means that it has a relationship with our main independent variable.
- Calculate effect size - use benchmarks of .10 / .30 / .50
What is MANOVA
Multivariate analysis of variance
ANOVA but when there are several dependent variables
How to run MANOVA on SPSS
- Check for independence, random sampling, multivariate normality and homogeneity of covariance matrices.
- If Box’s test is significant then the assumption of homogeneity of covariance matrices has been violated.
- Look at the multivariate tests ‘group’ table. This shows the effect of the independent variable on the DV.
- When looking at the output, the Pillai-Bartlett test (Pillai’s trace) statistic is the most robust.
- If there is a significant F ratio then we need to look at the univariate tests or run a discriminant analysis.
How to interpret univariate test statistics?
- Levene’s should be non-significant
- then look at ‘tests of between-subjects effects’ → corrected model
and group row stats should be significant if there is an effect between
IVs and DVs.
How to interpret discriminant analysis
- Look at the ‘covariance matrices’ to see the direction and strength of the relationships
- Eigenvalues percentage of variance = variance accounted for; square the canonical correlation to use as an effect size.
- Wilks’ Lambda table shows significance for all variables, look for the significant ones.
- Use the Standardised Canonical Discriminant Function Coefficients table to see how the DVs have contributed. Scores can range between -1 and 1; high scores = the variable is important for the variate. Look down the ‘function 1’ column: if one value is positive and the other is negative then the variate (aka function) has discriminated the two groups
What is power analysis
The ability of a test to find an effect is known as statistical power; power analysis uses this to work out the sample size needed to detect an effect
What is power of a test
Power of a test = the probability that a test will find an effect if there is one
We aim to achieve a power of 0.8
Power of a statistical test depends on
- how big the effect is
- how strict we are with our alpha level (i.e., 0.05 or 0.01)
- How big the sample size is - the bigger the sample size, the stronger the power
What are confidence intervals
A range of values that are believed to contain the true population value
eg. a 95% confidence interval means that if
we were to take 100 different samples and
compute a 95% confidence interval for each
sample, then approximately 95 of the 100
confidence intervals will contain the true
mean value
How to interpret confidence intervals
- If 95% CIs do not overlap = means come from different populations.
- CIs with a gap between the upper end of one and the lower end of another - p < 0.01
- CIs that touch end to end - p ≈ 0.01
- CIs that overlap moderately - p ≈ 0.05
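Before interpreting overlap, the CI itself has to be computed; here is a hedged SciPy sketch for a 95% CI around a mean, using invented sample values:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of scores
sample = np.array([5.1, 4.8, 6.2, 5.5, 5.9, 4.7, 5.3, 6.0])

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# 95% CI from the t distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
```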
What are common effect sizes
Cohen’s D
Pearson’s correlation coefficient r
odds ratio
What is Cohen’s d
The difference between two means divided by the SD of the control group, or a pooled estimate based on the SDs of both groups
What are the benchmarks for Cohen’s D
small d = 0.2
medium d = 0.5
large d = 0.8
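A sketch of the pooled-SD version of Cohen's d, using two hypothetical groups:

```python
import numpy as np

# Hypothetical control and treatment groups
control = np.array([10.0, 12.0, 11.0, 13.0, 9.0])
treatment = np.array([14.0, 15.0, 13.0, 16.0, 14.0])

# Pooled standard deviation from both groups' sample SDs (ddof=1)
n1, n2 = len(control), len(treatment)
s1, s2 = control.std(ddof=1), treatment.std(ddof=1)
pooled_sd = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Difference between the two means divided by the pooled SD
d = (treatment.mean() - control.mean()) / pooled_sd
```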
What are the benchmarks for Pearson’s correlation coefficient
small r = 0.1
medium r = 0.3
large r = 0.5
0 = no effect, 1 = perfect effect
What does an odds ratio of 1 mean
the odds of an outcome are equal in both groups
How to calculate the odds ratio
calculate the odds of the event in each group (probability of the event happening divided by the probability of it not happening), then divide the odds in one group by the odds in the other group
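The two-step calculation above, sketched with hypothetical counts from a 2x2 table:

```python
# Hypothetical 2x2 contingency table: event (yes/no) by group (A/B)
a_yes, a_no = 30, 10   # group A: event occurred / did not occur
b_yes, b_no = 15, 25   # group B

# Odds within each group: event / no event
odds_a = a_yes / a_no
odds_b = b_yes / b_no

# Odds ratio: OR = 1 means the odds are equal in both groups
odds_ratio = odds_a / odds_b
```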
What is categorical data
Data which can be divided into groups (e.g., gender, age group)
How to analyse categorical data
Pearson’s chi squared test
The likelihood ratio
Yates continuity correction
Log linear analysis
when to use Pearson’s chi squared test
when we want to see if there is a relationship between two categorical variables
if any expected frequency is less than 5 then we need to use Fisher’s exact test
When to use the likelihood ratio
to be used instead of chi squared test when samples are small
When to use Yates’ continuity correction
When we have a 2x2 contingency table, type 1 error increases
Yates’ continuity correction fixes this by lowering the chi squared statistic
What is a 2x2 contingency table
2 variables each with two levels e.g., male vs female / phone vs no phone
When to use log linear analysis
When there are 3+ categorical variables
What are the assumptions when analysing categorical data
independence of residuals (as such you cannot use chi squared on repeated measures)
expected values: should not be less than 5
When to use chi-squared test
use a chi-squared test if you have nominal (categorical) data
the chi squared test can be used to see if these observed frequencies differ from those that would be expected by chance
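The observed-vs-expected comparison above can be sketched with SciPy on a hypothetical 2x2 table (the counts are invented):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 contingency table: gender (rows) by fruit choice (columns)
observed = np.array([[20, 30],
                     [35, 15]])

# chi2_contingency returns the statistic, p-value, degrees of freedom and the
# expected frequencies; for a 2x2 table it applies Yates' continuity
# correction by default
chi2, p, dof, expected = stats.chi2_contingency(observed)
```

Checking `expected` also covers the assumption that no expected frequency falls below 5.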
Types of chi squared test?
Chi squared goodness of fit test (one IV)
Chi squared as a test of association (Two IVs)
When to use chi-squared goodness of fit
Used to compare an observed frequency
distribution to an expected frequency
distribution.
- Eg. when picking fruit are people more
likely to pick an apple vs a banana.
- If significant then, some fruit get picked
more than we would expect by chance.
When to use Chi squared as
a ‘test of association’ (two
independent variables)?
Used to see if there is an association
between two independent variables.
- Eg. is there an association between
gender and choice of fruit.
- If significant then, there is an association
between the two variables.
What is additivity and linearity
the outcome variable is linearly related to predictors
What are the parametric test assumptions
At least interval data
Additivity and linearity
Normally distributed
Homoscedasticity/homogeneity of variance
Independence
What is homoscedasticity / homogeneity of variance
Variance of the outcome variable
should be stable at all levels of the
predictor variable.
What is independence
errors in the model should be independent
How to spot issues with assumption of normality
- look at the histogram (it should look like a bell curve)
- look at the p-p plot (dots should fall on/near the line)
- Look at descriptive statistics (skewness and kurtosis should be near to 0)
How to spot issues with
assumption of
linearity/homoscedasticity/
homogeneity of variances?
Look at scatter plots
Look at Levene’s test - significant =
variances unequal = assumption of
homogeneity of variances has been
broken.
What does a scatterplot look like when data is normal
dots scattered evenly everywhere
What does a scatter
plot look like when data
= heteroscedasticity?
funnel shape
what does a scatter plot look like when data is non-linear
curve
What does a scatter plot look like when data is non-linear and heteroscedasticity
curve and funnel (e.g., a boomerang)
Non-parametric alternatives to ANOVAs
kruskal-wallis
Friedman’s ANOVA
Non parametric alternative to one-way independent ANOVA
Kruskal-Wallis
Non parametric alternative to repeated measures ANOVA
Friedman’s ANOVA
How to interpret the
Kruskal-Wallis test?
- Look at the ‘ranks’ table, the mean ranks tell us which condition had the highest ranks
- If the chi squared test is significant then there is a difference between groups (but we do not know what kind of difference)
- To see where the difference lies, look at the box-plot and compare the experimental group to the control group.
- OR we can do a Mann-Whitney test and use Bonferroni correction (divide alpha by the number of tests), look to see which conditions are significant.
- Calculate the effect size by dividing the z score by the square root of the number of observations.
- use benchmarks of .10 / .30 / .50
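The test-then-follow-up sequence above, sketched with SciPy on three hypothetical independent groups:

```python
from scipy import stats

# Hypothetical scores from three independent groups
g1 = [2, 3, 3, 4, 2]
g2 = [6, 7, 5, 6, 7]
g3 = [4, 5, 4, 5, 6]

# Kruskal-Wallis: a significant result means the groups differ somewhere
h_stat, p_value = stats.kruskal(g1, g2, g3)

# Follow-up: Mann-Whitney tests per pair, judged against a
# Bonferroni-corrected alpha (three pairwise comparisons here)
u_stat, u_p = stats.mannwhitneyu(g1, g2)
corrected_alpha = 0.05 / 3
```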
What is an alternative to
the one way repeated
measures ANOVA?
Friedman’s ANOVA
How to interpret the
Friedman’s ANOVA?
- Look at the ‘ranks’ table, the mean ranks tell us which condition had the highest ranks.
- If the chi squared test is significant then there is a difference between groups (but we do not know what kind of difference)
- To see where the difference lies, look at the box-plot and compare the experimental group to the control group.
- OR we can do a Wilcoxon test and use Bonferroni correction (divide alpha by the number of tests), look to see which conditions are significant in the ‘test statistics’ box.
- Calculate the effect size by dividing the z score by the square root of the number of observations.
- use benchmarks of .10 / .30 / .50
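The omnibus step of the card above, sketched with SciPy on hypothetical repeated-measures data (the same six participants under three conditions):

```python
from scipy import stats

# Hypothetical repeated measures: each index is the same participant
cond1 = [5, 6, 4, 5, 6, 5]
cond2 = [7, 8, 6, 7, 8, 7]
cond3 = [6, 7, 5, 6, 7, 6]

# Friedman's ANOVA: ranks scores within each participant, then tests whether
# the conditions' mean ranks differ
chi2_stat, p_value = stats.friedmanchisquare(cond1, cond2, cond3)
```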
What are
correlations?
relationships between variables
define covariance
If variables are related, then a change in one variable will lead to a similar change in the other variable
What is cross-product deviation
how similar / different the deviations of two variables are from their respective means
How to calculate cross product deviation
multiply the deviations of one variable by the deviations of the other variable
How to calculate
covariance?
Calculate the cross product
deviation and divide by the number
of observations - 1.
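The two steps above (cross-product deviations, then division by N - 1), sketched with NumPy on hypothetical data:

```python
import numpy as np

# Hypothetical paired observations
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 9.0])

# Cross-product deviations: deviation of x times deviation of y, per observation
cross_products = (x - x.mean()) * (y - y.mean())

# Covariance = sum of cross-product deviations / (N - 1)
covariance = cross_products.sum() / (len(x) - 1)

# Matches NumPy's built-in (np.cov uses N - 1 by default)
assert np.isclose(covariance, np.cov(x, y)[0, 1])
```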
If covariance is positive, then the correlation will be…..
positive
what is the standardised version of the covariance
correlation coefficient
what is the correlation coefficient
the standardised version of the covariance
Pearsons correlation coefficient measures the strength of relationship between variables
how to calculate a correlation coefficient
Divide the covariance by the product of the two variables’ standard deviations.
Scores lie between -1 and +1
+1 = perfect positive relationship
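The standardisation above, sketched with NumPy on hypothetical data:

```python
import numpy as np

# Hypothetical paired observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Covariance, then standardise by the product of the two standard deviations
covariance = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
r = covariance / (x.std(ddof=1) * y.std(ddof=1))

# Matches NumPy's built-in Pearson correlation
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```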
What is bivariate correlation
correlation between 2 variables
what is a partial correlation
quantifies relationship between two variables while controlling the effect of other variables
How to run bivariate correlations on SPSS
- Have assumptions been violated? If they have, use Kendall’s tau / Spearman’s rho
- Look at the ‘correlations’ table and see if Pearson’s correlations are significant.
- Look at the confidence intervals (if the data is not normal, look at the bootstrap CI)
- If the confidence interval crosses zero it suggests there could be NO effect.
What is the coefficient of determination
The coefficient of determination is represented by the term r² (or R²); it is the percentage of the total amount of change in the dependent variable (y) that can be explained by changes in the IV (x).
How to calculate R^2
Square the correlation coefficient (R)
What correlation coefficient test should you use if the data is non-parametric
spearman’s rho
How to interpret spearman’s rho on SPSS
- Look at the ‘correlations’ table and see if the correlation coefficient is significant.
- Look at the confidence intervals (if the data is not normal, look at the bootstrap CI)
- If the confidence interval crosses zero it suggests there could be NO effect.
What correlation coefficient test should you use if the data is non-parametric and sample is small
Kendall’s tau
how to interpret Kendall’s tau on SPSS
- Look at the ‘correlations’ table and see if the correlation coefficient is significant.
- Look at the confidence intervals (if the data is not normal, look at the bootstrap CI)
- If the confidence interval crosses zero it suggests there could be NO effect.
What type of correlation is used when one of the two variables is dichotomous
point-biserial correlation
when is point biserial correlation used
when one variable is discrete dichotomy (e.g., pregnancy)
when is biserial correlation used
used when one variable is a continuous dichotomy (e.g., passing or failing an exam)
what is a semi-partial correlation
we control for the effect that the third variable has on only one of the variables in the correlation
what is the equation of the simple linear model
Outcome = (b0 + b1X) + error.
what is the simple linear model used for
we can predict an outcome for a person using the model (the bit in brackets) and some error associated with this model
in the linear model what is b0
intercept
in the linear model what is b1
slope / gradient
what are b0 and b1 in the simple linear model
parameters
regression coefficients
in the linear model, what does a positive b1 mean
positive relationship
what are the assumptions of the linear model
Normally distributed errors
Independent errors
Additivity and linearity
Homoscedasticity
What is additivity and linearity
outcome variable and predictors combined effect is best described by addition effects together
how can we check independent error
durbin-watson value should be between 1 and 3
how big should our sample be when using the linear model
10 or 15 cases of data per predictor
What is cross validation
assessing the accuracy of a model over different samples
look at the adjusted R squared
how to interpret simple linear regression in SPSS
- Look at the ‘model summary’: R represents correlation
- R squared represents the amount of variance accounted for by the model.
- Look at the ‘ANOVA’ table: if the F ratio is significant then our model is a better predictor in comparison to using the mean.
- ‘B’ in the ‘coefficients’ table tells us the gradient and the strength of the relationship between a predictor and the outcome variable. Significant means the predictor significantly predicts the outcome variable.
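The same quantities (intercept b0, slope b1, R²) can be sketched outside SPSS with SciPy on hypothetical data:

```python
from scipy import stats

# Hypothetical predictor and outcome values
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

result = stats.linregress(x, y)
b0 = result.intercept          # where the line crosses the y-axis
b1 = result.slope              # gradient: change in outcome per unit of predictor
r_squared = result.rvalue ** 2 # variance accounted for by the model
```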
what does R squared represent
the amount of variance accounted for by the model
what is multiple regression
a model with several predictors
what are the different methods of regression
hierarchical regression
forced entry
stepwise methods
what is hierarchical regression
predictors are based on past work and the researcher decides which order to enter the variables
known predictors should go first, followed by any new ones that we suspect will be important
what is forced entry regression
all predictors are forced into the model at the same time
we have to have good theoretical support to include the predictors we have included
what is stepwise methods regression
Generally frowned upon because the researcher is not in control.
The decision is based on mathematical criteria that SPSS decides.
It will see how much variance is accounted for by one predictor; if it is sufficient then it will keep it and move on to find another predictor which may explain more variance.
what are the concerns when including more than one predictor in a model
multicollinearity
what is multicollinearity
exists when there is a strong correlation between 2+ of our predictor variables
why is having 2+ variables with perfect collinearity problematic
the values of b for each variable are interchangeable
what happens when collinearity increases
standard errors of the b coefficient increase
the size of t is limited
it is difficult to assess the individual importance of predictors when they are highly correlated
what is r
the correlation between predicted values of the outcome and the observed values
how can we check to see if multicollinearity is a problem
check the variance inflation factor (VIF) and tolerance statistics
how to interpret VIF
VIF greater than 10 = cause for concern
average VIF substantially greater than 1 = regression may be biased
how to interpret tolerance statistic
tolerance below 0.1 = serious problem
tolerance below 0.2 = potential problem
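Both diagnostics follow from regressing one predictor on the others; a NumPy-only sketch, with made-up predictors where the third is deliberately near-collinear with the first two:

```python
import numpy as np

# Hypothetical predictors; x3 is nearly a linear combination of x1 and x2,
# so its VIF should be large
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + x2 + rng.normal(scale=0.1, size=100)

def vif(target, others):
    """VIF = 1 / (1 - R^2) from regressing one predictor on the others."""
    X = np.column_stack([np.ones(len(target))] + others)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    predicted = X @ beta
    ss_res = ((target - predicted) ** 2).sum()
    ss_tot = ((target - target.mean()) ** 2).sum()
    r_squared = 1 - ss_res / ss_tot
    return 1 / (1 - r_squared)

vif_x3 = vif(x3, [x1, x2])
tolerance = 1 / vif_x3  # tolerance below 0.1 = serious problem
```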
what is Cook’s distance
quantifies the impact of an outlier on a model
if Cook’s distance is above 1, then that case may be influencing the model
what is factor analysis
a statistical procedure that identifies clusters of related items on a test
do several facets reflect one variable? (e.g., burnout (variable) - stress levels, motivation (facets))
what do factors represent
clusters of variables that correlate highly with each other
what can we use to decide which factors to extract
scree plot
the point of inflexion is where you should cut off
what is rotation used for in factor analysis
to discriminate factors
what is ethics comprised of
informed consent
deception
debriefing
confidentiality
protection from physical and psychological harm
what is informed consent
participants should understand what the experiment involves and understand their rights
the ability to withdraw at any point
when is it ok to not gain informed consent
in observational studies only if the person being observed is in a situation where they would be in public view anyway (e.g., shopping centre)
what are the levels of measurement
Nominal
ordinal
interval
ratio
which levels of measurement use non-parametric tests
nominal
ordinal
which levels of measurements use parametric tests
interval
ratio
what is nominal data
the numbers act as a name
data from a nominal scale should not be used for arithmetic
nominal data can be used for frequencies
what is ordinal data
tells us the frequencies and in what order they occurred
does not tell us the differences between values
most self report questionnaires are ordinal data
what is interval data
differences between values on a scale are equal
tested with parametric statistics
what is ratio data
differences between values on a scale are equal
distances along the scale are divisible
there is a true zero point (i.e., no minus numbers, e.g., reaction time)
types of variables
discrete
continuous
what are discrete variables
non-overlapping categories
eg being pregnant - you either are or are not
what are continuous variables
runs along a continuum
e.g., aggression
what is validity
whether an instrument measures what it sets out to measure
what is criterion validity
whether you can establish if a measurement is measuring what it is meant to through comparison to an objective criterion
we assess this by relating scores on your measure to real-world observation
what is concurrent validity
evidence that scores from an instrument correspond to external measures
eg. nurses are assessed for knowledge
via a written & practical test. If they
score well on the test and then well on the
practical = concurrent validity.
what is predictive validity
when data from the new instrument are used to predict observations later in time
what is content validity
with questionnaires, we can assess how well individual items represent the construct being measured
what is factorial validity
when making questionnaire and using factor analysis
if your factors are made up of items that seem to go together meaningfully = factorial validity
what is reliability
whether an instrument can be interpreted consistently across different situations
what is test-retest reliability
the ability of a measure to produce consistent results when the same entities are tested at two different points in time
how can we test reliability
split-half method
cronbach’s alpha
what is split-half method
Splitting a test into two and having the
same participant do both.
The results are then correlated, and if
they are similar then there is high
internal reliability.
how can we infer reliability through Cronbach’s alpha
if the correlation is above 0.8 = reliable
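Cronbach's alpha can be computed by hand from item and total-score variances; a NumPy sketch using a hypothetical 5-participant, 4-item questionnaire:

```python
import numpy as np

# Hypothetical questionnaire: rows = participants, columns = items
scores = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of participants' totals

# Cronbach's alpha: above 0.8 is taken as reliable on the card above
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```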
what is measurement error
The difference between the score we
get using our measurement and the level
of the construct we are measuring.
Eg. I actually weigh 47kg but the scales
show 57kg
what is a histogram
Used for frequency distribution.
Plots a single variable (x-axis)
against the frequency scores (y-axis)
what is a box plot
Used to show important characteristics of a set of
observations.
Center of the plot = median
Box = middle 50% of observations (aka interquartile range)
Upper and lower quartile are the ends of the box.
Whiskers = top and bottom 25% of scores.
what is a bar chart used for
graphing means
what are scatterplots
used for graphing relationships
a graph that plots each persons score on one variable against another
what is null hypothesis significant testing
A method of assessing scientific theories
We have 2 competing hypothesis - null
hypothesis (no effect) and the alternative
hypothesis (there is an effect).
We compute a test statistic and find out how likely it is that we would get a value as big as the one we have if the null hypothesis is true (i.e. by chance).
in null hypothesis significance testing, what is a significant effect
less than 0.05 = significant effect
what is type 1 error
saying there is an effect when there isn’t
rejecting the null hypothesis when it is true
what is a type 2 error
saying there isn’t an effect when there is
accepting the null hypothesis when it is false
what is the power of statistical test
probability we will find an effect if it exists
what is a meta-analysis
effect sizes from different studies testing the same hypothesis are combined to get a better estimate of the size of effect in the population
what is an alternative to null hypothesis significance testing
bayesian analysis
what is an IV
The variable that is being manipulated
by the researcher.
It is independent from the other
variables.
IV’s can have different levels.
IV goes on the x axis.
what is the DV
The variable that is hypothesised to
be affected by the IV.
It depends on the IV.
DV goes on the y axis.
what is between group design
participants placed into different groups
they can be part of one group for the entire experiment
what is a within groups design
same participants placed into all levels of the independent variable
what is quantitative research
research that deals with numerical data
data is analysed to compare groups or make inferences
confirm / test hypotheses using numbers
what is qualitative research
mainly uses words
data analysed to summarise, categories and interpret themes
explorative, an attempt to understand through words
what is descriptive research
aims to describe a phenomenon
what, when, where and how
does not lead us to think about causation
what is correlational research
aims to define a statistical relationship between variables
e.g., is there a relationship between cognition and caffeine
what is quasi-experimental design
experimenter has no control over the allocation of participants to conditions or the timing of experimental conditions
what is experimental design
aims to establish causality
randomisation in important to reduce the effect of confounding variables
what is ABA design
baseline behaviour measured (A)
treatment applied and behaviour measured while treatment present (B),
treatment is removed and the baseline behaviour is recorded again (A)
types of sampling in qualitative research
purposive sampling
theoretical sampling
what is purposive sampling
selecting participants according to criteria that are important for the research question
what is theoretical sampling
the people you attempt to recruit will change as a result of the things you are learning
what is meant by 2x2 design
a research design with 2 independent variables, each with 2 levels
how can we examine mediated or moderated relationships
path analysis
what is factor loading
a correlation coefficient between a variables and a factor (cluster of variables)
what is a mediating variable
explains the relationship between the independent variable and the dependent variable
what is a moderating variable
alters the relationship between the IV and DV
what is the standard error
the standard deviation (spread) of the sampling distribution
what is standard deviation
a measure of how much scores vary around the mean score
what is variance
how the values are dispersed around the mean
what are z-scores
Number of units of standard deviation
any one value is above or below the
mean
The larger the z-score the further its
value is away from the group’s mean
how to calculate z-scores
(raw score - mean) /
standard deviation
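The formula on the card, applied with NumPy to a hypothetical set of scores:

```python
import numpy as np

# Hypothetical raw scores
scores = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

# z = (raw score - mean) / standard deviation, for every score at once
z_scores = (scores - scores.mean()) / scores.std(ddof=1)
```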
what does a significant F-ratio tell you
the model is a better predictor in comparison to the mean
what is the p-value
the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct
what are the degrees of freedom for one sample t-test
sample size - 1
what are the degrees of freedom for one way ANOVA
sample size - k
where k is the number of cell means
what is the ceiling effect
when scores tend to cluster at the upper end of a distribution
what is the floor effect
when a task is so difficult that all scores are very low
what is a pairwise comparison
post hoc compares two individual means at a time
what is a main effect
effect of one IV while ignoring the other IV
what is an interaction effect
the combined effect of two or more IV’s