1: INTRODUCTION Flashcards
categorical scales of measurement
nominal:
- numbers or names serve as labels
- but no numerical relationship between values
discrete or continuous scales of measurement
ordinal: e.g., race position
- data is organised by rank
- values represent true numerical relationships
- but intervals between values may not be equal
interval: e.g., shoe size
- true numerical relationships and intervals between values are equal
- but scale has no true 0 point
ratio: e.g. distance
- true numerical relationships, equal intervals and true zero point
when do we use the mean to measure central tendency?
discrete or continuous data which is normally distributed
measure of spread - standard deviation
when do we use median to measure central tendency?
discrete or continuous data which is not normally distributed
measure of spread - range
when do we use mode to measure central tendency?
categorical data
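A minimal sketch of the three pairings above, using Python's standard `statistics` module. The scores and colour labels are made-up illustration data, and `summarise` is a hypothetical helper name.

```python
import statistics

def summarise(values):
    """Return the centre/spread measures named on the cards for numeric data."""
    return {
        "mean": statistics.mean(values),      # centre for normally distributed data
        "sd": statistics.stdev(values),       # sample standard deviation (divides by n - 1)
        "median": statistics.median(values),  # centre for non-normal data
        "range": max(values) - min(values),   # spread paired with the median
    }

scores = [2, 4, 4, 5, 7, 9, 11]  # made-up numeric scores
summary = summarise(scores)

# The mode also works on labels, which is why it suits categorical data.
favourite = statistics.mode(["red", "blue", "red", "green"])
```

The mean and median are undefined for labels such as colours, so the mode is the only one of the three that applies to nominal data.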
when can we make claims about causality?
only if we have controlled for confounding variables
- using random allocation, counterbalancing etc.
- not always possible (quasi experimental designs)
True-experimental IVs
- IVs are actively manipulated
- random allocation is possible
- e.g., sport context (2 levels: solo, competitive)
- e.g., treatment group (3 levels: placebo, drug, counselling)
Quasi-experimental IVs
- IV reflects fixed characteristics
- random allocation is not possible (must be cautious about implying causality)
- e.g., handedness (2 levels: right, left)
- e.g., age (3 levels: 18-20yr, 20-22yr, 22-24yr)
between-subjects design
independent groups
- participants exposed to only one IV level
- e.g. intervention vs. control
within-subjects design
repeated measures
- participants exposed to all IV levels
mixed designs
at least one IV is between subjects AND at least one IV is within subjects
kurtosis
the sharpness of the peak of a frequency distribution curve
Sharpest to least sharp:
- leptokurtic: small sd
- mesokurtic
- platykurtic: large sd
skew
positive skew - peak sits to the left (near the y axis), long tail extends to the right
negative skew - peak sits to the right (away from the y axis), long tail extends to the left
bimodal distribution
bell shaped
BUT
2 peaks
not normally distributed (don’t use parametric)
uniform distribution
all values occur with the same frequency (appears like a block)
not normally distributed (don’t use parametric)
population and sample parameters
PP: the true value
μ = …, 𝜎 = …
SP: the estimate
x̄, s = 8.19
sampling error
degree to which sample statistics differ from underlying population parameters
minimising error:
- representative (random selection)
- sufficient in size
Z-scores
converted from a normally distributed population
Z = (x - 𝜇) / 𝜎
95% of values lie within ±1.96 standard deviations of the mean
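A small sketch of the z-score formula and the 1.96 rule, assuming a population with a made-up mean of 100 and standard deviation of 15. `z_score` is a hypothetical helper; `NormalDist` comes from Python's standard library.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Convert a raw score to a z-score: (x - mu) / sigma."""
    return (x - mu) / sigma

# Made-up population values, for illustration only.
z = z_score(130, mu=100, sigma=15)  # 2.0

# Proportion of a standard normal distribution lying between -1.96 and +1.96.
standard_normal = NormalDist()  # mean 0, standard deviation 1
central_share = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
```

`central_share` comes out very close to 0.95, which is where the 1.96 cut-off comes from.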
sampling distribution
distribution of a statistic across an infinite number of samples (e.g., sampling distribution of the mean)
if the mean of each sample is plotted, infinite samples will form a normal distribution
- the mean of the sampling distribution of the mean is equivalent to the population mean
- the standard deviation of the sampling distribution of the mean is called the standard error
standard error
- the standard deviation of the sampling distribution
- a function of sample size
- SE decreases as sample size increases (sampling error decreases as sample size increases)
SE = 𝜎 / √n
(standard deviation / square root of n)
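A hedged simulation sketch of the standard error. An infinite number of samples cannot be drawn, so 5000 samples stand in for the sampling distribution. The population values (mean 50, standard deviation 10) and the sample size of 25 are assumptions chosen for illustration.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

random.seed(1)                    # fixed seed so the run is repeatable
mu, sigma, n = 50.0, 10.0, 25     # assumed population values and sample size

# Draw many samples and record each sample mean.
sample_means = []
for _ in range(5000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)   # should sit close to mu
simulated_se = statistics.pstdev(sample_means)   # spread of the sample means
theoretical_se = standard_error(sigma, n)        # 10 / 5 = 2.0
```

The simulated spread of the sample means should land near the theoretical value, and increasing `n` shrinks both.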
estimated standard error
- sampling distributions are theoretical, we never know real SE
- ESE is an estimate of standard error, based on our sample
ESE = s / √n
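A minimal sketch of the estimated standard error, computed from one sample only. The four scores are made up, and `estimated_standard_error` is a hypothetical helper name.

```python
import math
import statistics

def estimated_standard_error(sample):
    """ESE: sample standard deviation s divided by the square root of n."""
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(len(sample))

sample = [4, 6, 8, 10]  # made-up scores
ese = estimated_standard_error(sample)
```

The only difference from the SE formula is that the sample value s replaces the unknown population 𝜎.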
confidence intervals (CIs)
we use statistics to estimate population parameters
- x̄ is a single point estimate of 𝜇
- estimates are subject to sampling error
CIs are interval estimates of population parameters (usually 95%)
We are declaring that there is still a chance that our estimates are wrong
finding the population mean (CIs)
- if there’s a 95% chance that the sample mean falls within the 95% bounds of the population mean it follows that:
- there’s a 95% chance that the population mean falls within the 95% CIs of the sample mean
calculating confidence intervals
t-distribution
- spread of scores varies according to sample size
to calculate 95% CIs
- look for critical value of t where 2.5% of scores are higher/lower (t_0.975)
- 95% CIs around x̄:
x̄ ± t_0.975 × ESE
NB. where n > 1000, t_0.975 ≈ 1.96
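A sketch of the 95% CI calculation, assuming the critical t value is looked up in a table and passed in by hand, since Python's standard library has no t-distribution. `confidence_interval` is a hypothetical helper, the scores are made up, and 3.182 is the tabled critical value for 3 degrees of freedom (n - 1 for four scores).

```python
import math
import statistics

def confidence_interval(sample, t_crit=1.96):
    """Return (lower, upper) bounds: sample mean plus or minus t_crit * ESE.

    t_crit must match the sample's degrees of freedom (n - 1).
    The default of 1.96 is only appropriate for very large samples.
    """
    mean = statistics.fmean(sample)
    ese = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = t_crit * ese
    return mean - margin, mean + margin

sample = [4, 6, 8, 10]  # made-up scores
low, high = confidence_interval(sample, t_crit=3.182)
```

With only four scores the critical value is much larger than 1.96, so the interval is wide. It narrows as the sample grows, both because t shrinks and because the ESE shrinks.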
null hypothesis (H₀)
there is no difference between the population means
- start by assuming this is true
if we find a difference between the sample means, we ask:
- what is the chance of measuring a difference of that magnitude if the null hypothesis is true
P-values
p-value: the probability of measuring a difference of that magnitude if the null hypothesis is true
α (alpha): threshold level of probability at which we are willing to reject the null hypothesis
- typically α = .05
if p ≤ α we reject the null hypothesis
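A hedged sketch of the decision rule. It assumes the test statistic is a z-score, so the p-value can be read from the standard normal distribution. That is a simplification for illustration and not the only way a p-value is obtained. Both function names are hypothetical, and the z value of 2.5 is made up.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Probability of a z at least this far from 0, in either direction, if the null is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def reject_null(p, alpha=0.05):
    """Apply the decision rule: reject the null hypothesis when p is at or below alpha."""
    return p <= alpha

p = two_sided_p(2.5)       # made-up test statistic
decision = reject_null(p)  # True, because p is roughly .012
```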
type I error
the null hypothesis is true
we reject the null
type II error
the null hypothesis is false
we fail to reject the null
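A simulation sketch of both error types. To keep it short it tests one sample mean against a hypothesised value with a known 𝜎, which is a simplification of the two-means case on the cards. All names and numbers (sample size 20, true effect of 0.5, 4000 trials) are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def rejects_null(sample, mu0, sigma, alpha=0.05):
    """z-test of the sample mean against mu0, assuming sigma is known."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p <= alpha

def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=20, trials=4000, seed=7):
    """Share of simulated studies in which the null (mean == mu0) is rejected."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejections += rejects_null(sample, mu0, sigma)
    return rejections / trials

# Null is true (true mean equals mu0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=0.0)

# Null is false (true mean is 0.5): every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=0.5)
```

When the null is true, the Type I rate should sit close to alpha, because alpha is exactly the share of false rejections we have agreed to tolerate.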