Stats Flashcards
Frequency
Number of times each score occurs.
Normal distribution
Most scores cluster around the mean, with progressively fewer scores out towards the extremes.
Recognisable by: bell-shaped curve
Positively skewed distribution
Frequent scores are clustered at the lower end.
Recognisable by: a slide down to the right.
Negatively skewed distribution
Frequent scores are clustered at the higher end.
Recognisable by: a slide down to the left.
Platykurtic distribution
Scores are spread widely, giving a flatter distribution with more scores in the tails.
Recognisable by: thick “platypus” tail that’s low and flat in the graph.
Leptokurtic distribution
Scores are tightly clustered around the mean, with few in the tails.
Recognisable by: skyscraper appearance - long and pointy.
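A minimal sketch, assuming the scipy library is available, of putting numbers on skew and kurtosis for a hypothetical set of scores:

```python
# Sketch: quantifying distribution shape with scipy (illustrative data).
from scipy.stats import skew, kurtosis

scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]   # hypothetical scores

print("skew:", skew(scores))          # > 0 suggests positive (right) skew
print("kurtosis:", kurtosis(scores))  # excess kurtosis: > 0 leptokurtic, < 0 platykurtic
```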
Mode
Most common score.
If two scores are equally common, the distribution is bimodal and no single mode can be reported.
Can use mode for nominal data.
Disadvantages of mode
1) Could be bimodal (or multimodal) and give no true mode (e.g. modes at 3/10 and 7/10 sit at opposite ends of the scale, so neither summarises the data).
2) A mode can be changed dramatically if one single case is added.
Median
Central point of the scores arranged in ascending order. With an odd number of cases it is the middle score; with an even number, the mean of the two central scores.
+ Relatively unaffected by outliers
+ Less affected by skewed distribution
+ Ordinal, interval, and ratio data
- Can’t use on nominal
- Susceptible to sampling fluctuation
- Not mathematically useful
Mean
Add all scores and divide by total number of scores collected.
+ Good for scores grouped around the centre
+ Interval and ratio data
+ Uses every score
+ Can be used algebraically
+ Resistant to sampling variation - accurate
- Affected by extreme outliers
- Affected by skewed distributions
- Not used for nominal and ordinal
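A minimal sketch, using Python's built-in statistics module and hypothetical scores, showing how the mode and median resist an extreme outlier while the mean is pulled towards it:

```python
# Sketch: central tendency with the built-in statistics module (illustrative data).
import statistics

scores = [3, 5, 5, 6, 7, 8, 41]   # hypothetical scores with one extreme outlier

print("mode:", statistics.mode(scores))      # most common score (5)
print("median:", statistics.median(scores))  # middle score (6) - unaffected by the outlier
print("mean:", statistics.mean(scores))      # pulled up by the outlier (75 / 7 = 10.71...)
```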
Range
Subtract smallest score from largest.
+ Good for clustered scores
+ Useful measure of variability for ordinal data
Symbols
x = score
x̅ = mean
x - x̅ = deviation (d)
∑ = sum
N = number in a sample
s² = variance of a sample
s = standard deviation
Accuracy of the mean
A hypothetical value - it may not correspond to any real score (e.g. 2.5 children). How well it represents the data is assessed using:
1) Standard deviation
2) Sum of squares
3) Variance
Total error
Worked out by adding all the deviations. Positive and negative deviations cancel out, so the total error is always zero - which is why the deviations are squared to give the sum of squares.
Deviation
Observed value - mean
Negative deviation = the mean overestimates this participant's score.
Positive deviation = the mean underestimates it.
Sum of squared errors (SS)
Square all deviations so they become positive.
Add them together to make a sum of squares.
The higher the sum of squares, the more variance in the data.
More variance = the mean represents the data less reliably.
Standard deviation (σ)
A measure of spread - does a score reflect expected variation or something significant?
Anything within one standard deviation of the mean is expected variation; anything more than two standard deviations from the mean is treated as statistically significant (unusual).
Calculated as the square root of the variance: s = √(SS / (N - 1)).
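A minimal sketch (illustrative data, using the sample formula SS / (N - 1), consistent with s² above) of how deviations, sum of squares, variance, and standard deviation relate to each other:

```python
# Sketch: from deviations to standard deviation (illustrative data, sample formulas).
scores = [4, 6, 7, 9, 14]
n = len(scores)
mean = sum(scores) / n                       # x-bar

deviations = [x - mean for x in scores]      # d = x - x-bar (sums to zero)
ss = sum(d ** 2 for d in deviations)         # sum of squared errors (SS)
variance = ss / (n - 1)                      # s^2, sample variance
sd = variance ** 0.5                         # s, standard deviation

print(deviations, ss, variance, sd)
```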
Sampling distribution
The frequency distribution of sample means from the same population.
Standard error of the mean (SE)
The accuracy with which a sample mean reflects the population mean - the standard deviation of the sampling distribution of the mean, estimated as s / √N.
Large value = different from the population
Small value = reflective of population
Confidence interval
If we can assess the accuracy of sample means, we can calculate the boundaries within which most sample means will fall. This is the confidence interval.
If the sample mean represents the data well, the confidence interval around it should be narrow.
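A minimal sketch (hypothetical scores; the 1.96 multiplier assumes a normal sampling distribution) showing how the standard error and an approximate 95% confidence interval follow from the sample standard deviation:

```python
# Sketch: standard error and an approximate 95% confidence interval (illustrative data).
import statistics

scores = [12, 15, 14, 10, 13, 16, 11, 14]    # hypothetical sample
n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)                # sample standard deviation (N - 1)
se = sd / n ** 0.5                           # standard error of the mean = s / sqrt(N)

# Rough 95% CI using the normal approximation (1.96 * SE); for small samples
# a t critical value would be more appropriate.
ci_lower, ci_upper = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.2f}, SE = {se:.2f}, 95% CI = [{ci_lower:.2f}, {ci_upper:.2f}]")
```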
Descriptive statistics
Shows what is happening in a given sample.
Inferential statistics
Allows us to draw conclusions (inferences) about the wider population from the sample we have analysed.
At what probability value can we accept a hypothesis and reject a null hypothesis?
0.05 or less.
Type 1 error
When we believe our experimental manipulation has been successful when the result is actually due to random error. E.g. if we accept 5% as the significance level and repeat the experiment 100 times, we would still expect about 5 significant results that are due to random error alone.
Type 2 error
Concluding that the difference found was due to random error when it was actually due to the independent variable (a genuine effect is missed).
Effect size
An objective and standardised measure of the magnitude of the observed effect. As it is standardised, we can compare effect sizes across different studies.
Pearson’s correlation coefficient (r)
Measures the strength of a correlation between 2 variables. Also a versatile measure of the strength of an experimental effect. How big are the differences (the effect)?
0 = no effect 1 = perfect effect
Cohen’s guidelines to effect size
r = 0.10 (small effect): explains 1% of total variance.
r = 0.30 (medium effect): 9%
r = 0.50 (large effect): 25%
N.B. Not a linear scale (i.e. .6 is not double .3)
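A minimal sketch, assuming scipy is available and using hypothetical hours_revised and exam_score data, of obtaining Pearson's r and the r² (variance explained) figure that Cohen's guidelines refer to:

```python
# Sketch: Pearson's r as an effect size measure (illustrative data).
from scipy.stats import pearsonr

hours_revised = [1, 2, 3, 4, 5, 6, 7, 8]         # hypothetical predictor
exam_score    = [40, 45, 50, 48, 60, 62, 70, 68]  # hypothetical outcome

r, p = pearsonr(hours_revised, exam_score)
print(f"r = {r:.2f}, p = {p:.3f}")
print(f"r^2 = {r**2:.2f} (proportion of variance explained)")
```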
Why use effect size?
To show the magnitude of the effect we are observing - how substantial is the [p < .05] significance in practical terms?
Is not affected by sample size in the way that p is.
Properties linked to effect size:
1) Sample size on which the sample effect size is based.
2) The probability level at which we will accept an effect as being statistically significant.
3) The power of the test to detect an effect of that size.
Statistical power
The probability that a given test will find an effect, assuming one exists in the population. Can be done before a test to reduce Type II error.
By convention, power should be 0.8 (80%) or higher.
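A minimal sketch of an a priori power calculation, assuming the statsmodels library is available; the effect size here is Cohen's d (not r), and the 0.5 / 0.05 / 0.80 values are illustrative choices:

```python
# Sketch: a priori power analysis for an independent t-test using statsmodels.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,        # medium effect (Cohen's d)
                                   alpha=0.05,             # significance level
                                   power=0.80,             # desired power
                                   alternative='two-sided')
print(f"participants needed per group: {n_per_group:.0f}")
```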
What assumptions need to be met for a parametric test?
1) Data measured at interval or ratio level.
2) Homogeneity of variance (Levene’s test)
3) Sphericity assumption (Mauchly's test)
4) Sample should form a normal distribution.
Checks for normal distribution
1) Plot a histogram to see if data is symmetrical.
2) Kolmogorov-Smirnov test or Shapiro-Wilk test.
3) Mean and median should be less than half a standard deviation different.
4) Kurtosis and skew values should be less than twice their standard error.
Kolmogorov-Smirnov or Shapiro-Wilk
Compares your set of scores with a normally distributed set (with the same mean and standard deviation)
We do not want a significant result (p < .05), because that would mean our data differ significantly from the normal set.
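A minimal sketch of running a Shapiro-Wilk check with scipy on hypothetical scores:

```python
# Sketch: checking normality with the Shapiro-Wilk test (illustrative data).
from scipy.stats import shapiro

scores = [12, 15, 14, 10, 13, 16, 11, 14, 13, 12]   # hypothetical sample

stat, p = shapiro(scores)
if p < 0.05:
    print("Data differ significantly from a normal distribution")
else:
    print("No significant departure from normality detected")
```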
Homogeneity of variance
Individual scores in samples vary from the mean in a similar way.
Tested using Levene’s test.
Levene’s test
Tests homogeneity of variance, i.e. whether the individual scores in the samples vary from their means in a similar way.
An assumption for a parametric test.
If group sizes are unequal, we must run additional tests: the Brown-Forsythe F and Welch's F adjustments.
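A minimal sketch of Levene's test with scipy on two hypothetical groups with visibly different spreads:

```python
# Sketch: testing homogeneity of variance with Levene's test (illustrative groups).
from scipy.stats import levene

group_a = [5, 6, 7, 6, 5, 8, 7, 6]    # hypothetical scores, condition A
group_b = [4, 9, 2, 10, 3, 11, 5, 8]  # hypothetical scores, condition B (more spread)

stat, p = levene(group_a, group_b)
if p < 0.05:
    print("Variances differ significantly - homogeneity assumption violated")
else:
    print("No significant difference in variances")
```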
T-test
The difference between means as a function of the degree to which those means would differ by chance alone.
Independent t-test
Compares the means of two separate groups (between-subjects design) to see if the difference between them is statistically significant.
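A minimal sketch of an independent t-test with scipy on two hypothetical groups (the default assumes equal variances; a Welch correction can be requested with equal_var=False):

```python
# Sketch: independent t-test on two hypothetical groups (between-subjects design).
from scipy.stats import ttest_ind

control      = [10, 12, 11, 13, 9, 12, 11, 10]   # hypothetical control group
experimental = [14, 15, 13, 16, 14, 15, 13, 17]  # hypothetical experimental group

t, p = ttest_ind(experimental, control)          # assumes equal variances by default
print(f"t = {t:.2f}, p = {p:.4f}")
```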