Descriptive Statistics Flashcards
What are the scales of measurement for numerical data?
Nominal data
Ordinal data
Interval scales
Ratio scales
What are the characteristics of nominal data?
Not really scales, more labels, names, or categories to which cases are assigned
Sometimes known as categorical data
Key characteristic: There is no implied ordering or numerical relationship between categories
When analysing nominal data, what statistic could you use? And what does the analysis look at?
The analysis looks at relative frequency of cases in each category
You should use chi-squared
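As a rough illustration of the card above, the sketch below computes relative frequencies and a chi-squared statistic by hand. The category names and counts are made up, and the expected frequencies assume that cases are spread equally across categories (a goodness-of-fit setup, which is only one way chi-squared is used).

```python
# Hypothetical counts of cases in three nominal categories (invented for illustration).
observed = {"red": 30, "green": 20, "blue": 10}

total = sum(observed.values())

# Relative frequency of cases in each category.
relative_freq = {name: count / total for name, count in observed.items()}

# Assumption: if there were no effect, every category would hold the same number of cases.
expected = total / len(observed)

# Chi-squared statistic: sum over categories of (observed - expected)^2 / expected.
chi_squared = sum((count - expected) ** 2 / expected for count in observed.values())
degrees_of_freedom = len(observed) - 1

print(relative_freq)
print(chi_squared, degrees_of_freedom)
```

The statistic would then be compared against a chi-squared table at the given degrees of freedom to judge significance.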
What are the key characteristics of ordinal data?
Categories to which cases can be assigned can be ranked or ordered in some way
However, we cannot assume that the differences between neighbouring categories are equal
What is the key characteristic of interval scales?
The categories to which cases can be assigned can be ordered in some way and we can assume that the differences between neighbouring categories are equal
However, there is no true zero - the zero point is arbitrary
What are the key characteristics of ratio scales?
The categories to which cases can be assigned can be ordered in some way, we can assume that the differences between neighbouring categories are equal, and there is a true zero
What are descriptive statistics?
They summarise the distribution with a few numbers which describe:
Central tendency - the middle of the distribution (an average, such as the mean or median)
Dispersion - the spread of scores (standard deviation)
What is the mode and what can it be used for?
It is the most common score
The only measure of central tendency that can be used with nominal data
Also good for summarising common incorrect answers to a test
What is the median and what can it be used for?
It is the middle score when the scores are in rank order
If there is an even number of cases then the median is halfway between the two middle values
What is the mean and what is it used for?
The mean is the arithmetic average - add up the scores and divide the total by the number of scores
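The three measures of central tendency described in the cards above can be tried out with Python's standard `statistics` module. This is a minimal sketch; the scores are hypothetical and chosen so that there is an even number of cases.

```python
import statistics

# Hypothetical scores, already in rank order (invented for illustration).
scores = [2, 3, 3, 5, 7, 10]

# Mode: the most common score.
mode_score = statistics.mode(scores)

# Median: with an even number of cases, halfway between the two middle values (3 and 5).
median_score = statistics.median(scores)

# Mean: add up the scores and divide by the number of scores.
mean_score = statistics.mean(scores)

print(mode_score, median_score, mean_score)
```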
How do you calculate the standard deviation?
Calculate the mean
Subtract the mean from each score to get deviations
Square the deviations
Total the squared deviations
Divide the total by N-1 (this is the variance)
Take the square root of the variance to get the standard deviation
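The steps above can be followed one at a time in code. This is a sketch with made-up scores; the last line cross-checks the result against `statistics.stdev`, which also divides by N-1.

```python
import math
import statistics

# Hypothetical scores (invented for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

mean = sum(scores) / n                            # calculate the mean
deviations = [score - mean for score in scores]   # subtract the mean from each score
squared = [d ** 2 for d in deviations]            # square the deviations
total = sum(squared)                              # total the squared deviations
variance = total / (n - 1)                        # divide the total by N-1
sd = math.sqrt(variance)                          # square root of the variance

print(mean, total, variance, sd)
print(math.isclose(sd, statistics.stdev(scores)))
```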
How would you work out a standard score (z-score) when you have the standard deviation?
Once you have the mean and SD for a sample, subtract the sample mean from the score and divide the result by the SD
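Subtracting the sample mean from a score and dividing by the SD gives what is commonly called a standard score or z-score. A minimal sketch follows; the function name `z_score` and the scores are hypothetical.

```python
import statistics

# Hypothetical sample (invented for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sample_mean = statistics.mean(scores)
sample_sd = statistics.stdev(scores)  # sample SD, dividing by N-1


def z_score(score, mean, sd):
    """Return how many standard deviations a score lies from the mean."""
    return (score - mean) / sd


# A score equal to the mean has a z-score of 0; scores above the mean are positive.
print(z_score(5, sample_mean, sample_sd))
print(z_score(9, sample_mean, sample_sd))
```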
What are characteristics of non-parametric tests?
Relatively easy to calculate
Assumption free
Can use nominal and ordinal data
But they are not very powerful as they do not make use of information about the variability of the data
What is power?
The power of a test reflects how sensitive it is - more sensitive tests will detect a significant difference even if that difference is quite small
The more powerful the test, the more likely it is to spot a significant result
What is a type 1 error?
A false positive
The test says there’s a significant effect when really there isn’t
What is a type 2 error?
False negative
The test says there’s no significant effect when really there is
Parametric tests are less likely to make this error because they are more powerful
What is a parametric test?
Parametric tests are more complicated to calculate but are more powerful
They use more informative interval scale data and take advantage of measures of variability and the properties of the normal distribution
But data must meet certain parameters
What are some assumptions of parametric tests?
Interval or ratio scale data - interval between two measurements is meaningful
Scores must be normally distributed
Samples being compared must have similar variances
Aside: the tests are robust (they can cope with some deviation from these ideal conditions)
What does a normal distribution graph look like?
Bell shaped
Symmetrical around the mean
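The symmetry of the normal curve can be checked numerically with `statistics.NormalDist` from the Python standard library (available from Python 3.8). The mean of 100 and SD of 15 are arbitrary values picked for illustration.

```python
from statistics import NormalDist

# Hypothetical normal distribution (mean and SD chosen arbitrarily).
dist = NormalDist(mu=100, sigma=15)

# Height of the curve one SD below the mean, at the mean, and one SD above it.
left = dist.pdf(85)
peak = dist.pdf(100)
right = dist.pdf(115)

# Symmetrical around the mean: equal heights at equal distances either side.
print(abs(left - right) < 1e-12)
# Bell shaped: the curve is highest at the mean.
print(peak > left)
```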
Why study statistics?
People are so variable and our ways of measuring their performance are so error prone
What is a problem of error variability?
People (even the same person tested twice) will give different scores - people are complicated and they will be influenced by conditions not under our control
Because we expect people who are the same to give us different scores, it is difficult to judge whether a difference is meaningful or down to error variability
What is the main problem with error variability?
We need to be able to distinguish between error differences (variability) and meaningful differences
We only consider meaningful differences to be significant
How do you deal with error variability?
Try to eliminate it - good experimental design, matching participants, controlled environment, accurate measurement techniques, control confounding variables (leads to reduction in error)
Try to account for it - work out how big it is and take it into account
Error variability and the logic of statistical tests:
If we find a difference between groups or conditions, we need to know whether it is due to error or is a real difference
We use the variability within a particular condition to estimate the size of the error variability
Since everyone in the same condition should be the same, the variability within a condition must be due to error
We compare the variability between conditions (total variability) with the variability within conditions (error variability) to decide if the difference is more than can be explained by error variability
What variability is variability within a condition?
Error variability
What variability is variability between conditions?
Error variability and any effect of treatment
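One common way to compare between-conditions variability with within-conditions variability is a ratio of the two variances, as in the F ratio from analysis of variance. The cards above do not name a specific test, so treat this as an illustrative sketch only; the scores are invented and the calculation assumes two conditions of equal size.

```python
import statistics

# Hypothetical scores for two conditions of equal size (invented for illustration).
control = [4, 5, 6, 5, 5]
treatment = [7, 8, 9, 8, 8]
n_per_condition = len(control)

# Within-conditions variability: average of the variances inside each condition.
# Everyone in a condition was treated the same, so this estimates error variability.
within_variance = statistics.mean(
    [statistics.variance(control), statistics.variance(treatment)]
)

# Between-conditions variability: variance of the condition means, scaled by the
# number of scores per condition. It reflects error plus any effect of treatment.
condition_means = [statistics.mean(control), statistics.mean(treatment)]
between_variance = n_per_condition * statistics.variance(condition_means)

# A ratio well above 1 suggests more variability than error alone would explain.
ratio = between_variance / within_variance

print(within_variance, between_variance, ratio)
```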
What does variability tell us?
How much scores in a distribution are spread out or clustered together
Estimating variability:
Standard deviation - the average amount that scores differ from the mean
SD gives a good indication of the shape of the distribution of the scores in the sample