WEEK 4 - Revision of descriptive statistics and intro to SPSS Flashcards
What are descriptive statistics?
- Statistics used simply to describe the data collected
- To screen data and observe trends
What are inferential statistics?
- Use sample to infer something about the population
- Test whether a relationship/difference seen in a sample is sufficiently large to assume it may be real in the population
- Allow us to test hypotheses and make decisions based on sample data
How do you characterise a data set?
Look at:
1. Central Tendency (mean, median, mode)
2. Variability (sum of squares, variance, SD, range, Standard error)
3. Shape (modality, skew, kurtosis)
Under a true (ideal) normal distribution, what would the mean, median and mode be?
The mean, median and mode would all have the same exact value
What is modality?
The number of central clusters a distribution possesses
(i.e. the number of peaks in a plot of the distribution)
- bimodal (scores vary around 2 central points)
- Unimodal (scores vary around 1 central point)
What is kurtosis?
Peakedness (how tightly clustered scores are around the mean)
leptokurtic - small spread (tall, narrow peak)
mesokurtic (normal) - moderate spread
platykurtic - large spread (low, flat peak)
What is skew?
The degree of asymmetry in the tails of the distribution (positive skew: longer right tail; negative skew: longer left tail)
What is the normality assumption?
- The distribution is unimodal
- The distribution has moderate peakiness
- The distribution has symmetric tails
What is the mean?
The sum of a set of numbers divided by the number of numbers that you have
Why is the mean useful?
Tells us something useful about the centre of a data set
But does not tell us anything about the variability around the mean
Why is the Sum of Squares useful?
Tells us about the total variability in the data set (the sum of squared deviations from the mean), but does not characterise how much each participant typically varies around the mean
What does the SD tell us?
SD tells us the average amount of variability around the mean
Useful in telling us the degree of variability around the mean
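The cards on the mean, Sum of Squares and SD can be tied together with a short sketch. The function name `describe` and the scores are made up for illustration, and the sketch assumes the sample formula (dividing by n - 1).

```python
import math

def describe(scores):
    """Return the mean, sum of squares, sample variance and sample SD."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squares: total squared deviation of every score from the mean
    ss = sum((x - mean) ** 2 for x in scores)
    # Sample variance: assumed here to divide by n - 1 (use n for a population)
    variance = ss / (n - 1)
    # SD: square root of the variance, back in the original units
    sd = math.sqrt(variance)
    return {"mean": mean, "ss": ss, "variance": variance, "sd": sd}

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data
result = describe(scores)
print(result)
```

The Sum of Squares grows with every extra participant, whereas the SD is scaled by the number of scores, which is why the SD is the easier of the two to interpret.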
What is the purpose of central tendency?
Provides an estimate of the level of performance in each condition
What is the purpose of variability?
Tells us how reliable our estimate (from the central tendency) is
What is the ‘golden rule’ regarding central tendency and variability?
The golden rule here is that a measure of central tendency without an accompanying measure of variability cannot be accurately interpreted
What is the formula for a statistic (in general, overall)?
estimated effect size / estimated error
When do we consider something ‘significant’?
If the Stat Value is sufficiently far from the centre of the probability distribution (i.e. into the tails), we make a judgment that the Stat Value is significantly different from the mean
p < .05
p less than .05 is the criterion for what we consider to be significant
What is a z score?
If we know the Mean (M) and the Standard Deviation (SD) of our data set, we can convert any score (X) to a Z-score simply by subtracting the mean, and then scaling (i.e. dividing) by the Standard Deviation
What does the z stat tell us?
The Z stat is basically telling us how many Standard Deviations away from the Mean a particular score is
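The conversion described above can be written as a one-line function. This is only a sketch; the name `z_score` and the example numbers (a score of 130 on a scale with M = 100, SD = 15) are assumptions for illustration.

```python
def z_score(x, mean, sd):
    """Convert a raw score to a z score: subtract the mean, divide by the SD."""
    return (x - mean) / sd

# Made-up example: a score of 130 where M = 100 and SD = 15
z = z_score(130, 100, 15)
print(z)  # number of SDs the score lies above the mean
```

A positive z means the score is above the mean, a negative z means it is below.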
Z score hypothesis test
In effect we have been asking does this particular individual belong to/differ from a particular population (of which we know the mean and SD)
What is more generally the case is that we are asking questions about a group of people, where the population mean and SD may be unknown.
What is a single sample z test?
If we want to compare the mean of a group of people's scores (which is normally the case in an experiment), we need to compare it against a distribution of group mean scores (a distribution of means)
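A minimal sketch of a single sample z test, assuming the population mean and SD are known and that the distribution of means has an SD equal to the population SD divided by the square root of n. The function name, the numbers and the two-tailed p value are assumptions for illustration, not taken from the lecture.

```python
import math
from statistics import NormalDist

def single_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Compare a sample mean against the distribution of means."""
    # SD of the distribution of means (assumes the population SD is known)
    se = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / se
    # Two-tailed p value from the standard normal distribution
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Made-up example: 25 people with a mean of 106; population M = 100, SD = 15
z, p = single_sample_z(106, 100, 15, 25)
print(z, p)
```

In this made-up example the group mean sits two standard errors above the population mean, which falls just inside the p < .05 criterion.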
What are the three types of distributions?
A) population
B) sample
C) distribution of means
The bigger your error variance, the ___ your statistic will be
smaller
What is the central limit theorem?
States that the sampling distribution of the mean is itself approximately normal (given a large enough sample size), and that it has a smaller error and smaller variability than the distribution of the individual scores
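A small simulation can make this concrete. It is only a sketch: the uniform (non-normal) population, the sample size of 25 and the 2000 repeated samples are all arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# A made-up, deliberately non-normal (uniform) population of scores
population = [random.uniform(0, 100) for _ in range(10000)]

n = 25  # size of each sample
# Draw many samples and keep the mean of each: a distribution of means
sample_means = [statistics.mean(random.sample(population, n)) for _ in range(2000)]

sd_scores = statistics.pstdev(population)   # variability of individual scores
sd_means = statistics.pstdev(sample_means)  # variability of the sample means

print(sd_scores, sd_means, sd_scores / n ** 0.5)
```

The SD of the sample means should come out much smaller than the SD of the scores, and close to the SD of the scores divided by the square root of n.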
If we want to test a sample mean we should compare it to..
a distribution of sample means
What is the standard error of the mean (SM)?
SM is the sample SD divided by the square root of the number of observations in the sample
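As a rough sketch of that formula, using made-up scores and a hypothetical function name `standard_error`:

```python
import math
import statistics

def standard_error(scores):
    """SM = sample SD divided by the square root of the number of observations."""
    # statistics.stdev is the sample SD (it divides by n - 1)
    return statistics.stdev(scores) / math.sqrt(len(scores))

sem = standard_error([2, 4, 4, 4, 5, 5, 7, 9])  # made-up example data
print(sem)
```

Because n is in the denominator, collecting more observations shrinks SM even when the SD of the scores stays the same.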
What is t?
When we estimate the population distribution from a sample, we no longer use the standard normal distribution (since one or more population parameters are unknown)
Instead, we use a special family of distributions called t distributions, which approximate the Z distribution but change shape according to the degrees of freedom (n - 1)
How are t - distributions different to z and normal distributions?
The t distribution is very similar to the z (normal) distribution, BUT it is slightly different for each sample size
It gets closer to normal as sample size increases
There is slightly more error involved because we have estimated the population variance, so slightly more of the distribution lies in the tails
The smaller the sample size, the smaller the df, the larger the critical t value that must be exceeded
What are the types of t-tests?
Single sample T-test
Repeated measures T-test
What is the difference between a Z test and a T-test?
In a Z test, the population SD is known
In a T-test, the population SD is unknown –> instead of the population SD we use s, which is the estimated population standard deviation
What is a single sample t-test?
Where you just have a mean and you want to know whether it's different to some hypothesised value –> could be the population mean or a number like 102
We want to know if the difference between your mean and that comparison value is significantly greater than you would expect to find by chance
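A minimal sketch of the single sample t statistic, assuming made-up scores and the comparison value of 102 mentioned above. The function name is hypothetical, and the sketch stops at t and df; in practice SPSS (or a t table) supplies the p value.

```python
import math
import statistics

def single_sample_t(scores, comparison_value):
    """t = (sample mean - comparison value) / standard error of the mean."""
    n = len(scores)
    mean = statistics.mean(scores)
    # s (the estimated population SD) replaces the unknown population SD
    se = statistics.stdev(scores) / math.sqrt(n)
    t = (mean - comparison_value) / se
    df = n - 1  # degrees of freedom
    return t, df

scores = [104, 110, 98, 107, 101, 109, 105, 106]  # made-up example data
t, df = single_sample_t(scores, 102)
print(t, df)
```

Here t comes out at roughly 2.12 with df = 7, which is below the tabled two-tailed critical value of 2.365, so this made-up difference would not count as significant at p < .05.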
What is a repeated measures T-test
Same as a single sample t-test but for a repeated measure. So, compares someone’s scores from before and after (looks at the difference between before and after)
A repeated measures T-test is actually just a single sample T-test on a set of difference scores, where the comparison value is zero (zero because that means nothing changed)
So to do a repeated measures T-test, take the mean difference score, subtract the hypothesised difference score (which is zero), divide by the standard error of the mean difference, and then establish whether the result is significant or not.
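The same steps can be sketched in code. The before/after scores and the function name are made up for illustration; the sketch only computes t and df and leaves the significance decision to a t table or SPSS.

```python
import math
import statistics

def repeated_measures_t(before, after):
    """Single sample t-test on difference scores, with a comparison value of zero."""
    # One difference score per participant (after minus before)
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference
    se = statistics.stdev(diffs) / math.sqrt(n)
    # Hypothesised difference is zero (nothing changed)
    t = (mean_diff - 0) / se
    return t, n - 1

before = [10, 12, 9, 11, 13]   # made-up scores at time 1
after = [12, 15, 10, 14, 14]   # made-up scores at time 2
t, df = repeated_measures_t(before, after)
print(t, df)
```

Note that only the difference scores enter the calculation, so stable differences between participants do not add to the error term.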