Distributions and Probability Flashcards
What is a normal distribution?
When data is symmetrical around central scores
Mean, median and mode are equal
Data should fit along a “Gaussian curve”
How do you calculate Pearson’s coefficient of skew?
3(Mean - Median) / standard deviation
What does it mean if Pearson’s coefficient of skew is <0?
Negatively skewed
What does it mean if Pearson’s coefficient of skew is >0?
Positively skewed
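A minimal sketch of the skew formula above, using Python's standard library. The scores are made up for illustration, and using the sample SD (n - 1) is an assumption; the card does not say which SD to use.

```python
import statistics

def pearson_skew(values):
    """Pearson's coefficient of skew: 3 * (mean - median) / SD."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return 3 * (mean - median) / sd

def describe_skew(skew):
    """Map the sign of the coefficient to the labels used on the cards."""
    if skew > 0:
        return "positively skewed"
    if skew < 0:
        return "negatively skewed"
    return "symmetrical"

# Hypothetical scores with a long right tail (mean 3.5, median 3)
scores = [1, 2, 2, 3, 3, 3, 4, 10]
skew = pearson_skew(scores)
print(round(skew, 2), describe_skew(skew))
```

The mean is pulled towards the long tail more than the median is, so the sign of (mean - median) gives the direction of the skew.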
What does a Gaussian curve mean?
The curve is fully defined by the mean and standard deviation, so from these two values alone you can predict the y value (probability density) for any x value
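As a sketch of what "predict y for any x" means, the function below evaluates the standard Gaussian density formula. The mean of 100 and SD of 15 are hypothetical values chosen only for illustration.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Height (probability density) of the Gaussian curve at x."""
    coeff = 1.0 / (sd * math.sqrt(2 * math.pi))
    exponent = -((x - mean) ** 2) / (2 * sd ** 2)
    return coeff * math.exp(exponent)

# Hypothetical distribution: mean 100, SD 15
mean, sd = 100, 15
for x in (70, 85, 100, 115, 130):
    print(x, gaussian_pdf(x, mean, sd))
```

The curve peaks at the mean and is symmetrical, so x = 85 and x = 115 give the same height.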
What do parametric tests assume?
Normal distribution
Values such as mean and standard deviation accurately reflect population distribution
What are examples of parametric statistical tests?
t-test
- one-sample
- independent
- paired
ANOVA
- factorial
- one-way
Correlation
Regression
What does a normal distribution say about percentages of population within certain standard deviations?
About 68% of population within (mean +/- 1 * SD)
About 95% of population within (mean +/- 2 * SD)
About 99.7% of population within (mean +/- 3 * SD)
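These percentages can be checked with the error function, since the proportion of a normal population within k SDs of the mean is erf(k / sqrt(2)). This is a small sketch; the function name is made up for illustration.

```python
import math

def proportion_within(k):
    """Proportion of a normal population within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mean +/- {k} SD: {100 * proportion_within(k):.1f}%")
# mean +/- 1 SD: 68.3%
# mean +/- 2 SD: 95.4%
# mean +/- 3 SD: 99.7%
```

The figures on the card are rounded; exactly 95% of the population falls within about 1.96 SDs of the mean.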
What does transforming data into z scores do?
Standardises data to a common scale with a mean of 0 and SD of 1 (the shape of the distribution, including any skew, stays the same)
Can tell us how many standard deviations someone was from the mean
What are the pros of z scores?
Can transform data to standardised scale
Scale follows the standard normal distribution (mean 0, SD 1) when the original data are normally distributed
Can compare things relative to their own population
Use the entire dataset
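A short sketch of the z-score transformation, z = (score - mean) / SD. The scores are hypothetical, and dividing by the population SD is an assumption; use the sample SD if the data are a sample.

```python
import statistics

def z_scores(values):
    """Convert raw scores to z scores: (score - mean) / SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD; an assumption
    return [(v - mean) / sd for v in values]

# Hypothetical scores with mean 5 and population SD 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(scores))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A score of 9 has z = 2.0, meaning it sits two standard deviations above the mean of its own population.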
What is the standard error?
Tells how much the sample mean is expected to vary from one sample to another (the standard deviation of the sampling distribution of the mean)
How confident are we that we know the true population mean?
Use SE of the mean to say how confident we are that our sample values represent the population
Smaller = better
What are the largest influences on the standard error?
Variability of original data (standard deviation of population)
Total N used to create sample mean
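A sketch of how the two influences combine, using the usual formula SE = SD / sqrt(N). The helper names and the numbers are made up for illustration.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean from raw data: sample SD / sqrt(N)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def se_from_summary(sd, n):
    """Standard error of the mean from a known SD and sample size."""
    return sd / math.sqrt(n)

print(se_from_summary(sd=10, n=25))   # 2.0
print(se_from_summary(sd=10, n=100))  # 1.0 (larger N, smaller SE)
print(se_from_summary(sd=20, n=100))  # 2.0 (more variable data, larger SE)
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.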
What is a confidence interval?
Range of values, calculated from a sample, that would contain the true population value (e.g. the population mean) in a certain proportion of repeated samples (e.g. 95%)
Can be used for visualisation - error bars
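A minimal sketch of a confidence interval for a mean, built as mean +/- z * SEM. It assumes a normal approximation (z of about 1.96 for 95%); for small samples a t distribution would give a wider interval. The reaction times are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Approximate CI for the mean using the normal distribution."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    # z value that leaves (1 - confidence) / 2 in each tail (an approximation)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem

# Hypothetical reaction times in milliseconds
times = [512, 498, 530, 475, 520, 505, 490, 515, 500, 485]
low, high = mean_confidence_interval(times)
print(f"95% CI: {low:.1f} to {high:.1f}")
```

The two returned values are the lower and upper ends of an error bar drawn around the sample mean.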
What are error bars?
Can be SD, SEM or CI but must be explicit which one is being used
Non-overlapping 95% CI bars usually indicate a significant difference between conditions; non-overlapping SEM bars alone do not guarantee one