Lecture 4 - Probability, Sampling and Distributions Flashcards
Define probability theory
The branch of mathematics concerned with the study of random phenomena, i.e. chance.
Using the Gaussian equation, we can predict the value of y for any value of x from just the…
Mean and standard deviation
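As a rough illustration of the card above, the Gaussian equation can be written as a small function that needs only the mean and standard deviation. The function name and the example scale (mean 100, s.d. 15) are made up for this sketch.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Height (y) of the normal curve at x, given only the mean and s.d."""
    coefficient = 1.0 / (sd * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * sd ** 2)
    return coefficient * math.exp(exponent)

# Hypothetical scale with mean 100 and s.d. 15
peak = gaussian_pdf(100, mean=100, sd=15)          # curve is highest at the mean
one_sd_above = gaussian_pdf(115, mean=100, sd=15)  # lower one s.d. away
print(peak, one_sd_above)
```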
A positive skew moves the data peak of a normal distribution to the…
Left, and vice versa for a negative skew
Pearson’s coefficient of skew uses… to…
The difference between the mean and median… Measure the skew in terms of both magnitude and direction (positive or negative)
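A minimal sketch of this measure, assuming the commonly quoted form 3 × (mean - median) / s.d. (often called Pearson's second coefficient of skew). The data are invented; the sign of the result gives the direction of the skew and its size gives the magnitude.

```python
import statistics

def pearson_skew(values):
    """3 * (mean - median) / s.d.: positive for a right tail, negative for a left tail."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample s.d. (n - 1 denominator)
    return 3 * (mean - median) / sd

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]     # made-up data with a tail on the right
left_tailed = [-v for v in right_tailed]     # mirror image, tail on the left

print(pearson_skew(right_tailed))  # positive
print(pearson_skew(left_tailed))   # negative, same magnitude
```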
When data is positively skewed, i.e. the tail is on the right of the mean, mean>…
Median>mode
When data is negatively skewed i.e. tail is on the left of the mean, mean<…
Median<mode
What percentage of the sample are within 1 s.d. of the mean?
About 68%
What percentage of the sample is within 2 s.d.s of the mean?
About 95%
What percentage of the sample is within 3 s.d.s of the mean?
About 99.7%
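The three percentages above can be checked numerically. This sketch assumes a perfectly normal distribution and uses the standard library's error function; the helper name is hypothetical.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution lying within k s.d.s of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    # Roughly 68%, 95% and 99.7% respectively
    print(f"within {k} s.d.: {proportion_within(k):.1%}")
```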
Parametric tests assume that the mean and standard deviation…
Accurately represent the population distribution
Data can be transformed by…
Performing the same mathematical operation(s) on every recorded value
Data transformation is useful for…
+ reducing the impact of outliers/skew
+ standardisation, e.g. z scores are a consistent, universal unit
+ removing non-linear effects
+ theoretical reasons - using different measures to better understand the data
+ making the data normally distributed so that parametric tests can be used.
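One possible illustration of reducing skew: applying a log transformation to every value. The data below are invented (powers of two) so that the effect is easy to see; real data will rarely become perfectly symmetric.

```python
import math
import statistics

raw = [1, 2, 4, 8, 16, 32, 64]           # made-up values with a long right tail
logged = [math.log2(v) for v in raw]     # same operation applied to every value

# Before: the mean is pulled above the median by the tail (positive skew)
print(statistics.mean(raw), statistics.median(raw))
# After: mean and median coincide, so the skew is gone
print(statistics.mean(logged), statistics.median(logged))
```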
The z score is calculated by…
Subtracting the mean from each score and dividing the result by the standard deviation
The z score tells us…
How many standard deviations a score lies above or below the mean (which is 0 on the z scale)
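A short sketch of the z score calculation on invented scores. It assumes the scores are treated as the whole population (`statistics.pstdev`); for a sample, `statistics.stdev` would be the usual substitute.

```python
import statistics

def z_scores(values):
    """Subtract the mean from each score, then divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: scores are the whole population
    return [(v - mean) / sd for v in values]

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up scores: mean 5, s.d. 2
z = z_scores(scores)
print(z)  # a score of 9 is 2 s.d.s above the mean, a score of 2 is 1.5 below
```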
Sampling error is the difference…
Between the mean of a sample and the true mean of the population (which is also why sample means differ from one another)
The standard error of a statistic tells us…
How much that statistic is likely to vary from one sample to another
The standard error of the mean is…
A measure of how confident we are that we know the true population mean, calculated by dividing the standard deviation by the square root of the sample size.
S.e.m. is dependent on…
- The standard deviation of the population (variability of the original data) - smaller s.d. = smaller s.e.m. = more representative, higher confidence level
- The number of data points used to create the sample mean - larger sample size = smaller s.e.m. = more representative, higher confidence level
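The two dependencies listed above can be seen directly in the formula s.e.m. = s.d. / √n. The function names and numbers here are illustrative only.

```python
import math
import statistics

def sem_from_sd(sd, n):
    """Standard error of the mean from a standard deviation and a sample size."""
    return sd / math.sqrt(n)

def sem_from_sample(sample):
    """Same calculation, estimating the s.d. from the sample (n - 1 denominator)."""
    return sem_from_sd(statistics.stdev(sample), len(sample))

# Made-up numbers to show both dependencies
print(sem_from_sd(10, 25))    # baseline
print(sem_from_sd(10, 100))   # four times the data: s.e.m. halves
print(sem_from_sd(5, 25))     # half the variability: s.e.m. halves
```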
Confidence intervals are often used as an alternative to…
S.e.m.
A confidence interval of a certain percentage indicates…
How confident we can be that the range contains the true mean, e.g. about 95% of 95% c.i.s computed from repeated samples would contain it
Confidence intervals are computed by…
Multiplying the s.e.m. by the z value for the chosen confidence level, then adding the result to and subtracting it from the sample mean, e.g. for a 95% c.i. use mean ± 1.96 × s.e.m.
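A minimal sketch of a 95% confidence interval built from the s.e.m., assuming the normal approximation with the 1.96 multiplier. For small samples a t value is typically used instead. The sample is invented.

```python
import math
import statistics

def confidence_interval_95(sample):
    """Return (lower, upper) as mean -/+ 1.96 * s.e.m., using the normal approximation."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = 1.96 * sem
    return mean - margin, mean + margin

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample with mean 5
lower, upper = confidence_interval_95(sample)
print(lower, upper)  # an interval centred on the sample mean
```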
Error bars can be…
Either 95% c.i.s or s.e.m.s.
The normal distribution, using various methods, allows us to determine:
- the probability of a certain score or range of scores occurring
- the probability that the population mean falls within a certain range
- the probability that populations are different, e.g. through error bar differences
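The first item in the list above can be sketched with the standard library's `statistics.NormalDist`. The mean of 100 and s.d. of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)   # hypothetical population

# Probability of a score below 85 (more than 1 s.d. below the mean)
p_below_85 = dist.cdf(85)

# Probability of a score between 85 and 115 (within 1 s.d. of the mean)
p_between = dist.cdf(115) - dist.cdf(85)

print(p_below_85, p_between)  # roughly 16% and 68%
```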