Module 4 Flashcards
Normal distribution and t-tests
What is a normal distribution?
Numerical (continuous) probability distribution that:
1) Is symmetrical around the mean
2) Has one mode (mode = median = mean)
3) Has lower probability density for values farther from the mean
What are the characteristics of a normal distribution?
1) It is defined by the mean and the standard deviation
2) Area under the curve = 1
3) Probability is measured as the area under the curve between a lower bound and an upper bound
4) About 68% (roughly 2/3) of the area lies within 1 SD of the mean
5) 95% of the area lies within 1.96 SD of the mean
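The area rules above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `area_within` is made up for illustration.

```python
from statistics import NormalDist

def area_within(k_sd, mean=0.0, sd=1.0):
    """Area under a normal curve within k_sd standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    # Probability = area between the lower bound and the upper bound.
    return dist.cdf(mean + k_sd * sd) - dist.cdf(mean - k_sd * sd)

print(round(area_within(1.0), 4))   # about 0.6827
print(round(area_within(1.96), 4))  # about 0.95
```

The result does not depend on which mean and SD are used, because the bounds are expressed in SD units.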
What is the normal distribution of means?
If a variable, Y, is normally distributed, then the sampling distribution of its sample mean is also normal, whatever the sample size
What is a standard normal distribution?
A normal distribution where the mean = 0 and SD = 1
What is a standard normal deviate (Z)?
A measure of how far a particular value (Y) is from the mean, given in standard deviations
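Written as a formula, with the symbols mu (population mean) and sigma (population SD) assumed here since the card does not name them:

```latex
Z = \frac{Y - \mu}{\sigma}
```

For example, with hypothetical values Y = 130, mu = 100 and sigma = 15, Z = (130 - 100) / 15 = 2, so the value lies 2 SD above the mean.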
What is a standard normal table used for?
To determine the probability of a randomly selected value being above a given cutoff value
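In place of a printed table, the same upper-tail probability can be computed directly. This is a sketch with the standard library; `upper_tail` is a hypothetical helper name.

```python
from statistics import NormalDist

def upper_tail(y, mean, sd):
    """P(value > y) for a normal variable, via the standard normal deviate Z."""
    z = (y - mean) / sd                 # convert the cutoff to SD units
    return 1.0 - NormalDist().cdf(z)    # area to the right of z

# Cutoff of 1.96 on the standard normal: roughly 2.5% of values lie above it.
print(upper_tail(1.96, mean=0.0, sd=1.0))
```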
What is the central limit theorem?
The sampling distribution of means is approximately normal even when the variable itself is not normally distributed, given the sample size is large enough
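A small simulation can illustrate the theorem. The sketch below assumes a skewed exponential population with mean 1 and SD 1 (chosen for illustration, not taken from the card) and checks that the sample means behave roughly like a normal distribution centred on the population mean.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(n, reps, rng):
    """Means of `reps` samples of size `n` from a skewed exponential population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

def fraction_within(means, centre, half_width):
    """Share of sample means lying within `half_width` of `centre`."""
    return sum(abs(m - centre) <= half_width for m in means) / len(means)

rng = random.Random(0)                  # fixed seed so the run is repeatable
means = sample_means(n=50, reps=2000, rng=rng)
se = 1.0 / sqrt(50)                     # population SD / sqrt(n)

print(mean(means), stdev(means), se)
print(fraction_within(means, 1.0, 1.96 * se))  # expected to be near 0.95
```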
What is the t-distribution and how is it useful?
1) Z-standardization of a sample mean requires the population standard deviation, which is rarely known
2) The t-distribution instead uses the standard error of the mean estimated from the sample (sample SD / square root of n) in place of the true standard error
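The two standardizations can be set side by side. The notation is assumed here (the card gives no symbols): Y-bar is the sample mean, mu the population mean, sigma the population SD, s the sample SD, and n the sample size.

```latex
Z = \frac{\bar{Y} - \mu}{\sigma / \sqrt{n}}
\qquad
t = \frac{\bar{Y} - \mu}{s / \sqrt{n}}
```

The only difference is that t replaces the unknown sigma with its sample estimate s.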
How is the t-distribution different from the standard normal distribution?
1) The estimated standard error of the mean varies from sample to sample (sampling error)
2) The t-distribution has fatter tails, and approaches the standard normal distribution as the degrees of freedom increase
How is the degrees of freedom calculated for the t-distribution?
df = (sample size) - 1
What is the one-sample t-test used for?
To compare the mean of a sample with a value for the population mean under the null hypothesis
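A minimal sketch of the test statistic and its degrees of freedom, using only the standard library. The function name and the data are hypothetical, and the p-value lookup (a t table or a statistics package) is left out.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test against the null mean mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)        # estimated standard error of the mean
    t = (mean(sample) - mu0) / se
    return t, n - 1                     # df = (sample size) - 1

# Hypothetical measurements tested against a null mean of 5.0
t, df = one_sample_t([5.1, 4.9, 5.6, 5.8, 6.0], mu0=5.0)
print(t, df)
```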
What are the assumptions of t-tests?
1) Random sample
2) Variable is normally distributed
What is the paired t-test used for?
To compare the mean of the differences between paired measurements with a value for the population mean difference under the null hypothesis
What are the characteristics of a paired t-test?
1) Special case of the one-sample t-test where 2 treatments are applied to each sample unit
2) Paired measurements are converted to single measurements by calculating the difference within each pair (therefore n = # of pairs)
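The reduction to a one-sample test can be sketched as follows. The function name and the before/after values are hypothetical, and a null mean difference of 0 is assumed by default.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second, mu_diff0=0.0):
    """Return (t, df) for a paired t-test on two equal-length measurement lists."""
    if len(first) != len(second):
        raise ValueError("paired measurements must have the same length")
    diffs = [b - a for a, b in zip(first, second)]  # one difference per pair
    n = len(diffs)                                  # n = number of pairs
    se = stdev(diffs) / sqrt(n)
    return (mean(diffs) - mu_diff0) / se, n - 1

# Hypothetical before/after measurements on the same three units
t, df = paired_t([10, 12, 14], [12, 16, 20])
print(t, df)
```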
What is the difference between paired and two-sample t-tests?
1) Paired t-test has 2 treatments applied to every sampled unit
2) 2-sample tests have 2 independent samples, where each sample receives a different treatment
Which design is better to use: Paired design or 2-sample design?
Paired design, when it is feasible, because it gives more control over extraneous variation between units (higher power)
What is the 2-sample t-test used for?
1) 2 treatments are applied to separate, independent samples from 2 populations
2) Commonly used to determine if the means of 2 populations are equal
How do you calculate the degrees of freedom for a 2-sample test?
df = df1 + df2 = (n1 - 1) + (n2 - 1) = n1 + n2 - 2
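A sketch of how these degrees of freedom enter the calculation. The cards do not give the test statistic, so the pooled-variance form below is an assumption; it is the usual choice when the SD is taken to be the same in both populations. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Return (t, df) for a two-sample t-test using a pooled variance."""
    n1, n2 = len(a), len(b)
    df1, df2 = n1 - 1, n2 - 1
    df = df1 + df2                      # same as n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = (df1 * variance(a) + df2 * variance(b)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df

# Hypothetical measurements from two independent samples
t, df = two_sample_t([1, 2, 3], [3, 4, 5])
print(t, df)
```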
What are the assumptions of a paired t-test?
1) Randomly sampled pairs
2) Differences between pairs are normally distributed (not the individual variable itself)
What are the assumptions of the 2-sample test?
1) Random sample from both populations
2) Variable is normally distributed in both populations
3) SD/variance is the same in both populations