Exam 3 Flashcards
What is important to remember about the sampling distribution of means?
A population with a normal distribution has a distribution of sample means that is normal
What statistic can you use to test a distribution of sample means? (first one)
- Z standardization; however, you rarely know the population standard deviation, so substitute the sample standard deviation and use Student's t-distribution
Describe Student's t-distribution.
- similar to standard normal distribution (z) but with fatter tails
- as the sample size increases, the t distribution becomes more like the standard normal distribution
What can the t-distribution be used for?
- It can be used to accurately calculate a confidence interval for the mean of a population with a normal distribution
- (sample mean) - (t critical value x SE) < (population mean) < (sample mean) + (t critical value x SE), where SE is the standard error of the mean and the t critical value has n - 1 degrees of freedom
What is the standard error of the mean?
SE(y) = s / sqrt(n), where s is the sample standard deviation and n is the sample size
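A minimal sketch of the standard error and the confidence interval for a mean, using only the Python standard library. The function names and the data are made up for illustration, and the t critical value is assumed to be looked up by the reader (here 2.776, the two-sided 95% value for 4 degrees of freedom).

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    return s / math.sqrt(len(sample))


def mean_confidence_interval(sample, t_critical):
    """Return (lower, upper) bounds for the population mean.

    t_critical is supplied by the caller, e.g. read from a t table
    for n - 1 degrees of freedom.
    """
    mean = statistics.mean(sample)
    margin = t_critical * standard_error(sample)
    return mean - margin, mean + margin


# Hypothetical data; 2.776 is the two-sided 95% critical value for df = 4.
sample = [2, 4, 4, 4, 6]
se = standard_error(sample)
lower, upper = mean_confidence_interval(sample, t_critical=2.776)
print(f"SE = {se:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval is centered on the sample mean, and it narrows as n grows because SE shrinks.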
What is a one sample t-test?
compares the mean of a random sample from a normal population with the population mean proposed in a null hypothesis
What are the hypotheses and test statistic in a one sample t test?
H0 - true mean equals u0
Ha - true mean does not equal u0
Test statistic - t = (sample mean - u0) / SE, with n - 1 degrees of freedom
How do you interpret the t-statistic in a one sample t-test?
- compute the p-value: the probability of a t-statistic this extreme or more extreme, given that the null hypothesis is true
- if the p-value is > 0.05, you fail to reject the null hypothesis; if it is < 0.05, you reject it
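A rough sketch of the one-sample calculation: the t-statistic, then a two-sided p-value. In practice the p-value usually comes from statistical software or a table; here, as an assumption made only to keep the example self-contained, it is approximated by integrating the t density with Simpson's rule. All function names and data are hypothetical.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for H0: true mean equals mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu0) / se
    return t, n - 1


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value: area in both tails beyond |t|.

    Uses Simpson's rule on [0, |t|]; steps must be even.
    math.gamma overflows for very large df, so this is for small samples.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return max(0.0, 1 - 2 * area_zero_to_t)


# Hypothetical data and null value.
t, df = one_sample_t([2, 4, 4, 4, 6], mu0=3)
p = two_sided_p(t, df)
print(f"t = {t:.3f}, df = {df}, reject H0 at 0.05: {p < 0.05}")
```

Because the t-distribution is symmetric, the two-sided p-value is twice the area in one tail.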
How does increasing sample size affect a one sample t test?
- increasing sample size reduces the standard error of the mean
- increases the probability of rejecting a false null hypothesis (power)
What are the assumptions of a one-sample t-test?
- data are a random sample from the population
- variable is normally distributed in the population (robust to departures)
What are the confidence intervals for variance and standard deviation? And what are the assumptions of these statistics?
Assumptions: random sample from the population, variable must have a normal distribution (formulas are NOT robust to departures from normality)
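The interval itself is not written out on this card. A sketch of the standard chi-square form, under the normality assumption above: df x s^2 / (upper chi-square value) < variance < df x s^2 / (lower chi-square value), with square roots of both bounds giving the interval for the standard deviation. The function names are hypothetical, and the chi-square critical values are assumed to be read from a table by the caller.

```python
import math


def variance_confidence_interval(sample_variance, n, chi2_upper, chi2_lower):
    """Return (lower, upper) bounds for the population variance.

    chi2_upper: chi-square value with alpha/2 of the area to its right.
    chi2_lower: chi-square value with alpha/2 of the area to its left.
    Both use n - 1 degrees of freedom and are supplied by the caller.
    """
    df = n - 1
    return df * sample_variance / chi2_upper, df * sample_variance / chi2_lower


def sd_confidence_interval(sample_variance, n, chi2_upper, chi2_lower):
    """Interval for the standard deviation: square roots of the variance bounds."""
    low, high = variance_confidence_interval(sample_variance, n, chi2_upper, chi2_lower)
    return math.sqrt(low), math.sqrt(high)


# Hypothetical sample: s^2 = 2.0, n = 5.
# For df = 4 and a 95% interval, table values are 11.143 and 0.484.
var_lo, var_hi = variance_confidence_interval(2.0, 5, 11.143, 0.484)
sd_lo, sd_hi = sd_confidence_interval(2.0, 5, 11.143, 0.484)
print(f"variance: ({var_lo:.3f}, {var_hi:.3f}), SD: ({sd_lo:.3f}, {sd_hi:.3f})")
```

Note that the interval is not symmetric around s^2, because the chi-square distribution is skewed.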
What are two different study designs when comparing two means?
two-sample and paired designs
What is a two sample design?
- two groups
- each group is composed of independent sample of units
What is a paired design?
- two groups
- each sampled unit receives both treatments
- paired designs are usually more powerful because of control for variation among sampling units
How are paired designs treated?
- paired measurements are converted to a single measurement by taking the difference between them
What is a paired t-test?
- used to test the null hypothesis that the mean difference of paired measurements equals a specific value
- null is often that the difference (change) is zero before and after treatment
How does a paired t-test compare to a one sample t-test?
- The same process except the calculation of the test statistic occurs on the difference value (d)
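A small sketch of the paired calculation: reduce each pair to a difference, then run the one-sample steps on the differences. The function name and the before/after values are invented for illustration.

```python
import math
import statistics


def paired_t(first, second, mu_d0=0.0):
    """Return (mean difference, t, df) for H0: mean difference equals mu_d0."""
    if len(first) != len(second):
        raise ValueError("paired data need one value per unit in each group")
    # One difference per sampling unit (second minus first).
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(n)
    return mean_d, (mean_d - mu_d0) / se_d, n - 1


# Hypothetical before/after measurements on four units.
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]
mean_d, t, df = paired_t(before, after)
print(f"mean difference = {mean_d}, t = {t:.3f}, df = {df}")
```

The degrees of freedom are the number of pairs minus one, not the total number of measurements minus one.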
What does the p-value indicate in a paired t-test?
- P-value > 0.05: fail to reject the null hypothesis that the mean change is zero
- P-value < 0.05: reject the null hypothesis that the mean change is zero
What are the assumptions of a paired t-test?
- sampling units are randomly sampled from the population
- paired differences have a normal distribution in the population
What test is a formal test of normality?
the Shapiro-Wilk test
What are the hypotheses of the Shapiro-Wilk test? Why should it be used with caution?
- H0: the sample comes from a population with a normal distribution
- Ha: the sample comes from a population that does not have a normal distribution
- Should be used with caution:
- small sample sizes lack power to reject a false null (Type 2 error)
- large sample sizes can reject null when the departure from normality is minimal and would not affect methods that assume normality
Under the null hypothesis, the sampling distribution of the one-sample t statistic follows a _________
t distribution with n-1 DOF
Describe the t distribution.
- The area under the curve to the left (lower tail) of -t is the same as the area to the right (upper tail) of t
- t distribution is symmetrical around the mode of zero
What does a Shapiro-Wilk test evaluate?
- evaluates the goodness of fit of a normal distribution to a set of data randomly sampled from a population
What is something other than a t-test that you may be asked to do with paired measurements?
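- Can calculate the 95% CI for the true mean difference
- If the interval includes zero, the data are consistent with no difference
- If the interval lies entirely below or entirely above zero, the data are consistent with a decrease or an increase, respectively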
How do you find the confidence interval for difference between two means (two sample t-test)?
- statistic of interest (mean 1 - mean 2)
- calculate pooled sample variance
- then calculate confidence interval for difference between two means
What is pooled sample variance and what is it used for?
- the average of the sample variances, weighted by their degrees of freedom
- used for calculating confidence interval for the difference between means in two sample test
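A sketch of the three steps for the confidence interval of a difference between two means: the difference, the pooled variance, then the interval. Function names and data are hypothetical; the t critical value (2.571 for 5 degrees of freedom, 95%) is assumed to be looked up by the reader, with df = n1 + n2 - 2.

```python
import math
import statistics


def pooled_variance(group1, group2):
    """Average of the two sample variances, weighted by degrees of freedom."""
    df1, df2 = len(group1) - 1, len(group2) - 1
    weighted = df1 * statistics.variance(group1) + df2 * statistics.variance(group2)
    return weighted / (df1 + df2)


def difference_confidence_interval(group1, group2, t_critical):
    """Return (lower, upper) bounds for (mean 1 - mean 2)."""
    diff = statistics.mean(group1) - statistics.mean(group2)
    sp2 = pooled_variance(group1, group2)
    # Standard error of the difference between the two sample means.
    se_diff = math.sqrt(sp2 * (1 / len(group1) + 1 / len(group2)))
    return diff - t_critical * se_diff, diff + t_critical * se_diff


# Hypothetical data; 2.571 is the two-sided 95% critical value for df = 5.
group1 = [1, 2, 3]
group2 = [2, 4, 6, 8]
sp2 = pooled_variance(group1, group2)
lower, upper = difference_confidence_interval(group1, group2, t_critical=2.571)
print(f"pooled variance = {sp2:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```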
How do you compare the means in a two sample test?
a two sample t-test
What does a two sample t-test do? What are the null and alternative hypotheses?
- compares the means of a numerical variable with two independent groups
- H0: mean 1 = mean 2
- Ha: mean 1 does not equal mean 2
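A minimal sketch of the two-sample t-statistic under the equal-variance assumption: the difference between the sample means divided by its standard error, which is built from the pooled variance. The function name and data are made up for illustration.

```python
import math
import statistics


def two_sample_t(group1, group2):
    """Return (t, df) for H0: mean 1 = mean 2, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * statistics.variance(group1)
           + (n2 - 1) * statistics.variance(group2)) / df
    se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, df


# Hypothetical data for two independent groups.
t, df = two_sample_t([1, 2, 3], [2, 4, 6, 8])
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is compared with a t-distribution with n1 + n2 - 2 degrees of freedom.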
What are the assumptions of a two sample t-test?
- each of the two samples is a random sample from its population
- numerical variable is normally distributed in each population (robust to minor deviations)
- standard deviation (and therefore variance) of the numerical variable is the same in both populations (robust to some deviations if the sample sizes are approximately equal)
What is the formal test of equal variance? What are the hypotheses?
- Levene's test
- H0: variances of the two groups are equal
- Ha: variances of the two groups are not equal
- Can be extended to more than two groups
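A sketch of how the Levene statistic can be computed, assuming the original form of the test that uses absolute deviations from each group mean (some software centers on group medians instead, which gives a different value). The function name and data are hypothetical. The statistic W is compared with an F distribution with k - 1 and N - k degrees of freedom; that lookup is left out here.

```python
import statistics


def levene_statistic(*groups):
    """Levene's W using absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(y - statistics.mean(g)) for y in g] for g in groups]
    z_means = [statistics.mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Variation of the group mean deviations around the grand mean deviation.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    # Variation of the deviations within each group.
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical data; works for two or more groups.
w = levene_statistic([1, 2, 3], [2, 4, 6, 8])
print(f"W = {w:.3f}")
```

In effect the test is an ANOVA on the absolute deviations, which is why it extends naturally to more than two groups.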
What do you do to compare the means in a two sample t-test if the variances in the two groups are not equal?
- the standard t-test works well if both sample sizes are greater than 30 and there is less than a threefold difference in standard deviations
- Welch's t-test can be used even when the variances of the two groups are not equal; it has slightly less power than the standard t-test
What is Welch’s t-test?
- Welch’s t-test compares the means of two groups and can be used even when the variances of the two groups are not equal
- slightly less power compared to the standard t-test
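A sketch of Welch's statistic for comparison with the standard test: each group keeps its own variance in the standard error instead of a pooled variance, and the degrees of freedom come from the Welch-Satterthwaite approximation (usually not a whole number). The function name and data are made up for illustration.

```python
import math
import statistics


def welch_t(group1, group2):
    """Return (t, approximate df) without assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    # Squared standard error contributed by each group separately.
    v1 = statistics.variance(group1) / n1
    v2 = statistics.variance(group2) / n2
    t = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data with clearly different spreads.
t, df = welch_t([1, 2, 3], [2, 4, 6, 8])
print(f"t = {t:.3f}, df = {df:.2f}")
```

The approximate df is never larger than n1 + n2 - 2, which is one way to see the small loss of power.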
What is important with units when comparing means of two groups?
- correct sampling units
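- when comparing means of two groups, an assumption is that the samples being analyzed are random samples; repeated measurements taken on the same sampling unit are not independent and should not be treated as separate data points
- fish in streams example: the stream is the sampling unit, so analyze the proportion of fish surviving in each stream rather than each individual fish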
What is the fallacy of indirect comparison?
- the error of comparing each group mean to a hypothesized value rather than comparing the group means to each other
- example of the faulty reasoning: "since group 1 is significantly different from zero but group 2 is not, groups 1 and 2 are significantly different from each other"
- comparisons between two groups should be made directly, not indirectly by comparing each group to the same hypothesized value
What are the four potential options to address violations of assumptions?
- Ignore the violations
- Transform the data
- Use a non-parametric method
- Use a permutation test
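To make the last option concrete, here is a sketch of a permutation test for a difference between two means. As a simplifying assumption it enumerates every possible reassignment of the pooled values to the two groups, which is only practical for very small samples; real analyses usually draw a large number of random permutations instead. The function name and data are hypothetical.

```python
import itertools
import statistics


def exact_permutation_p(group1, group2):
    """Two-sided p-value for the difference in means by full enumeration."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    observed = abs(statistics.mean(group1) - statistics.mean(group2))
    total = sum(pooled)
    count = 0
    extreme = 0
    # Every way of choosing which pooled values are labeled "group 1".
    for indices in itertools.combinations(range(len(pooled)), n1):
        sum1 = sum(pooled[i] for i in indices)
        diff = abs(sum1 / n1 - (total - sum1) / n2)
        count += 1
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / count


# Hypothetical data: 6 possible relabelings, 2 are as extreme as observed.
p = exact_permutation_p([1, 2], [3, 4])
print(f"p = {p:.3f}")
```

The p-value is the share of relabelings whose difference is at least as large as the observed one, so no normal distribution is assumed.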