Final Exam Terms Flashcards
When are sample proportions normally distributed (i.e. what assumptions must be met?)
np >= 10
n(1 - p) >= 10
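A minimal sketch of checking the two conditions above. The function name and the `threshold` parameter are illustrative choices, not something the cards define.

```python
def proportions_approx_normal(n, p, threshold=10):
    """Return True when both np and n(1 - p) are at least the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold


# Made-up inputs for illustration only.
print(proportions_approx_normal(100, 0.5))   # both counts are 50
print(proportions_approx_normal(100, 0.05))  # np is only about 5
```

The second call fails the check because p is close to 0, so np falls below 10.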
The sample proportions might NOT follow a normal distribution if …
- p is close to 0 or 1
- small n
When do we use two-sided or one-sided tests?
When Ha uses "does not equal", use both tails (one-tail p x 2)
When Ha uses < or >, use one tail to compute p
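A sketch of how the tail choice changes the p-value for a z test statistic. It uses the standard library's `statistics.NormalDist`; the function name and the labels "less", "greater", and "two-sided" are assumptions made for this example.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """p-value for a z statistic; alternative is 'less', 'greater', or 'two-sided'."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if alternative == "less":       # Ha uses <  -> left tail
        return cdf(z)
    if alternative == "greater":    # Ha uses >  -> right tail
        return 1 - cdf(z)
    if alternative == "two-sided":  # Ha uses "does not equal" -> double one tail
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# z = 1.96 is a made-up test statistic for illustration.
print(p_value_from_z(1.96, "greater"))
print(p_value_from_z(1.96, "two-sided"))
```

The two-sided result is twice the one-tailed result for the same z.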
What assumptions must be met for a one proportion CI?
nphat >= 10
n(1 - phat) >= 10
How do we calculate a sample size for a one proportion test?
n = (z*/ME)^2 x p~(1 - p~)
p~ (p squiggle) is an estimate for the proportion
ALWAYS ROUND UP
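A sketch of the sample size calculation, including the round-up step. The function and parameter names are illustrative; z* is taken from the standard normal via `statistics.NormalDist`, and the default estimate of 0.5 is the usual choice when no estimate is available.

```python
import math
from statistics import NormalDist


def sample_size_one_proportion(margin_of_error, confidence=0.95, p_estimate=0.5):
    """n = (z*/ME)^2 * p~ * (1 - p~), always rounded up."""
    # z* is the critical value that leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin_of_error) ** 2 * p_estimate * (1 - p_estimate)
    return math.ceil(n)  # round up, never down


# Made-up target: 3% margin of error at 95% confidence.
print(sample_size_one_proportion(0.03))  # 1068
```

Rounding up guarantees the margin of error is no larger than the target.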
What is p~ (p squiggle)?
It is the estimated proportion
If no estimate is provided, use p~ = 0.5
How is a hypothesis test set up for a two proportion test?
H0: p1 = p2
Ha: p1 does not equal p2
How do you interpret the confidence interval of a two proportion test?
We are __% confident that the difference in the population proportions of (insert variables) is between ___ and ____.
How do you set up a hypothesis test for a chi square test?
H0: p1 = p2 = p3 ….
Ha: at least one p does not equal its value in H0
What is the formula for chi square goodness of fit test?
X^2 (chi-square) = the sum, over all categories, of (observed - expected)^2 / expected
In a chi square test, what is the formula for expected counts?
expected count = n(p sub i)
p sub i is the proportion for category i given in the null hypothesis
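A worked sketch of the goodness of fit calculation: expected counts from the null proportions, then the chi-square statistic. The observed counts and null proportions are made up for illustration, and the function names are not from the cards.

```python
def expected_counts(n, null_proportions):
    """Expected count for each category: n times its null proportion."""
    return [n * p for p in null_proportions]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: three categories, n = 100.
observed = [30, 50, 20]
null_proportions = [0.25, 0.5, 0.25]

expected = expected_counts(sum(observed), null_proportions)
stat = chi_square_statistic(observed, expected)
print(expected)  # [25.0, 50.0, 25.0]
print(stat)      # 2.0
```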
When do we use a t distribution?
1 mean, 2 means, or paired means
When do we use a z distribution?
1 or 2 proportions
When do we use a chi square distribution?
When we have more than 2 proportions
How does a chi-square distribution appear? (i.e. what shape?)
right skewed
What happens to a chi-square distribution as df increases?
The skew decreases and the distribution approaches a normal distribution.
What assumptions must be met for chi-square distribution?
each of the expected counts must be >= 5
What tail test do we use for finding p-value with a chi-square test?
always the right tail
What chi-square test do we use for two categorical variables?
chi square test for association
(goodness of fit for one categorical variable)
How do we set up a hypothesis test for a chi square test for association?
H0: variable A is not associated with variable B
Ha: variable A is associated with variable B
How do we calculate expected counts for chi-square test of association?
expected count = (row total x column total)/sample size (n)
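A sketch of applying this formula to every cell of a two-way table. The table is stored as a list of rows, and the counts are made up for illustration.

```python
def expected_table(observed):
    """Expected count per cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts.
observed_table = [[10, 20],
                  [30, 40]]
print(expected_table(observed_table))  # [[12.0, 18.0], [28.0, 42.0]]
```

The expected counts keep the same row and column totals as the observed table.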
How do we calculate degrees of freedom for chi-square tests?
goodness of fit: df = k - 1 (k = number of categories)
association: df = (r - 1)(c - 1) (r = number of rows, c = number of columns)
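The two df rules as small helper functions. The function names are illustrative; k is taken to be the number of categories, and r and c the number of rows and columns in the two-way table (totals excluded).

```python
def goodness_of_fit_df(k):
    """df for a goodness of fit test with k categories."""
    return k - 1


def association_df(r, c):
    """df for a test of association on an r-by-c table."""
    return (r - 1) * (c - 1)


print(goodness_of_fit_df(4))  # 3
print(association_df(3, 4))   # 6
```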
How do we graph two quantitative variables?
scatterplot
What does correlation do?
Measures the strength and direction of a linear relationship between two quantitative variables.
How do we describe correlation in terms of parameters and statistics?
parameter: rho
statistic: r aka correlation coefficient
What is the correlation coefficient range?
the smallest r can be is -1, the largest it can be is 1
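A sketch of computing r by hand. The cards do not give a formula, so this uses the standard Pearson formula; the data are made up to hit the two extremes of the range.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfectly linear, increasing
print(pearson_r([1, 2, 3], [6, 4, 2]))  # perfectly linear, decreasing
```

Any other data set gives a value between these two extremes.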
What does it mean if r is positive?
as one variable increases, the other variable increases
direct/positive relationship
What does it mean if r is negative?
as one variable increases, the other variable decreases
inverse relationship
What does it mean if r is 0?
There is no linear relationship
The farther r is away from zero …
The stronger the linear relationship
(min -1, max +1)
How do you set up a correlation hypothesis test?
H0: rho = 0
(no linear relationship, variables are not correlated)
Ha: rho does not equal 0
(linear relationship, variables are correlated)
Does correlation imply causation?
Not necessarily; consider whether the study is observational or experimental, and whether there are any confounding variables
Is r resistant to outliers?
no
What does linear regression do?
Uses one quantitative variable to predict changes in another quantitative variable.
In other words, it uses an explanatory variable to predict changes in the response variable
What is the linear regression equation?
y hat = a + bx
y hat: predicted response value
a = y intercept; predicted value of y when x = 0
b = slope; change in y for one unit change in x
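A small sketch of using the equation to predict. The intercept and slope values are hypothetical, chosen only to show what a and b mean.

```python
def predict(a, b, x):
    """Predicted response: y hat = a + b * x."""
    return a + b * x


# Hypothetical fitted values: intercept a = 2.0, slope b = 0.5.
a, b = 2.0, 0.5
print(predict(a, b, 0))                       # equals a, the y intercept
print(predict(a, b, 11) - predict(a, b, 10))  # equals b, the slope
```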
What is the difference between simple and multiple linear regression?
simple has one explanatory variable
multiple has two or more explanatory variables
How is the residual calculated?
residual = actual y - predicted y
or = y - y hat
What does it mean if the residual is positive or negative?
positive residuals are above the line of best fit
negative residuals are below the line of best fit
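A sketch tying the residual formula to its sign. The function names and the example values are made up for illustration.

```python
def residual(actual_y, predicted_y):
    """residual = y - y hat."""
    return actual_y - predicted_y


def position_relative_to_line(res):
    """Where the observed point sits relative to the line of best fit."""
    if res > 0:
        return "above"
    if res < 0:
        return "below"
    return "on"


# Hypothetical point: observed y = 8, predicted y hat = 7.
res = residual(8, 7)
print(res, position_relative_to_line(res))  # 1 above
```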
What is ANOVA?
analysis of variance; helps determine if there is a difference between two or more means
For ANOVA, what are the factor and response?
factor is the x variable, a categorical variable
response is the y variable, a quantitative variable
How is the hypothesis test set up for ANOVA?
H0: mu1 = mu2 = mu3 …
Ha: at least one mu does not equal another mu
How is df error found?
df error = df total - df factor, where df total = n - 1 and df factor = #groups - 1
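A sketch of the three ANOVA degrees of freedom, assuming n is the total number of observations across all groups. The function name and the example numbers are illustrative.

```python
def anova_degrees_of_freedom(n, groups):
    """Return (df total, df factor, df error) for one-way ANOVA."""
    df_total = n - 1
    df_factor = groups - 1
    df_error = df_total - df_factor  # simplifies to n - groups
    return df_total, df_factor, df_error


# Hypothetical study: 30 observations split across 3 groups.
print(anova_degrees_of_freedom(30, 3))  # (29, 2, 27)
```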
Describe F-distributions
F-distributions are right skewed; always use the right tail when using the F-statistic to find the p-value
How do you interpret a Tukey Comparison?
Tukey comparisons ensure that the Type I error rate is not inflated when comparing many pairs of means.
If the interval for the difference between two means does not contain 0, those two means are different.
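A sketch of the zero check, assuming each Tukey comparison is reported as a lower and upper bound on the difference between two group means. The group labels and interval endpoints below are made up.

```python
def means_differ(lower, upper):
    """True when the interval for the difference excludes 0."""
    return not (lower <= 0 <= upper)


# Hypothetical Tukey intervals for pairwise differences in means.
intervals = {
    "B - A": (1.2, 4.8),   # entirely above 0
    "C - A": (-0.9, 2.7),  # contains 0
    "C - B": (-5.1, -0.3), # entirely below 0
}
for pair, (lower, upper) in intervals.items():
    print(pair, "different" if means_differ(lower, upper) else "no evidence of a difference")
```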