Hypothesis Testing Flashcards
What is a type 1 error? What is it equal to? Give an example.
Type 1 error = reject H0 when it’s true
P(Type 1 error) = alpha = significance level
E.g. Convicting an innocent person
What is a type 2 error? What is it equal to? Give an example.
Type 2 error = failing to reject (accepting) H0 when it’s false
P(Type 2 error) = 1 - Power
E.g. Letting a guilty person go free
What does it mean if we have a smaller significance level?
Smaller alpha = smaller probability of a type 1 error
We want to minimise the chance of convicting an innocent person, so we are more likely to accept H0
Normal CV for 10% in one tail
1.282
Normal CV for 5% in one tail
1.645
Normal CV for 2.5% in one tail
1.960
Normal CV for 1% in one tail
2.326
Normal CV for 0.5% in one tail
2.576
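The one-tail values on these cards can be checked numerically. A minimal sketch using NormalDist from Python's standard library; the helper name one_tail_cv is made up for illustration.

```python
from statistics import NormalDist

def one_tail_cv(tail_prob):
    """Return z such that P(Z > z) = tail_prob for a standard normal Z."""
    return NormalDist().inv_cdf(1 - tail_prob)

# Print the CV for each one-tail area on the cards, to 3 decimal places.
for tail in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(f"{tail:.1%} in one tail: CV = {one_tail_cv(tail):.3f}")
```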
If H0: M=70
H1: M<70
And alpha =5%, what is the CV if the underlying distribution is normal?
CV for 5% = 1.645
Since the one-sided alternative is less than (<), the rejection region is in the lower tail:
CV = -1.645
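A sketch of how this lower-tail test might be run, assuming sigma is known so a Z test applies. The function name and the sample numbers (X bar = 68, sigma = 6, n = 36) are hypothetical and not from the cards.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(x_bar, m0, sigma, n, alpha):
    """Z test of H0: M = m0 against H1: M < m0, with sigma known."""
    cv = NormalDist().inv_cdf(alpha)        # negative CV, about -1.645 for 5%
    z = (x_bar - m0) / (sigma / sqrt(n))    # test statistic
    return z, cv, z < cv                    # reject H0 if z falls below the CV

# Hypothetical numbers: sample mean 68 from n = 36 with sigma = 6.
z, cv, reject = lower_tail_z_test(x_bar=68, m0=70, sigma=6, n=36, alpha=0.05)
print(z, round(cv, 3), reject)
```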
For basic hypothesis test: what test do we use if X is NOT normal but n>25/30?
n > 25/30 = can invoke CLT
X bar is approximately normal with mean M and variance sigma^2/n (or S^2/n if sigma^2 is unknown). Whether sigma^2 is known or not, we still do a Z test.
For basic hypothesis test: if X is normal but sigma^2 unknown, what test do we do?
T test using S^2 (sample variance)
T = (X bar - M0) / sqrt(S^2 / n)
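A minimal sketch of the T statistic, computed as (X bar - M0) divided by sqrt(S^2 / n) with n - 1 degrees of freedom. The data and the name one_sample_t are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def one_sample_t(sample, m0):
    """Return (T, dof) for H0: M = m0 using the sample variance S^2."""
    n = len(sample)
    s2 = variance(sample)                   # sample variance, divides by n - 1
    t = (mean(sample) - m0) / sqrt(s2 / n)
    return t, n - 1

t, dof = one_sample_t([72, 68, 75, 71, 69], m0=70)   # made-up data
print(round(t, 4), dof)
```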
If the underlying series is a Bernoulli trial, and n>25/30, how is X bar distributed?
M = p and sigma^2 = p(1 - p)
X bar is approx normal with mean p and variance p(1 - p) / n
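As a quick numerical check of the mean and variance of X bar for Bernoulli trials, a small sketch. The function name and the values p = 0.5, n = 100 are illustrative only.

```python
def bernoulli_xbar_params(p, n):
    """Mean and variance of X bar for n independent Bernoulli(p) trials."""
    return p, p * (1 - p) / n

mean_xbar, var_xbar = bernoulli_xbar_params(p=0.5, n=100)   # illustrative values
print(mean_xbar, var_xbar)
```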
Diff in means: expectation and variance of X1 bar - X2 bar
E(X1 bar - X2 bar) = M1 - M2 (= 0 under H0: M1 = M2)
V(X1 bar - X2 bar) = (sigma1) ^2 / n1 + (sigma2) ^2 / n2
As these are independent random samples, covariance = 0
Diff in means: if X1 & X2 Normal but population variances unknown
- Test equality of variances
If we can assume sigma1^2 = sigma2^2, pool the variances & dof = n1 + n2 - 2
If the sigmas are not equal, do not pool and use the complicated dof formula.
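A sketch of both cases. The pooled variance weights each sample variance by its dof. For the unpooled case, the code assumes the "complicated dof formula" is the Welch-Satterthwaite approximation, which the cards do not name. Function names and inputs are hypothetical.

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled sample variance, used with dof = n1 + n2 - 2."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def welch_dof(s1_sq, n1, s2_sq, n2):
    """Approximate dof when variances are not pooled (Welch-Satterthwaite, assumed)."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical sample variances and sample sizes.
print(pooled_variance(4.0, 10, 6.0, 12))
print(welch_dof(4.0, 10, 9.0, 16))
```

With equal sample variances and equal sample sizes, the approximate dof coincides with n1 + n2 - 2; otherwise it is smaller.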
What test do we do for equality of variance?
F test: F = S1^2 / S2^2, with the larger sample variance on top (S1^2 > S2^2)
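A small sketch of the variance-ratio statistic that always puts the larger sample variance on top and reports the matching dof (n - 1 for each sample). The name variance_ratio_f and the inputs are made up.

```python
def variance_ratio_f(s1_sq, n1, s2_sq, n2):
    """Return (F, dof_top, dof_bottom) with the larger sample variance on top."""
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    return s2_sq / s1_sq, n2 - 1, n1 - 1

# Hypothetical inputs: the second sample has the larger variance, so it goes on top.
print(variance_ratio_f(4.0, 10, 9.0, 16))
```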
What test do we do to test the variance of a distribution? What condition must hold for us to do this?
X MUST be normally distributed.
Chi-squared test: chi-squared (n - 1 dof) = (n - 1) S^2 / sigma0^2
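A minimal sketch of the statistic (n - 1) S^2 / sigma0^2, assuming X is normal as the card requires. The data and the hypothesised variance sigma0^2 = 5 are made up.

```python
from statistics import variance

def chi_squared_stat(sample, sigma0_sq):
    """Return ((n - 1) S^2 / sigma0^2, dof) for H0: sigma^2 = sigma0_sq."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq, n - 1

stat, dof = chi_squared_stat([72, 68, 75, 71, 69], sigma0_sq=5)   # made-up data
print(stat, dof)
```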
What is matched pairs?
X1 & X2 are not independent - they are from the same sample.
Variance of matched pairs vs random sample
Matched pairs: V(X1 - X2 all bar) = (sigma1^2 + sigma2^2 - 2COV) / n
Random sample: V(X1 bar - X2 bar) = sigma1^2 / n1 + sigma2^2 / n2
When COV is positive (usual for matched pairs), the variance of matched pairs is smaller, so more efficient
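A sketch comparing the two variances for the same sigmas and sample size, with a positive covariance in the matched case. All numbers and function names are hypothetical.

```python
def var_matched(sigma1_sq, sigma2_sq, cov, n):
    """Variance of the mean difference for n matched pairs."""
    return (sigma1_sq + sigma2_sq - 2 * cov) / n

def var_independent(sigma1_sq, n1, sigma2_sq, n2):
    """Variance of X1 bar - X2 bar for independent random samples."""
    return sigma1_sq / n1 + sigma2_sq / n2

# Hypothetical values: sigma1^2 = 4, sigma2^2 = 9, COV = 3, n = n1 = n2 = 10.
print(var_matched(4, 9, 3, 10), var_independent(4, 10, 9, 10))
```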
Matched pairs test statistic
Normal but sigma unknown –> t test
T = (d bar - 0) / sqrt(Sd^2 / n)
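A minimal sketch of the matched pairs statistic: take the differences d within each pair, then run a one-sample t test of mean difference 0 with n - 1 dof. The paired data and the function name are made up.

```python
from math import sqrt
from statistics import mean, variance

def matched_pairs_t(x1, x2):
    """Return (T, dof) for H0: mean difference = 0, from paired observations."""
    d = [a - b for a, b in zip(x1, x2)]     # within-pair differences
    n = len(d)
    t = (mean(d) - 0) / sqrt(variance(d) / n)
    return t, n - 1

t, dof = matched_pairs_t([12, 15, 10, 13], [10, 12, 9, 11])   # made-up pairs
print(round(t, 3), dof)
```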
What value of sigma^2 do we assume for a proportion if it is not given and we don’t have the sample variance either?
0.5(0.5) = 0.25
This gives the greatest value of p(1 - p)
What is power?
Power = 1 - P(Type 2 error)
Power = P(reject H0 given H0 false)
Probability of correctly rejecting the null.
3 stages to working out power
- What is our CV - what distribution? 2-sided = +- CV
- Convert the CV to a value of X bar under the null distribution = Xc
- Under the true mean MT, work out the probability that X bar is > Xc if H1 is >
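The three stages can be sketched for a one-sided (>) Z test with sigma known. The function name and the numbers (M0 = 70, true mean 72, sigma = 6, n = 36) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(m0, mt, sigma, n, alpha):
    """Power of the Z test of H0: M = m0 vs H1: M > m0 when the true mean is mt."""
    se = sigma / sqrt(n)
    cv = NormalDist().inv_cdf(1 - alpha)     # stage 1: CV from the standard normal
    xc = m0 + cv * se                        # stage 2: CV on the X bar scale
    return 1 - NormalDist(mt, se).cdf(xc)    # stage 3: P(X bar > Xc) under the true mean

print(power_upper_tail(m0=70, mt=72, sigma=6, n=36, alpha=0.05))
print(power_upper_tail(m0=70, mt=70, sigma=6, n=36, alpha=0.05))
```

The second call sets the true mean equal to the null mean, in which case the power is just alpha.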
Can power be in both tails? When?
If we have a 2 sided alternative, we use +- CV, which gives us two values of Xc for power.
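A sketch of two-tailed power for a Z test with sigma known: the rejection probability is added up from both tails under the true mean. Names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(m0, mt, sigma, n, alpha):
    """Power of the two-sided Z test of H0: M = m0 when the true mean is mt."""
    se = sigma / sqrt(n)
    cv = NormalDist().inv_cdf(1 - alpha / 2)           # alpha split across two tails
    xc_low, xc_high = m0 - cv * se, m0 + cv * se       # the two Xc values
    true_dist = NormalDist(mt, se)                     # X bar under the true mean
    return true_dist.cdf(xc_low) + (1 - true_dist.cdf(xc_high))

print(power_two_sided(m0=70, mt=72, sigma=6, n=36, alpha=0.05))
```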
What values can power lie between?
Between alpha (the significance level) and 100%
When is power = significance level?
When the null distribution and the true distribution are the same.
What is ANOVA? Why use it?
ANOVA = analysis of variance
Allows us to test equality of means among > 2 groups in one test, instead of doing many separate tests, which makes it increasingly likely that we wrongly reject H0 at some point
3 assumptions for ANOVA
- X normally distributed
- Variances of each group the same
- Independent random samples
Test statistic for ANOVA
F = [BSS / (k - 1)] / [WSS / (n - k)]
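A minimal sketch of the F statistic, assuming the usual definitions: BSS is the sum over groups of group size times the squared gap between group mean and overall mean, and WSS is the sum of squared deviations from each observation's own group mean. The data and the name anova_f are made up.

```python
from statistics import mean

def anova_f(groups):
    """Return F = [BSS / (k - 1)] / [WSS / (n - k)] for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: group means around the overall mean.
    bss = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    wss = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (bss / (k - 1)) / (wss / (n - k))

f_stat = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])   # made-up groups
print(f_stat)
```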
For ANOVA how do we calculate the overall sample mean X bar?
X bar = (X1bar n1 + X2bar n2 + … + Xkbar nk) / n
We weight the sample mean for each group by the number in the group.
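The weighting can be sketched in a few lines; the group means and sizes below are hypothetical.

```python
def overall_mean(group_means, group_sizes):
    """Overall sample mean: each group mean weighted by its group size."""
    n = sum(group_sizes)
    return sum(m * size for m, size in zip(group_means, group_sizes)) / n

# Hypothetical groups: mean 10 with 1 observation, mean 20 with 3 observations.
print(overall_mean([10, 20], [1, 3]))
```

The larger group pulls the overall mean towards its own mean, so the result is not the simple average of the group means.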
DOF for BSS for ANOVA
k - 1 where k is the number of groups
DOF for WSS for ANOVA
n - k where n is overall sample size and k is the number of groups
Diff in means where we are looking at proportions, how do we work out variances?
S1^2 & S2^2 are worked out from the sample proportions X1 bar and X2 bar: proportion of successes x proportion of failures.
Since P1 = P2 under H0, pool the sample variances to find S0^2 & use this.
If n>25/30 and we are using sample PROPORTIONS, what distribution do we use for the difference in means and, if the sigmas are unknown, what variance?
Invoke CLT = approx normal.
Use a POOLED sample variance: [(n1 - 1) S1^2 + (n2 - 1) S2^2] / (n1 + n2 - 2), where S1^2 & S2^2 come from sample proportions 1 and 2 (proportion of successes x proportion of failures)