week 7: hypothesis testing Flashcards
Errors in hypothesis testing
Type I errors
Type II errors
Relationship between Type I and Type II errors
Power
Type I errors
rejecting the null hypothesis (H0) when it is actually true
the probability of making this error is the α level
Type II errors
retaining the null hypothesis (H0) when the research hypothesis (H1) is actually true
the probability of making this error is β
Null hypothesis
there is no effect
Research hypothesis
there is an effect
Statistical significance
if the probability of a score occurring by chance is less than 5% (p < .05), we conclude that the score is unlikely to be due to chance alone
Statistical significance example
we know that the probability of an IQ score being 130 or greater is only 2%
in this case, we would tentatively conclude that the radiation may have had an effect on the person
i.e., we conclude that the null hypothesis of no effect is unlikely to explain this score (probability of about 2%), and therefore accept the research hypothesis of an effect of radiation as the more probable explanation
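The 2% figure in this example can be checked with a short calculation. This is a minimal sketch that assumes IQ scores are normally distributed with M = 100 and SD = 15, the population values given elsewhere in these flashcards; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumed comparison distribution: IQ is normal with M = 100, SD = 15.
iq_population = NormalDist(mu=100, sigma=15)

# Probability of an IQ of 130 or higher if the null hypothesis is true.
p_at_least_130 = 1 - iq_population.cdf(130)

print(f"P(IQ >= 130) = {p_at_least_130:.4f}")  # 0.0228, i.e. roughly 2%
print("Below the 5% level:", p_at_least_130 < 0.05)
```

Because this probability is below .05, the score would count as statistically significant under the 5% rule.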
Hypothesis testing
a systematic procedure for determining whether the results of an experiment, which examines a sample, support a hypothesis about the population that the sample represents
The process of hypothesis testing
Step 1: formulating research and null hypotheses
Step 2: identifying the comparison distribution
Step 3: determining the cutoff score
Step 4: where does your sample score sit on the comparison distribution?
Step 5: decision time - should the null hypothesis be rejected?
Step 1: Formulating research and null hypotheses
think about the objective of the research and restate the research question as:
a research hypothesis about populations
an accompanying null hypothesis about populations
Step 1 example
Radiation and intelligence
research hypothesis: radiation exposure improves intellectual functioning
null hypothesis: radiation exposure does not affect intellectual functioning
hypothesis formulas
µe > µne this is the research hypothesis
µe = µne this is the null hypothesis
radiation IQ example formulas
the IQ of people exposed to radiation will be higher than the IQ of people not exposed
µe > µne this is the research hypothesis
the IQ of people exposed to radiation will be the same as the IQ of people not exposed
µe = µne this is the null hypothesis
Step 2: Identifying the comparison distribution
we always test against the null hypothesis (i.e., we assume that there is no difference, then look for evidence that our result would be unlikely if that assumption were true)
thus, we assume that if the null hypothesis is true:
µe = µne
step 2 radiation example
under the null hypothesis, people exposed to radiation will have a similar IQ to those not exposed
we know the characteristics of the distribution of IQ scores in the population (M=100, SD=15)
we test our sample data (the sample statistic) against this distribution specified under the null hypothesis
Step 3: Determining the cut-off score
most common is a cut-off of 5% of the distribution
known as the significance level
the Z score that cuts off the top 5% of the distribution is z = 1.645 (one-tailed)
when the obtained score exceeds the critical value
the null hypothesis is rejected
we have a statistically significant result
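The cut-off of 1.645 comes from the standard normal curve. As a rough sketch, it can be recovered from the significance level with Python's standard library; the helper name `one_tailed_cutoff` is made up for this illustration.

```python
from statistics import NormalDist

def one_tailed_cutoff(alpha: float = 0.05) -> float:
    """Z score that leaves `alpha` of the standard normal curve in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha)

z_crit = one_tailed_cutoff(0.05)
print(round(z_crit, 3))  # 1.645
```

A stricter significance level, such as 1%, moves the cut-off further out into the tail (about 2.33).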
Step 4: Where does your sample score sit on the comparison distribution?
find the Z score of your sample result, based on the comparison mean and standard deviation
Step 4 IQ radiation example
a person with an IQ of 130 has a Z score of +2 on the distribution of the IQ in the population
Step 5: Decision time: Should the null hypothesis be rejected?
is the sample Z score beyond the critical value?
Yes: reject the null hypothesis
No: stay with the null hypothesis (for now)
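Steps 2 to 5 can be sketched as one small function, applied here to the IQ example (score of 130 against a population with M = 100, SD = 15). This is only an illustration under the assumption of a directional, upper-tail test on a single score; the function name `z_test_single_score` is hypothetical.

```python
from statistics import NormalDist

def z_test_single_score(score, pop_mean, pop_sd, alpha=0.05):
    """Upper-tail (directional) Z test of one score against a known population."""
    # Step 2: the comparison distribution is the population under the null
    # hypothesis, described by pop_mean and pop_sd.
    # Step 3: cut-off Z score for the chosen significance level.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Step 4: where the sample score sits on the comparison distribution.
    z_obt = (score - pop_mean) / pop_sd
    # Step 5: reject the null hypothesis only if the score is beyond the cut-off.
    reject_null = z_obt > z_crit
    return z_obt, z_crit, reject_null

z_obt, z_crit, reject_null = z_test_single_score(130, 100, 15)
print(f"z obtained = {z_obt:.2f}, z critical = {z_crit:.3f}, reject H0: {reject_null}")
# z obtained = 2.00, z critical = 1.645, reject H0: True
```

A score of 115 would give Z = 1.00, which is not beyond the cut-off, so the null hypothesis would be retained.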
Directional hypotheses
specifies the direction of the difference between the two means
e.g., µe > µne
Two-tailed tests
it is not known in which direction the difference will lie (e.g., positive or negative), BUT a difference is hypothesised
Cut-off points for two-tailed tests
consider the situation in which a 5% significance level is chosen
need to find cut-off for top 2.5% and bottom 2.5%
when is it easier to reject the null hypothesis
it is easier to reject the null hypothesis with a one-tailed test than with a two-tailed test at the same significance level
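To see why a one-tailed test is easier to pass, compare the two cut-offs at the same 5% level. The sketch below uses a hypothetical obtained Z of 1.80, a value chosen for illustration and not taken from these flashcards.

```python
from statistics import NormalDist

ALPHA = 0.05
standard_normal = NormalDist()

# One-tailed: all 5% sits in one tail. Two-tailed: 2.5% sits in each tail.
one_tailed_crit = standard_normal.inv_cdf(1 - ALPHA)      # about 1.645
two_tailed_crit = standard_normal.inv_cdf(1 - ALPHA / 2)  # about 1.96

z_obtained = 1.80  # hypothetical sample result

reject_one_tailed = z_obtained > one_tailed_crit
reject_two_tailed = abs(z_obtained) > two_tailed_crit

print(f"one-tailed cut-off {one_tailed_crit:.3f}: reject H0 = {reject_one_tailed}")
print(f"two-tailed cut-off {two_tailed_crit:.3f}: reject H0 = {reject_two_tailed}")
```

The same result is significant with the one-tailed cut-off but not with the two-tailed one, because splitting the 5% across both tails pushes each cut-off further from the mean.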
Research Hypothesis
that our IV of interest has had an effect on our DV
this effect can be directional: level 1 will be greater than level 2 or vice versa
i.e., H1: µ1 > µ2 or µ1 < µ2
or no direction: there is a difference but the direction is not specified
i.e., H1: µ1 ≠ µ2
Null Hypothesis
is the competing claim, the opposite of the research hypothesis: that there is no effect
i.e., H0: µ1 = µ2
critical value or cut-off value
one-tailed: with all 5% in one tail (therefore, a Z score cut-off of +1.645 or -1.645, depending on the predicted direction)
two-tailed: with the 5% divided into the two tails producing 2.5% in each tail (thus, a Z score cut-off of ±1.96)
whether to reject the NULL hypothesis
- If the sample score is greater than the cut-off (i.e., further out into tails) we reject the null hypothesis and conclude that the IV had an effect on the DV
- If the sample score is less than the cut-off (i.e., closer to the mean) we retain the null hypothesis and conclude that there was no evidence that the IV had an effect on the DV
Example
Step 1:
Research Hypothesis: Combination Therapy improves depression as measured by MMPI
Null Hypothesis: Combination Therapy has no effect on Depression as measured by MMPI
H1: µT > µNT (higher scores mean less depression); H0: µT = µNT
Step 2: Comparison distribution:
Normed distribution of MMPI depression scores of clinical clients (higher score means less depression)
Step 3: Cut-off: 1-tailed Z = 1.645; 2-tailed Z = ±1.96
Step 4: Where does your score sit
Zobt = 2.02 > Zcrit = 1.645 (one-tailed); it also exceeds the two-tailed Zcrit of 1.96
Step 5: Decision: reject the null hypothesis and accept the research hypothesis that therapy improved depression.
what is significance?
Significance is deciding which distribution
a statistic is more likely to belong to: H0 or H1
When we say a result is significant
we are either correct or making a Type I error
When we say a result is not significant
we are either correct or making a Type II error
power
the power of a test is its ability to correctly reject the null hypothesis
when can power occur
this can only occur when H1 is true
what is power
the power is the area of the H1 distribution which is beyond the critical value on the H0 distribution
what type of error is power associated with
power is the ‘complement’ of the Type II error rate, β
power = 1 - β
why do researchers like to have lots of power (>.8)
it increases their chances of a correct significant result
Power = 1 - β
the probability of finding a significant effect when one exists in the population
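Power as "the area of the H1 distribution beyond the critical value on the H0 distribution" can be computed directly. The sketch below assumes a single score, an upper-tail test at the 5% level, and a hypothetical H1 population mean of 115 (H0 mean 100, SD 15); the H1 mean and the function name are illustrative assumptions, not values from these flashcards.

```python
from statistics import NormalDist

def power_single_score(h0_mean, h1_mean, sd, alpha=0.05):
    """Power and beta of an upper-tail test when scores really follow H1."""
    # Critical raw score: the cut-off located on the H0 distribution.
    critical_score = NormalDist(h0_mean, sd).inv_cdf(1 - alpha)
    # Power: area of the H1 distribution that lies beyond that cut-off.
    power = 1 - NormalDist(h1_mean, sd).cdf(critical_score)
    # Beta: area of the H1 distribution that falls short of the cut-off.
    beta = 1 - power
    return power, beta

power, beta = power_single_score(h0_mean=100, h1_mean=115, sd=15)
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

With these assumed values the power comes out at roughly .26, well short of the .8 that researchers usually aim for, so a true effect of this size would be missed (a Type II error) about three times out of four.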