Section 31-32, 40, 42 Null Hypothesis, z-Test, 1-tail and 2-tail Tests Flashcards
Null Hypothesis (Directional & Alternative)
NULL HYPOTHESIS says that the TRUE DIFFERENCE BETWEEN the SAMPLE MEANS (or between the Sample and Population means) is ZERO even if the OBSERVED difference in the means is NOT zero.
-
Ex: Suppose that we drew a random sample of first-grade girls and a random sample of first-grade boys from a large school district in order to estimate the average reading achievement of each group on a standardized test. The means obtained were Girls: m = 50.00, Boys: m = 46.00.
- The results suggest that girls, on average, have higher reading achievement. But do they? Remember that only a random sample of the boys and a random sample of the girls were tested. Thus, it is possible that the difference between the two means is due ONLY to the errors created by random sampling (i.e., sampling error).
- In other words, it is possible that the POPULATION mean for boys is identical to the POPULATION mean for girls, and a difference between the two SAMPLE means was found only because of the effects of random sampling. This possibility is known as the NULL HYPOTHESIS.
- Expressed with symbols: H0: µ1 - µ2 = 0
- Where
- H0 is the symbol for the NULL HYPOTHESIS
- µ1 is the symbol for the population mean for one group
- µ2 is the symbol for the population mean for the other group.
- Another way to express the NULL HYPOTHESIS is:
- There is no true difference between the means.
- The observed difference between the means was created by sampling error.
- Most researchers SEARCH FOR DIFFERENCES among individual cases and for EXPLANATIONS for the differences they find. Therefore, most researchers do not undertake their studies in the hope of confirming the null hypothesis (which would indicate NO true difference).
- But the null hypothesis remains as a possible explanation for any observed differences.
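To see how sampling error alone can create a difference between two sample means, here is a minimal simulation sketch. It draws a "girls" sample and a "boys" sample from the SAME population, so the null hypothesis is true by construction. The population mean (48), standard deviation (10), and sample size (30 per group) are assumptions made up for illustration; the example above does not report them.

```python
import random
import statistics

def sample_mean_difference(rng, n=30, mu=48.0, sigma=10.0):
    """Draw two random samples from the SAME population and return
    the difference between their sample means.
    n, mu, and sigma are hypothetical values, not taken from the notes."""
    girls = [rng.gauss(mu, sigma) for _ in range(n)]
    boys = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.mean(girls) - statistics.mean(boys)

rng = random.Random(0)  # fixed seed so the run is repeatable
differences = [sample_mean_difference(rng) for _ in range(2000)]

# Averaged over many repetitions, the difference is close to zero ...
mean_difference = statistics.mean(differences)
# ... yet individual pairs of samples often differ by several points.
share_at_least_4 = sum(abs(d) >= 4.0 for d in differences) / len(differences)

print(f"average difference across repetitions: {mean_difference:.3f}")
print(f"share of repetitions with a gap of 4+ points: {share_at_least_4:.3f}")
```

Under these assumed values, a 4-point gap like the one between 50.00 and 46.00 shows up in a noticeable share of repetitions even though the true difference is exactly zero, which is why the null hypothesis has to be tested rather than dismissed.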
RESEARCH HYPOTHESIS – is an ALTERNATIVE HYPOTHESIS, one devised by the researcher to reflect their predictions for the relationship between the groups being compared.
-
Ex: They might hypothesize that the average reading achievement of girls is higher than that of boys. In this case, the research hypothesis is an alternative to the Null Hypothesis.
- When the RESEARCH HYPOTHESIS specifies that one particular group’s average is higher than that of another group, it is called a DIRECTIONAL HYPOTHESIS because it indicates the direction of the difference.
- Expressed with symbols: H1: µ1 > µ2
- Where
- H1 is the symbol for the ALTERNATIVE HYPOTHESIS (i.e. an alternative to the NULL HYPOTHESIS)
- µ1 is the symbol for the population mean for the group hypothesized to have a higher mean (in this case, the girls)
- µ2 is the symbol for the population mean for the other group (in this case, the boys)
- Another researcher may hold a NONDIRECTIONAL HYPOTHESIS as his or her research hypothesis. That is, the researcher believes there is a difference between boys’ and girls’ reading achievement but doesn’t speculate which group is higher.
- Expressed with symbols: H1: µ1 ≠ µ2
- Where
- H1 is the symbol for the ALTERNATIVE HYPOTHESIS (i.e. an alternative to the NULL HYPOTHESIS)
- µ1 is the symbol for the population mean for one group
- µ2 is the symbol for the population mean for the other group
- The DIRECTIONAL HYPOTHESIS is the one used MOST FREQUENTLY.
- Even with the introduction of a RESEARCH HYPOTHESIS, the NULL HYPOTHESIS remains a possible explanation unless sufficient evidence is shown to REJECT the NULL HYPOTHESIS (more on that later).
- Ex: Suppose that the two means we considered at the beginning of this section (i.e., m= 50.00 for girls and m= 46.00 for boys) were obtained by a researcher who started with the directional research hypothesis that girls, on average, read better than boys. Clearly, the observed means support the research hypothesis, but is the researcher finished? Obviously not because there are still two possible explanations for the observed difference:
- The RESEARCH HYPOTHESIS – Girls have higher reading achievement than boys.
- The NULL HYPOTHESIS – The observed DIFFERENCE in the SAMPLE MEANS is the result of the EFFECTS of RANDOM SAMPLING. Therefore, there is no true difference.
- The researcher can’t simply accept the initial finding through a single random sample just because it fits her RESEARCH HYPOTHESIS. Further testing must take place.
- Because the NULL HYPOTHESIS REMAINS a possible explanation for the difference, the NULL HYPOTHESIS CANNOT BE REJECTED at this point.
- For the researcher, this is not a definitive result. To make it definitive, the researcher should try to rule out the null hypothesis and leave only the original research hypothesis.
- The researcher would use INFERENTIAL STATISTICS (covered in detail in the remainder of this book) to TEST the NULL HYPOTHESIS to determine whether it is reasonable to rule it out as an explanation for the difference.
Null Hypothesis (Alpha, Error, and Significance)
NULL HYPOTHESIS – States that there is NO TRUE DIFFERENCE BETWEEN MEANS, that any difference between the SAMPLE means (or between a sample mean and the population mean) was obtained only because of sampling errors created by random sampling.
-
There is ALWAYS some probability that the null hypothesis is true. If researchers wait for certainty, they will never be able to make a decision. So researchers have settled on the .05 level as the level at which it is APPROPRIATE to REJECT the NULL HYPOTHESIS.
- In other words, if a difference this large would arise from random sampling error alone 5% of the time or less, then you should REJECT the NULL HYPOTHESIS (i.e., REJECT the notion that there is no true difference between the POPULATION MEANS and that the difference between the sample means is due exclusively to RANDOM SAMPLING ERROR).
- So REJECTING the NULL HYPOTHESIS means that you can say with some certainty that the POPULATION MEANS DO DIFFER and that the difference in sample means is NOT due exclusively to Random Sampling Error.
ALPHA LEVEL – The PROBABILITY (p) at which researchers are WILLING to REJECT the NULL HYPOTHESIS.
- When an ALPHA = .05 (p = .05) is used, researchers are, in effect, willing to be wrong 5 times in 100 when REJECTING the NULL HYPOTHESIS.
- Thus, in rejecting the null hypothesis, we are taking a calculated risk that we might be wrong. This type of error is known as a Type I error:
-
TYPE I Error (alpha error): REJECTING the NULL HYPOTHESIS when you should NOT.
- Rejecting a true null hypothesis.
- FALSE POSITIVE – Declaring results to be STATISTICALLY SIGNIFICANT when they are NOT.
- Known as a “False Alarm”
- The PROBABILITY of a TYPE I ERROR uses the symbol p.
- So p = .05 means that, when the NULL HYPOTHESIS is actually true, the PROBABILITY of REJECTING it when we SHOULD NOT is .05, or 5%.
- Another way to say “rejecting the null hypothesis” is to declare a result to be STATISTICALLY SIGNIFICANT.
- i.e. The difference between the means is statistically significant – indicating that the researchers have rejected the null hypothesis.
-
Other levels of ALPHA that are commonly used are p < .01 (less than 1 in 100 = 1%) and p < .001 (less than 1 in 1,000 = 0.1%).
-
The smaller the ALPHA, the lower the probability that we are REJECTING The NULL HYPOTHESIS when we should NOT.
- And the GREATER the STATISTICAL SIGNIFICANCE.
- A result with p GREATER than .05 (e.g., p = .06) is deemed NOT STATISTICALLY SIGNIFICANT at the .05 level, and so we do NOT REJECT the NULL HYPOTHESIS at that ALPHA level.
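The idea of being "willing to be wrong 5 times in 100" can be checked with a small simulation. This is a sketch under assumed values: it repeatedly samples from a population where the null hypothesis is TRUE and counts how often a two-tailed test rejects it anyway. It uses the z statistic and the cutoffs 1.96 and 2.58 from the z Test part of these notes; the population values (mean 500, standard deviation 100), the sample size (50), and the number of trials are hypothetical choices.

```python
import math
import random
import statistics

def z_for_sample(sample, mu, sigma):
    """z = (sample mean - population mean) / standard error of the mean."""
    standard_error = sigma / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu) / standard_error

def type_i_error_rate(critical_z, trials=4000, n=50, mu=500.0, sigma=100.0, seed=1):
    """Share of trials that reject a TRUE null hypothesis (two-tailed).
    trials, n, mu, sigma, and seed are illustrative assumptions."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Every sample really does come from the null population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(z_for_sample(sample, mu, sigma)) >= critical_z:
            rejections += 1
    return rejections / trials

rate_05 = type_i_error_rate(1.96)  # cutoff for the .05 level
rate_01 = type_i_error_rate(2.58)  # cutoff for the .01 level

print(f"false alarms at the .05 level: {rate_05:.3f}")
print(f"false alarms at the .01 level: {rate_01:.3f}")
```

Both rates should land near their nominal ALPHA levels, and the stricter cutoff can only produce the same number of false alarms or fewer, because both runs use the same seed and therefore the same samples.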
NOTE: Keep in mind, though, that when you require a lower probability before rejecting the null hypothesis (e.g., .01 instead of .05), you are increasing the likelihood that you will make a Type II error:
-
TYPE II Error (aka beta error): FAIL to REJECT the NULL HYPOTHESIS when you SHOULD.
- Fail to reject a false null hypothesis.
- FALSE NEGATIVE – Declaring results NOT STATISTICALLY SIGNIFICANT when they ARE significant.
- Ex: Suppose a drug company developed a new drug for a serious disease and that, in reality, the new drug is effective. If, however, the null hypothesis is not rejected because the drug company selected a significance criterion that is too strict (an ALPHA that is too low, say p < .01 when p < .05 would have been appropriate), the results of the study will have to be described as not statistically significant (even though the drug really IS effective), and the drug may not receive government approval.
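The trade-off between the two error types can be made concrete with a short calculation. This is only a sketch: it assumes a one-sample, two-tailed z test with a known population standard deviation, and it assumes that the true mean really is 485 while the null hypothesis says 500 (standard deviation 100, n = 200). Those numbers are illustrative assumptions, not results from the drug example.

```python
import math
from statistics import NormalDist

def type_ii_error_rate(true_mean, null_mean, sigma, n, critical_z):
    """Probability of FAILING to reject a false null hypothesis
    with a two-tailed z test. All inputs here are hypothetical."""
    standard_error = sigma / math.sqrt(n)
    # When the null is false, z is centered on this shift instead of 0.
    shift = (true_mean - null_mean) / standard_error
    z_distribution = NormalDist(mu=shift, sigma=1.0)
    # We fail to reject whenever z lands between the two cutoffs.
    return z_distribution.cdf(critical_z) - z_distribution.cdf(-critical_z)

beta_05 = type_ii_error_rate(485.0, 500.0, 100.0, 200, critical_z=1.96)
beta_01 = type_ii_error_rate(485.0, 500.0, 100.0, 200, critical_z=2.58)

print(f"Type II error rate at the .05 level: {beta_05:.3f}")
print(f"Type II error rate at the .01 level: {beta_01:.3f}")
```

Under these assumptions the miss rate is clearly higher at the .01 level than at the .05 level, which is the pattern the NOTE above describes.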
NOTE: Either decision about the null hypothesis (reject or fail to reject) may be wrong, but by using inferential statistics to make the decisions, researchers can report the probability that they have made a Type I error (indicated by the p-value included in the report).
IMPORTANT: “NOT REJECTING” is NOT the same as “ACCEPTING”:
- Accepting the Null Hypothesis would indicate that you’ve proven an effect doesn’t exist.
-
Failing to reject the Null Hypothesis indicates that our sample did not provide sufficient evidence to conclude that the effect exists. But this lack of evidence doesn’t prove that the effect does not exist.
- Absence of proof is not proof of absence.
Null Hypothesis (z Test for One Sample)
To DETERMINE whether or not we can REJECT the NULL HYPOTHESIS (given both a POPULATION and SAMPLE mean and standard deviation), we can _CALCULATE the *z*-score of the SAMPLE_ and see if it lies outside the pre-determined level of desired confidence for STATISTICAL SIGNIFICANCE.
- Ex: The MEAN SAT score for the U.S. POPULATION is µ = 500.00, and the STANDARD DEVIATION of the POPULATION is σ = 100.00.
- Suppose researchers for Alabama suspect that their students, on average, performed more poorly than the national population. (This is their RESEARCH HYPOTHESIS.)
- They drew a random SAMPLE of n = 200 students who took the test and found that the SAMPLE MEAN m = 485.00 and the SAMPLE STANDARD DEVIATION s = 101.00
- At first, the data seem to support their RESEARCH HYPOTHESIS: Alabama students scored 15 points below the national population. However, the NULL HYPOTHESIS also offers an explanation for the 15-point difference. It states that the difference was created by sampling errors due to the random sampling and that, in fact, the true difference is zero.
- So BOTH the RESEARCH HYPOTHESIS and the NULL HYPOTHESIS remain as possible explanations for the difference between the Sample and Population means.
- Since WE CAN ONLY TEST THE NULL HYPOTHESIS, we have to check whether or not the null hypothesis is viable. We can TEST IT WITH A z-Test.
- Recall that a z-score tells you the number of standard deviations a score lies from the mean and in which direction, to the left or to the right of the mean.
- Because we are dealing with SAMPLE MEANS and not a single set of data, we modify the original z-score equation by changing the Standard Deviation (of a single data set) that was in the denominator of the original equation to the STANDARD ERROR of the MEANS because we’re working with the means. (See the formulas below).
- Note that the earlier version of the formula for the Standard Error of the Means used the sample standard deviation (s) in the numerator, and the current version uses the population standard deviation (σ). Although the Standard Error of the Means can be calculated either way, using the POPULATION standard deviation will produce a better result and should be used if it is AVAILABLE.
- Recall that a z-score greater than 1.96 or less than -1.96 occurs less than 5% of the time in a normal distribution. Thus, we can say that drawing a person on a single random draw who has a z-score this extreme is an unlikely event.
- First, calculate the STANDARD ERROR of the MEANS (SEm = σ / √n), and insert that into the formula for z-scores (z = (m - µ) / SEm). (See accompanying Spreadsheet for “z-Test for 1 Sample (Null Hyp)”)
- In this example, SEm = 100.00 / √200 = 7.071, so the result is z = (485.00 - 500.00) / 7.071 = -2.121
- To evaluate our z of -2.121, we will first use the constants 1.96 and -1.96 (the standard deviations that coincide with the PRECISE 95% level, meaning that 95% of all observations fall within +/-1.96 standard deviations from the mean). Thus only 5% of all observations lie outside that range.
- Since our z-score is -2.121 < -1.96, we can report the finding in one of two ways (both ways having the same meaning and implications)
- The null hypothesis has been rejected at the .05 level, meaning that a DIFFERENCE in the MEANS this large would arise from RANDOM SAMPLING ERROR alone less than 5% of the time.
- The difference is statistically significant at the .05 level.
- More specifically, we interpret the resulting z-score in terms of TWO TAILS:
- The SAMPLE mean is significantly LOWER than the POPULATION mean (as indicated by a z as extreme as -1.96 or LOWER – as is the case in this example.)
- The SAMPLE mean is significantly HIGHER than the POPULATION mean (as indicated by a z as extreme as +1.96 or HIGHER).
- We can also evaluate our value of z using the constants 2.58 and -2.58 (the standard deviations that coincide with the PRECISE 99% level, meaning that 99% of all observations fall within +/-2.58 standard deviations). Thus only 1% of all observations lie outside that range.
- Since our z-score is -2.121 > -2.58, we can report the finding in one of two ways (both ways having the same meaning and implications)
- The null hypothesis has NOT been rejected at the .01 level, meaning that a DIFFERENCE in the MEANS this large would arise from RANDOM SAMPLING ERROR alone MORE THAN 1% of the time.
-
The difference is NOT statistically significant at the .01 level.
- Because we obtained a z of -2.121, our result is NOT sufficiently extreme to classify this as an unlikely event at the .01 level.
- BEFORE examining the data, you should select an ALPHA level (usually .05 or .01) that will be used in the significance test. Had you chosen the .05 level in the above example, you would report to your audience that the difference is significant at that level. Had you initially chosen the .01 level, you would report that the difference is not significant at that level.
-
IMPORTANT: We can NEVER DIRECTLY TEST the RESEARCH HYPOTHESIS with statistics. We can only EVER test the NULL HYPOTHESIS with the goal of being able to REJECT IT.
- Tests exist only for the null hypothesis. Therefore, if we fail to reject the null hypothesis, we have an inconclusive result because there remain two hypotheses that could explain the difference.
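The SAT example can be reproduced in a few lines. This is a minimal sketch that assumes the usual one-sample formulas, SEm = σ / √n and z = (m - µ) / SEm, which agree with the z = -2.121 reported above. The function name is made up for illustration, and the two-tailed p-value is an extra that the notes do not report.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, population_mean, population_sd, n):
    """Return (standard error of the mean, z) for a one-sample z test.
    Uses the POPULATION standard deviation, as the notes recommend."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    return standard_error, z

# Values from the SAT example: m = 485, mu = 500, sigma = 100, n = 200.
sem, z = one_sample_z(485.0, 500.0, 100.0, 200)

# Two-tailed p-value: area in BOTH tails at least as extreme as |z|.
two_tailed_p = 2 * NormalDist().cdf(-abs(z))

reject_at_05 = abs(z) >= 1.96  # cutoff for the .05 level
reject_at_01 = abs(z) >= 2.58  # cutoff for the .01 level

print(f"SEm = {sem:.3f}, z = {z:.3f}")
print(f"two-tailed p = {two_tailed_p:.4f}")
print(f"reject at .05: {reject_at_05}, reject at .01: {reject_at_01}")
```

The decisions match the ones above: the null hypothesis is rejected at the .05 level but not at the .01 level, which is why the ALPHA level has to be chosen before looking at the data.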
One-Tailed Vs. Two-Tailed Tests
TWO-TAILED TEST – Testing for Statistical Significance (or the Threshold for rejecting the Null Hypothesis) by setting the z-score threshold to at least -1.96 on the left and +1.96 on the right (for p < .05 Alpha level) (See figure 1. below)
- This two-directional test looks at the possibility that the mean being tested could be EITHER HIGHER or LOWER than the POPULATION mean by a statistically significant amount.
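- The two tails together add up to the 5% ALPHA level, so each tail holds only 2.5%. That pushes the cutoff on each tail out to +/-1.96, which makes reaching statistical significance in any one direction MORE DIFFICULT than it would be if the whole 5% sat in a single tail.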
- So, if you only care (i.e. your RESEARCH HYPOTHESIS states) that you think the group being tested should, for instance, be LOWER than a larger population, why look at both tails at a +/- 1.96 level, when you could look ONLY at the tail you’re concerned with (on the left)? Then, you could use -1.65 as your z-score level, much easier to attain than -1.96. (See figure 2. below)
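The cutoffs quoted above can be recovered from the standard normal distribution. The sketch below assumes the usual convention that a two-tailed test splits ALPHA evenly between the tails while a one-tailed test puts all of it in one tail. The helper name and the z of -1.80 are hypothetical, chosen only to show a case where the two tests disagree.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Positive z cutoff for a given ALPHA level and number of tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Two tails: alpha / 2 in each tail. One tail: all of alpha in one tail.
    return NormalDist().inv_cdf(1 - alpha / tails)

two_tailed_05 = critical_z(0.05, tails=2)  # about 1.96
one_tailed_05 = critical_z(0.05, tails=1)  # about 1.645 (the notes round to 1.65)
two_tailed_01 = critical_z(0.01, tails=2)  # about 2.58

# A hypothetical z that falls between the one- and two-tailed cutoffs.
z = -1.80
significant_one_tailed = z <= -one_tailed_05        # left tail only
significant_two_tailed = abs(z) >= two_tailed_05    # either tail

print(f"two-tailed .05 cutoff: +/-{two_tailed_05:.3f}")
print(f"one-tailed .05 cutoff: {one_tailed_05:.3f}")
print(f"two-tailed .01 cutoff: +/-{two_tailed_01:.3f}")
print(f"z = {z}: one-tailed {significant_one_tailed}, two-tailed {significant_two_tailed}")
```

A z of -1.80 would count as significant in a left-tailed test but not in a two-tailed test at the same ALPHA level, which is exactly the gap that makes readers suspicious of one-tailed tests chosen after the fact.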
ONE-TAILED TEST – looks only at one tail of the normal distribution when deciding on whether or not to REJECT the NULL HYPOTHESIS (i.e. REJECT The idea that the difference in means is due solely to sampling error).
- If you are interested in the Left tail, for example, but do a two-tail test, then your z-score must meet the -1.96 threshold in order for you to REJECT The NULL HYPOTHESIS (i.e. differences are due only to random sampling error). If you do a ONE-TAIL TEST, your z-score must surpass only -1.65.
- The ONE-TAILED test is generally frowned upon because using it raises eyebrows – why wouldn’t you want to know if the data you’re testing falls in the OPPOSITE TAIL with statistical significance?
-
There are reasons to use a ONE-TAILED test beyond the fact that it’s easier to reach the less-intense z-score threshold, but the reason must be compelling to both you AND the people reading the research.
- The problem with using ONE-TAIL tests is that it makes the RESEARCHER look like they’re trying to hide something – like the possibility that whatever they are testing is actually statistically HIGHER rather than lower as their RESEARCH HYPOTHESIS supposed.
- By using a two-tailed test, we are willing and prepared to detect a difference in EITHER direction.
- Should you ever use a one-tailed test? Yes, if you have a directional hypothesis and can convince yourself and your audience that a significant difference in the direction other than the one you hypothesized is of no interest.
- By the way, you must determine your ALPHA Level (.001, .01, or .05) and the type of TAIL TEST (ONE or TWO) BEFORE you begin your analysis. And they should be chosen based on the criteria that you feel makes sense with the data and relationships you are studying.
- Since a two-tailed test provides more flexibility in examining the outcomes of a study, why would someone choose a one-tailed test? One possible reason is that a one-tailed test makes it easier to reject the null hypothesis – but in one, and only one, direction.
- Even so, we’ll focus on two-tailed tests for the following reasons:
- Many consumers of research frown on a one-tailed test. They suspect that it may have been chosen only because it made it possible to report a significant difference and not because the underlying logic of the study justified a one-tailed test.
- In most cases, it is difficult to justify a one-tailed test. It can be justified only if you can convince your audience that there would be no interest in and no implications from a significant difference in a direction other than the one hypothesized in a directional research hypothesis. Usually, an astute consumer can imagine implications.