One- and Two-Tailed Tests
In the previous example, you tested a research hypothesis that predicted not only that the sample mean would be different from the population mean but that it would be different in a specific direction—it would be lower. This test is called a directional or one‐tailed test because the region of rejection is entirely within one tail of the distribution.
Some hypotheses predict only that one value will be different from another, without additionally predicting which will be higher. The test of such a hypothesis is nondirectional or two‐tailed because an extreme test statistic in either tail of the distribution (positive or negative) will lead to the rejection of the null hypothesis of no difference.
Suppose that you suspect that a particular class's performance on a proficiency test is not representative of those people who have taken the test. The national mean score on the test is 74.
The research hypothesis is:
The mean score of the class on the test is not 74.
Or in notation: H a : μ ≠ 74
The null hypothesis is:
The mean score of the class on the test is 74.
In notation: H 0: μ = 74
As in the last example, you decide to use a 5 percent probability level for the test. Both tests have a region of rejection, then, of 5 percent, or 0.05. In this example, however, the rejection region must be split between both tails of the distribution—0.025 in the upper tail and 0.025 in the lower tail—because your hypothesis specifies only a difference, not a direction, as shown in Figure 1(a). You will reject the null hypotheses of no difference if the class sample mean is either much higher or much lower than the population mean of 74. In the previous example, only a sample mean much lower than the population mean would have led to the rejection of the null hypothesis.
Figure 1.Comparison of (a) a two‐tailed test and (b) a one‐tailed test, at the same probability level (95 percent).
The decision of whether to use a one‐ or a two‐tailed test is important because a test statistic that falls in the region of rejection in a one‐tailed test may not do so in a two‐tailed test, even though both tests use the same probability level. Suppose the class sample mean in your example was 77, and its corresponding z‐score was computed to be 1.80. Table 2 in "Statistics Tables" shows the critical z‐scores for a probability of 0.025 in either tail to be –1.96 and 1.96. In order to reject the null hypothesis, the test statistic must be either smaller than –1.96 or greater than 1.96. It is not, so you cannot reject the null hypothesis. Refer to Figure 1(a).
Suppose, however, you had a reason to expect that the class would perform better on the proficiency test than the population, and you did a one‐tailed test instead. For this test, the rejection region of 0.05 would be entirely within the upper tail. The critical z‐value for a probability of 0.05 in the upper tail is 1.65. (Remember that Table 2 in "Statistics Tables" gives areas of the curve below z; so you look up the z‐value for a probability of 0.95.) Your computed test statistic of z = 1.80 exceeds the critical value and falls in the region of rejection, so you reject the null hypothesis and say that your suspicion that the class was better than the population was supported. See Figure 1(b).
In practice, you should use a one‐tailed test only when you have good reason to expect that the difference will be in a particular direction. A two‐tailed test is more conservative than a one‐tailed test because a two‐tailed test takes a more extreme test statistic to reject the null hypothesis.