How do you know how much confidence to put in the outcome of a hypothesis test? The statistician's criterion is the statistical significance of the test, or the likelihood of obtaining a given result by chance when the null hypothesis is true. This concept has already been discussed under several names: probability, area under the curve, Type I error rate, and so forth. Another common representation of significance is the letter p (for probability) followed by a number between 0 and 1. There are several ways to refer to the significance level of a test, and it is important to be familiar with them. The following statements, for example, are commonly used interchangeably (strictly speaking, the alpha level is chosen before the test, while the p‐value is computed from the data):

  • The finding is significant at the 0.05 level.

  • The confidence level is 95 percent.

  • The Type I error rate is 0.05.

  • The alpha level is 0.05.

  • α = 0.05.

  • There is a 1 in 20 chance of obtaining this result (or one more extreme).

  • The area of the region of rejection is 0.05.

  • The p‐value is 0.05.

  • p = 0.05.

The smaller the significance level, the more stringent the test: stronger evidence is required to reject the null hypothesis, and the probability of rejecting a true null hypothesis (a Type I error) is lower. The trade‐off is that a more stringent level makes a real effect harder to detect. The significance level is therefore usually chosen in consideration of other factors that affect and are affected by it, such as sample size, the estimated size of the effect being tested, and the consequences of making a mistake. Common significance levels are 0.10 (1 chance in 10), 0.05 (1 chance in 20), and 0.01 (1 chance in 100).
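To make "more stringent" concrete, the short sketch below lists the critical value that goes with each of the common significance levels. It assumes a one‐tailed (lower tail) test on the standard normal distribution; that setup, and the variable names, are illustrative choices rather than part of the discussion above.

```python
from statistics import NormalDist

# Assumption: lower-tail test using the standard normal (mean 0, sd 1).
std_normal = NormalDist()

# The common significance levels named in the text.
levels = [0.10, 0.05, 0.01]

# For a lower-tail test, the critical z is the point with `level` area below it.
critical_values = {level: std_normal.inv_cdf(level) for level in levels}

for level, z_crit in critical_values.items():
    print(f"significance level {level:.2f}: reject if z < {z_crit:.3f}")
```

The critical value moves farther from zero as the level shrinks, so the test statistic has to be more extreme before the null hypothesis can be rejected.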

The result of a hypothesis test, as has been seen, is that the null hypothesis is either rejected or not rejected. The researcher sets the significance level for the test in advance, which determines the critical test value. When the computed test statistic is extreme enough to reject the null hypothesis, it is nevertheless customary to report the observed (actual) p‐value for the statistic rather than only the level chosen beforehand.

If, for example, you intend to perform a one‐tailed (lower tail) test using the standard normal distribution at α = 0.05, the test statistic must be smaller than the critical z‐value of –1.65 (more precisely, –1.645) in order to reject the null hypothesis. But suppose the computed z‐score is –2.50, which has an associated lower‐tail probability of 0.0062. The null hypothesis is rejected with room to spare. The observed significance level of the computed statistic is p = 0.0062, so you could report that the result was significant at p < 0.01. This result means that even if you had chosen the more stringent significance level of 0.01 in advance, you still would have rejected the null hypothesis, which is stronger support for your research hypothesis than rejecting the null hypothesis at the 0.05 level.
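The numbers in this example can be checked with a few lines of Python. This is a minimal sketch that uses the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

alpha = 0.05         # significance level chosen in advance
z_observed = -2.50   # computed test statistic from the example

# Lower-tail critical value: the z with area `alpha` below it (about -1.645).
critical_z = std_normal.inv_cdf(alpha)

# Observed p-value: the area below the computed z-score (about 0.0062).
p_value = std_normal.cdf(z_observed)

reject_at_05 = z_observed < critical_z   # decision at the planned level
reject_at_01 = p_value < 0.01            # would 0.01 also have led to rejection?

print(f"critical z = {critical_z:.3f}, observed p = {p_value:.4f}")
print(f"reject at 0.05: {reject_at_05}, reject at 0.01: {reject_at_01}")
```

Comparing the z‐score with the critical value and comparing the p‐value with the significance level are two views of the same decision; they always agree.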

It is important to realize that statistical significance and substantive, or practical, significance are not the same thing. A small but important real‐world difference may fail to reach significance in a statistical test. Conversely, a statistically significant finding may have no practical consequence. This distinction is especially important to remember when working with large sample sizes, because even a trivially small difference can be statistically significant if the samples are extremely large.
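The effect of sample size can be seen in a small sketch. It assumes a two‐tailed one‐sample z‐test with a known population standard deviation; the difference of 0.1, the standard deviation of 15, and the function name `two_tailed_p` are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(diff, sd, n):
    """Two-tailed p-value for a one-sample z-test (known sd assumed)."""
    z = diff / (sd / sqrt(n))          # difference measured in standard errors
    return 2 * NormalDist().cdf(-abs(z))

# Hypothetical numbers: a tiny observed difference on a scale with sd = 15.
diff, sd = 0.1, 15.0

p_small = two_tailed_p(diff, sd, n=100)
p_large = two_tailed_p(diff, sd, n=1_000_000)

print(f"n = 100:       p = {p_small:.3f}")
print(f"n = 1,000,000: p = {p_large:.2e}")
```

The difference is identical in both cases; only the sample size changes. With 100 observations it is nowhere near significant, while with a million observations it is significant at any conventional level, even though its practical importance is unchanged.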