The Observed Significance

The conceptual basis of our testing procedure is that we reject \(H_{0}\) only if the data that we obtained would constitute a rare event if \(H_{0}\) were actually true. The level of significance \(\alpha\) specifies what is meant by "rare." The observed significance of the test is a measure of how rare the value of the test statistic that we have just observed would be if the null hypothesis were true. That is, the observed significance of the test just performed is the probability that, if \(H_{0}\) were true and the test were repeated with a new sample, the result of the new test would be at least as contrary to \(H_{0}\) and in support of \(H_{a}\) as what was observed in the original test.


Definition

The observed significance or \(\boldsymbol{p}\)-value of a specific test of hypotheses is the probability, on the supposition that \(H_{0}\) is true, of obtaining a result at least as contrary to \(H_{0}\) and in favor of \(H_{a}\) as the result actually observed in the sample data.

Think back to Note 8.27 "Example 4" in Section 8.2 "Large Sample Tests for a Population Mean" concerning the effectiveness of a new pain reliever. This was a left-tailed test in which the value of the test statistic was \(-1.886\). To be as contrary to \(H_{0}\) and in support of \(H_{a}\) as the result \(Z=-1.886\) actually observed means to obtain a value of the test statistic in the interval \((-\infty,-1.886]\). Rounding \(-1.886\) to \(-1.89\), we can read directly from Figure 12.2 "Cumulative Normal Probability" that \(P(Z \leq-1.89)=0.0294\). Thus the \(p\)-value or observed significance of the test in Note 8.27 "Example 4" is 0.0294, or about \(3 \%\). Under repeated sampling from this population, if \(H_{0}\) were true then only about \(3 \%\) of all samples of size 50 would give a result as contrary to \(H_{0}\) and in favor of \(H_{a}\) as the sample we observed. Note that the probability 0.0294 is the area of the left tail cut off by the test statistic in this left-tailed test.
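The table lookup above can also be checked numerically. The following is a minimal sketch, not part of the original example: it assumes Python 3.8 or later and uses the standard library's `statistics.NormalDist` in place of Figure 12.2. The variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist()

z_observed = -1.886               # test statistic from the pain reliever example
z_rounded = round(z_observed, 2)  # -1.89, as when reading a two-decimal table

# Left-tailed test: the p-value is the area to the left of the test statistic.
p_table = std_normal.cdf(z_rounded)   # mirrors the table lookup P(Z <= -1.89)
p_exact = std_normal.cdf(z_observed)  # same area without rounding z first

print(f"p-value using z = {z_rounded}: {p_table:.4f}")
print(f"p-value using z = {z_observed}: {p_exact:.4f}")
```

Rounding the test statistic before the lookup changes the result only slightly, so either value leads to a \(p\)-value of about \(3 \%\).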

Analogous reasoning applies to a right-tailed or a two-tailed test. In the case of a two-tailed test, however, being as far from \(0\) as the observed value of the test statistic but on the opposite side of \(0\) is just as contrary to \(H_{0}\) as being the same distance away and on the same side of \(0\); hence the corresponding tail area is doubled.


Computational Definition of the Observed Significance of a Test of Hypotheses

The observed significance of a test of hypotheses is the area of the tail of the distribution cut off by the test statistic (times two in the case of a two-tailed test).
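The computational definition translates directly into a short function. The sketch below is one possible reading, under the assumption that the test statistic is standard normal (as in the large-sample tests of Section 8.2). The function name `observed_significance` and the `tail` labels are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def observed_significance(z, tail):
    """Return the p-value for a standard normal test statistic z.

    tail is "left", "right", or "two" (labels assumed for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # Area of the left tail cut off by z.
        return std_normal.cdf(z)
    if tail == "right":
        # Area of the right tail cut off by z.
        return 1.0 - std_normal.cdf(z)
    if tail == "two":
        # Area of one tail beyond |z|, times two.
        return 2.0 * std_normal.cdf(-abs(z))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Usage with an arbitrary illustrative value of the test statistic.
for tail in ("left", "right", "two"):
    print(tail, round(observed_significance(1.5, tail), 4))
```

Using `-abs(z)` in the two-tailed branch means the function gives the same answer whether the observed statistic is positive or negative, which matches the symmetry argument for doubling the tail area.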


EXAMPLE 6

Compute the observed significance of the test performed in Note 8.28 "Example 5" in Section 8.2 "Large Sample Tests for a Population Mean".


Solution:

The value of the test statistic was \(z=2.490\), which by Figure 12.2 "Cumulative Normal Probability" cuts off a right tail of area \(0.0064\), as shown in Figure 8.7 "Area of the Tail for Note 8.34 "Example 6"". Since the test was two-tailed, the observed significance is \(2 \times 0.0064=0.0128\).


Figure 8.7

Area of the Tail for Note 8.34 "Example 6"
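The arithmetic in the solution to Example 6 can be reproduced without a table. This is a minimal sketch assuming Python 3.8 or later, with `statistics.NormalDist` standing in for Figure 12.2 "Cumulative Normal Probability"; the variable names are illustrative.

```python
from statistics import NormalDist

z = 2.490  # test statistic from the two-tailed test in Example 6

# Area of the right tail cut off by z under the standard normal curve.
tail_area = 1.0 - NormalDist().cdf(z)

# Two-tailed test: double the tail area to get the observed significance.
p_value = 2.0 * tail_area

print(f"tail area: {tail_area:.4f}")
print(f"observed significance: {p_value:.4f}")
```

To four decimal places these values agree with the tail area and the observed significance found in the solution.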