MA121: Introduction to Statistics
Friday, February 23, 2024, 9:20 AM
First, this section discusses whether rejection of the null hypothesis should be an all-or-none proposition. Then, it discusses how to interpret non-significant results; for example, it explains why the null hypothesis should not be accepted or should be accepted with caution. It also describes how a non-significant result can increase confidence that the null hypothesis is false.
Interpreting Significant Results
- Discuss whether rejection of the null hypothesis should be an all-or-none proposition
- State the usefulness of a significance test when it is extremely likely that the null hypothesis of no difference is false even before doing the experiment
When a probability value is below the $\alpha$ level, the effect is statistically significant and the null hypothesis is rejected. However, not all statistically significant effects should be treated the same way. For example, you should have less confidence that the null hypothesis is false if the probability value is only just below the $\alpha$ level than if it is far below it. Thus, rejecting the null hypothesis is not an all-or-none proposition.
If the null hypothesis is rejected, then the alternative to the null hypothesis (called the alternative hypothesis) is accepted. Consider the one-tailed test in the James Bond case study: Mr. Bond was given 16 trials on which he judged whether a martini had been shaken or stirred, and the question is whether he is better than chance on this task. The null hypothesis for this one-tailed test is that $\pi \le 0.5$, where $\pi$ is the probability of being correct on any given trial. If this null hypothesis is rejected, then the alternative hypothesis that $\pi > 0.5$ is accepted. If $\pi$ is greater than 0.5, then Mr. Bond is better than chance on this task.
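Now consider the two-tailed test used in the Physicians' Reactions case study. The null hypothesis is:
$$\mu_{\text{obese}} = \mu_{\text{average}}$$
If this null hypothesis is rejected, then there are two alternatives:
$$\mu_{\text{obese}} < \mu_{\text{average}}$$
$$\mu_{\text{obese}} > \mu_{\text{average}}$$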
Naturally, the direction of the sample means determines which alternative is adopted. If the sample mean for the obese patients is significantly lower than the sample mean for the average-weight patients, then one should conclude that the population mean for the obese patients is lower than the population mean for the average-weight patients.
There are many situations in which it is very unlikely two conditions will have exactly the same population means. For example, it is practically impossible that aspirin and acetaminophen provide exactly the same degree of pain relief. Therefore, even before an experiment comparing their effectiveness is conducted, the researcher knows that the null hypothesis of exactly no difference is false. However, the researcher does not know which drug offers more relief. If a test of the difference is significant, then the direction of the difference is established. This point is also made in the section on the relationship between confidence intervals and significance tests.
Source: David M. Lane, https://onlinestatbook.com/2/logic_of_hypothesis_testing/significant.html
This work is in the Public Domain.
Some textbooks have incorrectly stated that rejecting the null hypothesis that two population means are equal does not justify a conclusion about which population mean is larger. Instead, they say that all one can conclude is that the population means differ. The validity of concluding the direction of the effect is clear if you note that a two-tailed test at the 0.05 level is equivalent to two separate one-tailed tests, each at the 0.025 level. The two null hypotheses are then:
$$\mu_{\text{obese}} \ge \mu_{\text{average}}$$
$$\mu_{\text{obese}} \le \mu_{\text{average}}$$
If the former of these is rejected, then the conclusion is that the population mean for obese patients is lower than that for average-weight patients. If the latter is rejected, then the conclusion is that the population mean for obese patients is higher than that for average-weight patients.
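The equivalence between one two-tailed test at the 0.05 level and two one-tailed tests at the 0.025 level can be sketched in a few lines. This is only an illustration under simplifying assumptions: it treats the test statistic as a standard normal `z` (the case study itself may use a different test), and the function names and the example `z` values are hypothetical, not taken from the case study.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def directional_conclusion(z: float, alpha: float = 0.05) -> str:
    """Run two one-tailed tests, each at alpha / 2, on a z statistic.

    z is assumed to be (obese mean - average-weight mean) / standard error.
    """
    p_lower = normal_cdf(z)        # small when the obese mean is well below
    p_upper = 1.0 - normal_cdf(z)  # small when the obese mean is well above
    p_two_tailed = 2.0 * min(p_lower, p_upper)

    # The two-tailed test rejects exactly when one of the one-tailed tests does.
    assert (p_two_tailed < alpha) == (min(p_lower, p_upper) < alpha / 2)

    if p_lower < alpha / 2:
        return "obese mean is lower"
    if p_upper < alpha / 2:
        return "obese mean is higher"
    return "direction not established"


# Hypothetical z statistics, chosen only for illustration.
print(directional_conclusion(-2.5))
print(directional_conclusion(0.8))
```

Because at most one of the two one-tailed tests can reject, a significant two-tailed result always comes with a direction.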
Question 1 out of 3.
Which of the following probability values gives you the most confidence that the null hypothesis is false?
Question 2 out of 3.
You are testing the difference between high school freshmen and seniors on SAT performance. The null hypothesis is that the population mean SAT score of the seniors is equal to the population mean SAT score of the freshmen. You randomly sample 20 students in each grade and have them take the SAT. You find that the sample mean of the seniors is significantly higher than the sample mean of the freshmen. Which alternative hypothesis is accepted?
- The population mean SAT score of the seniors is less than the population mean SAT score of the freshmen.
- The population mean SAT score of the seniors is greater than the population mean SAT score of the freshmen.
- You cannot be sure which alternative hypothesis to accept. You just know that the null hypothesis was rejected.
Question 3 out of 3.
If you are already certain that a null hypothesis is false, then:
- Significance testing provides no useful information since all it does is reject a null hypothesis.
- Significance testing is informative because you still need to know whether an effect is significant even if you know the null hypothesis is false.
- When a difference is significant you can draw a confident conclusion about the direction of the effect.
- Feedback, Question 1: The probability value is the proportion of times that you would get a difference in your sample as large as or larger than the one you found if the null hypothesis were actually true. Thus, lower probability values make you more confident that the null hypothesis is false. In this case, the lowest probability value is 0.003.
- Feedback, Question 2: The direction of the sample means determines which alternative is adopted. In this example, the sample means show that seniors performed better, so this alternative is accepted.
- Feedback, Question 3: A significant result lets you conclude the direction of the effect. After a non-significant result, the direction of the difference is uncertain.
Interpreting Non-Significant Results
- State what it means to accept the null hypothesis
- Explain why the null hypothesis should not be accepted
- Describe how a non-significant result can increase confidence that the null hypothesis is false
- Discuss the problems of affirming a negative conclusion
When a significance test results in a high probability value, it means that the data provide little or no evidence that the null hypothesis is false. However, the high probability value is not evidence that the null hypothesis is true. The problem is that it is impossible to distinguish a null effect from a very small effect. For example, in the James Bond case study, suppose Mr. Bond is, in fact, just barely better than chance at judging whether a martini was shaken or stirred. Assume he has a 0.51 probability of being correct on a given trial ($\pi = 0.51$). Let's say Experimenter Jones (who did not know that $\pi = 0.51$) tested Mr. Bond and found he was correct 49 times out of 100 tries. How would the significance test come out? The experimenter's significance test would be based on the assumption that Mr. Bond has a 0.50 probability of being correct on each trial ($\pi = 0.50$). Given this assumption, the probability of his being correct 49 or more times out of 100 is 0.62. This means that the probability value is 0.62, a value very much higher than the conventional significance level of 0.05. This result, therefore, does not give even a hint that the null hypothesis is false. However, we know (but Experimenter Jones does not) that $\pi = 0.51$ and not 0.50, and therefore that the null hypothesis is false. So, if Experimenter Jones had concluded that the null hypothesis was true based on the statistical analysis, he or she would have been mistaken. Concluding that the null hypothesis is true is called accepting the null hypothesis. To do so is a serious error.
Do not accept the null hypothesis when you do not reject it.
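The probability value of 0.62 in the James Bond example can be checked by summing binomial probabilities directly. The following is a minimal sketch that assumes independent trials; the function name `prob_at_least` is illustrative and not part of the original text.

```python
from math import comb


def prob_at_least(successes: int, trials: int, p: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Probability of 49 or more correct out of 100 if the chance of being
# correct on each trial is 0.50 (the null hypothesis).
p_value = prob_at_least(49, 100, 0.50)
print(round(p_value, 2))  # 0.62
```

The value is far above 0.05 even though, by construction, the null hypothesis in this example is false. That is why a high probability value cannot be read as evidence for the null hypothesis.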
So how should the non-significant result be interpreted? The experimenter should report that there is no credible evidence Mr. Bond can tell whether a martini was shaken or stirred, but that there is no proof that he cannot. It is generally impossible to prove a negative. What if I claimed to have been Socrates in an earlier life? Since I have no evidence for this claim, I would have great difficulty convincing anyone that it is true. However, no one would be able to prove definitively that I was not.
Often a non-significant finding increases one's confidence that the null hypothesis is false. Consider the following hypothetical example. A researcher develops a treatment for anxiety that he or she believes is better than the traditional treatment. A study is conducted to test the relative effectiveness of the two treatments: 20 subjects are randomly divided into two groups of 10. One group receives the new treatment and the other receives the traditional treatment. The mean anxiety level is lower for those receiving the new treatment than for those receiving the traditional treatment. However, the difference is not significant. The statistical analysis shows that a difference as large or larger than the one obtained in the experiment would occur 11% of the time even if there were no true difference between the treatments. In other words, the probability value is 0.11. A naive researcher would interpret this finding as evidence that the new treatment is no more effective than the traditional treatment. However, the sophisticated researcher, although disappointed that the effect was not significant, would be encouraged that the new treatment led to less anxiety than the traditional treatment. The data support the thesis that the new treatment is better than the traditional one even though the effect is not statistically significant. This researcher should have more confidence that the new treatment is better than he or she had before the experiment was conducted. However, the support is weak and the data are inconclusive. What should the researcher do? A reasonable course of action would be to do the experiment again. Let's say the researcher repeated the experiment and again found the new treatment was better than the traditional treatment. However, once again the effect was not significant and this time the probability value was 0.07. 
The naive researcher would think that two out of two experiments failed to find significance and therefore the new treatment is unlikely to be better than the traditional treatment. The sophisticated researcher would note that two out of two times the new treatment was better than the traditional treatment. Moreover, two experiments each providing weak support that the new treatment is better, when taken together, can provide strong support. Using a method for combining probabilities, it can be determined that combining the probability values of 0.11 and 0.07 results in a probability value of 0.045. Therefore, these two non-significant findings taken together result in a significant finding.
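The text does not name the method used to combine the two probability values. One common choice is Fisher's method, and under the assumption that this is the method intended, it reproduces the stated result of 0.045. The sketch below also assumes that the two experiments are independent and that both probability values test the same direction of effect; the function name is illustrative.

```python
from math import exp, factorial, log


def fisher_combined_p(p_values: list[float]) -> float:
    """Combine independent probability values with Fisher's method.

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis, where k = len(p_values).
    For even degrees of freedom the upper-tail probability has a closed form.
    """
    k = len(p_values)
    statistic = -2.0 * sum(log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square upper tail with 2k degrees of freedom.
    return exp(-half) * sum(half**i / factorial(i) for i in range(k))


combined = fisher_combined_p([0.11, 0.07])
print(round(combined, 3))  # 0.045
```

With a single probability value the function returns that value unchanged, which is a quick sanity check on the formula.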
Although there is never a statistical basis for concluding that an effect is exactly zero, a statistical analysis can demonstrate that an effect is most likely small. This is done by computing a confidence interval. If all effect sizes in the interval are small, then it can be concluded that the effect is small. For example, suppose an experiment tested the effectiveness of a treatment for insomnia. Assume that the mean time to fall asleep was 2 minutes shorter for those receiving the treatment than for those in the control group and that this difference was not significant. If the 95% confidence interval ranged from -4 to 8 minutes, then the researcher would be justified in concluding that the benefit is eight minutes or less. However, the researcher would not be justified in concluding the null hypothesis is true, or even that it was supported.
Question 1 out of 2.
You have just analyzed the results from your experiment, and you calculated. What conclusions can you make? Select all that apply.
- You reject the null hypothesis.
- You accept the null hypothesis.
- You fail to reject the null hypothesis.
- You accept the alternative hypothesis.
Question 2 out of 2.
You have just given a group of 2nd graders and 1st graders a reading test. You found that the 2nd graders performed better than the 1st graders, but you calculated a probability value of 0.08, which was not significant at the 0.05 level. After getting these results, what should your thoughts be about the difference between 1st and 2nd graders on this reading test?
- You are more confident that there is a difference.
- You are less confident that there is a difference.
- You now know that the difference is actually zero.
- Feedback, Question 1: You are unable to reject the null hypothesis or accept the alternative hypothesis if your
- Feedback, Question 2: Although you were unable to reject the null hypothesis here, you did find a difference in your sample. Because of this sample difference, you can now be more confident that the population difference does really exist, and doing further research is the best way to find out. You definitely do not accept the null hypothesis.