Hypothesis Testing with Two Samples

Read this chapter, which discusses how to compare data from two similar groups. This is useful when, for example, you want to compare the average income of one group with that of another group you are interested in. Make sure you read the introduction as well as sections 10.1 through 10.6. Attempt the practice problems and homework at the end of the chapter.

Two Population Means with Known Standard Deviations

Knowing the population standard deviations is very unlikely in practice, but the following example illustrates hypothesis testing for independent means when they are known. The sampling distribution for the difference between the means is normal in accordance with the central limit theorem. The random variable is $\overline X_1 - \overline X_2$. The normal distribution has the following format (mean, then standard deviation):

$\overline X_1 - \overline X_2 \sim N\left(μ_1-μ_2,\ \sqrt{\dfrac{(σ_1)^2}{n_1}+\dfrac{(σ_2)^2}{n_2}}\right)$

The standard deviation is:

$\sqrt{\dfrac{(σ_1)^2}{n_1}+\dfrac{(σ_2)^2}{n_2}}$

The test statistic (z-score) is:

$Z_c=\dfrac{(\overline x_1-\overline x_2)-δ_0}{\sqrt{\dfrac{(σ_1)^2}{n_1}+\dfrac{(σ_2)^2}{n_2}}}$

Here $δ_0$ is the difference between the population means hypothesized under $H_0$ (often 0).
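The test statistic above can be sketched as a small function. This is only an illustration: the function name `two_sample_z`, its parameter names, and the input values in the usage line are made up for this sketch and do not come from the chapter.

```python
from math import sqrt


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Z statistic for two independent means with known population SDs.

    delta0 is the hypothesized difference between the population means
    (0 when H_0 states that the means are equal).
    """
    # Standard deviation of the difference between the two sample means.
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2 - delta0) / standard_error


# Hypothetical inputs chosen so the standard error is exactly 1.
z = two_sample_z(mean1=10.0, mean2=9.0, sigma1=2.0, sigma2=2.0, n1=8, n2=8)
print(z)
```

Keeping `delta0` as a parameter mirrors the formula, where the hypothesized difference need not be zero.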

Example 10.7

Independent groups, population standard deviations known: The mean lasting times of two competing floor waxes are to be compared. Twenty floors are randomly assigned to test each wax. Both populations have normal distributions. The data are recorded in Table 10.3.

Wax	Sample mean number of months floor wax lasts	Population standard deviation
1	3	0.33
2	2.9	0.36

Table 10.3

Problem

Do the data indicate that wax 1 is more effective than wax 2? Test at a 5% level of significance.

Solution 1
This is a test of two independent groups, two population means, population standard deviations known.

Random Variable:

$\overline X_1 - \overline X_2$ = difference in the mean number of months the competing floor waxes last.

$H_0:μ_1≤μ_2$

$H_a:μ_1>μ_2$

The words "is more effective" mean that wax 1 lasts longer than wax 2, on average. "Longer" translates to a ">" symbol and goes into $H_a$. Therefore, this is a right-tailed test.

Distribution for the test: The population standard deviations are known, so the distribution is normal. Using the formula for the test statistic with $n_1 = n_2 = 20$ and $δ_0 = 0$, we find the calculated value for the problem.

$Z_c=\dfrac{(\overline x_1-\overline x_2)-δ_0}{\sqrt{\dfrac{σ^2_1}{n_1}+\dfrac{σ^2_2}{n_2}}}=\dfrac{(3-2.9)-0}{\sqrt{\dfrac{0.33^2}{20}+\dfrac{0.36^2}{20}}}≈0.92$

Figure 10.7

The estimated difference between the two means is:

$\overline x_1 - \overline x_2 = 3 - 2.9 = 0.1$

Compare the calculated value with the critical value $Z_α$: For a right-tailed test at the 5% level of significance, the critical value is $Z_{0.05} = 1.645$. Marking the calculated value on the graph shows that it is not in the tail, so we cannot reject the null hypothesis.

Make a decision: The calculated value of the test statistic is not in the tail, therefore you cannot reject $H_0$.

Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean time wax 1 lasts is longer (wax 1 is more effective) than the mean time wax 2 lasts.
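The steps of Example 10.7 can be checked numerically. The following is a minimal sketch that uses only the values in Table 10.3 and the Python standard library; the variable names are chosen for this illustration, and the p-value is an extra quantity the example itself does not report.

```python
from math import sqrt
from statistics import NormalDist

# Values from Table 10.3 (wax 1 and wax 2), 20 floors per wax.
x1_bar, x2_bar = 3.0, 2.9
sigma1, sigma2 = 0.33, 0.36
n1 = n2 = 20
alpha = 0.05   # level of significance
delta0 = 0.0   # hypothesized difference under H_0

# Test statistic for two independent means with known population SDs.
standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z_c = (x1_bar - x2_bar - delta0) / standard_error

# Right-tailed test: reject H_0 only if z_c exceeds the critical value.
z_alpha = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z_c)
reject_h0 = z_c > z_alpha

print(f"z_c = {z_c:.3f}, critical value = {z_alpha:.3f}, p-value = {p_value:.3f}")
print("Reject H_0" if reject_h0 else "Cannot reject H_0")
```

Comparing `z_c` with `z_alpha` and comparing `p_value` with `alpha` are two views of the same decision rule, so they always agree.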

Try It 10.7

The means of the number of revolutions per minute of two competing engines are to be compared. Thirty engines of each type are randomly assigned to be tested. Both populations have normal distributions. Table 10.4 shows the result. Do the data indicate that Engine 2 has higher RPM than Engine 1? Test at a 5% level of significance.

Engine	Sample mean number of RPM	Population standard deviation
1	1,500	50
2	1,600	60

Table 10.4

Example 10.8

An interested citizen wanted to know if Democratic U.S. senators are older than Republican U.S. senators, on average. On May 26, 2013, the mean age of 30 randomly selected Republican senators was 61 years 247 days (61.675 years) with a standard deviation of 10.17 years. The mean age of 30 randomly selected Democratic senators was 61 years 257 days (61.704 years) with a standard deviation of 9.55 years.

Problem

Do the data indicate that Democratic senators are older than Republican senators, on average? Test at a 5% level of significance.

Solution 1
This is a test of two independent groups, two population means. The population standard deviations are unknown, but the sum of the sample sizes is 30 + 30 = 60, which is greater than 30, so we can use the normal approximation to the Student's t-distribution. Subscripts: 1 = Democratic senators, 2 = Republican senators.

Random variable:

$\overline X_1 - \overline X_2$ = difference in the mean age of Democratic and Republican U.S. senators.

$H_0:μ_1≤μ_2$, or equivalently $H_0:μ_1−μ_2≤0$

$H_a:μ_1 > μ_2$, or equivalently $H_a:μ_1−μ_2 > 0$

The words "older than" translate to a ">" symbol and go into $H_a$. Therefore, this is a right-tailed test.

Figure 10.8

Make a decision: The p-value is larger than 5%, therefore we cannot reject the null hypothesis. Calculating the test statistic shows that it does not fall in the tail, which also means we cannot reject the null hypothesis. We reach the same conclusion using either method of making this statistical decision.

Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean age of Democratic senators is greater than the mean age of the Republican senators.
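The text states the decision for Example 10.8 without showing the numbers behind it. A minimal sketch of the calculation follows, assuming the normal approximation described in the solution and using the sample standard deviations in place of the unknown population values. The variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Subscript 1: Democratic senators, subscript 2: Republican senators.
x1_bar, s1, n1 = 61.704, 9.55, 30
x2_bar, s2, n2 = 61.675, 10.17, 30
alpha = 0.05
delta0 = 0.0  # H_0: mu_1 - mu_2 <= 0

# Assumption: sample SDs stand in for the population SDs (normal approximation).
standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
z_c = (x1_bar - x2_bar - delta0) / standard_error

# Right-tailed test: the p-value is the area to the right of z_c.
p_value = 1 - NormalDist().cdf(z_c)
z_alpha = NormalDist().inv_cdf(1 - alpha)

print(f"z_c = {z_c:.4f}, critical value = {z_alpha:.3f}, p-value = {p_value:.4f}")
print("Reject H_0" if p_value < alpha else "Cannot reject H_0")
```

Because the two sample means differ by only about 0.03 years while the standard error is roughly 2.5 years, the test statistic sits very close to zero and the p-value is close to one half.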