Sample Tests for a Population Mean: Large Sample Tests for a Population Proportion | Saylor Academy

Large Sample Tests for a Population Proportion

LEARNING OBJECTIVES

To learn how to apply the five-step critical value test procedure for tests of hypotheses concerning a population proportion.
To learn how to apply the five-step $p$ -value test procedure for tests of hypotheses concerning a population proportion.

Both the critical value approach and the $p$ -value approach can be applied to test hypotheses about a population proportion $p$ . The null hypothesis will have the form $H_{0}: p=p_{0}$ for some specific number $p_{0}$ between 0 and 1. The alternative hypothesis will be one of the three inequalities $p < p_{0}$ , $p > p_{0}$ , or $p \neq p_{0}$ for the same number $p_{0}$ that appears in the null hypothesis.

The information in Section 6.3 "The Sample Proportion" in Chapter 6 "Sampling Distributions" gives the following formula for the test statistic and its distribution. In the formula, $p_{0}$ is the numerical value of $p$ that appears in the two hypotheses, $q_{0}=1-p_{0}$ , $\hat{p}$ is the sample proportion, and $n$ is the sample size. Remember that the condition that the sample be large is not that $n$ be at least 30 but that the interval

$\left[\hat{p}-3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \hat{p}+3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$

lie wholly within the interval $[0,1]$ .
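The large-sample condition above can be checked mechanically. The following is a minimal sketch in Python; the function name `is_large_sample` and the second pair of inputs are illustrative assumptions, not part of the text.

```python
import math

def is_large_sample(p_hat: float, n: int) -> bool:
    """Return True if [p_hat - 3*se, p_hat + 3*se] lies wholly within [0, 1]."""
    # Estimated standard deviation of the sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return 0.0 <= p_hat - 3 * se and p_hat + 3 * se <= 1.0

# A sample proportion near 1/2 with n = 500 satisfies the condition.
print(is_large_sample(0.54, 500))
# A hypothetical rare outcome with n = 50 fails it, even though n is at least 30.
print(is_large_sample(0.02, 50))
```

The second call illustrates why the rule of thumb "n at least 30" is not the right criterion for proportions: the interval extends below 0.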

Standardized Test Statistic for Large Sample Hypothesis Tests Concerning a Single Population Proportion

$Z=\dfrac{\hat{p}-p_{0}}{\sqrt{\dfrac{p_{0} q_{0}}{n}}}$

The test statistic has the standard normal distribution.
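As a minimal sketch, the standardized test statistic can be computed as follows. The function name `z_statistic` and the sample numbers in the usage line are hypothetical choices made for illustration.

```python
import math

def z_statistic(p_hat: float, p0: float, n: int) -> float:
    """Standardized test statistic for a single population proportion."""
    q0 = 1 - p0
    # The denominator uses the hypothesized value p0, not the sample proportion.
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)

# Hypothetical data: 60% observed in a sample of 100, tested against p0 = 0.50.
print(round(z_statistic(0.60, 0.50, 100), 3))
```

Note that the standard deviation in the denominator is computed under the null hypothesis, whereas the large-sample check uses $\hat{p}$ .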

The distribution of the standardized test statistic and the corresponding rejection region for each form of the alternative hypothesis (left-tailed, right-tailed, or two-tailed), is shown in Figure 8.14 "Distribution of the Standardized Test Statistic and the Rejection Region".

Figure 8.14 Distribution of the Standardized Test Statistic and the Rejection Region
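The critical values that bound each rejection region can also be obtained numerically rather than from a table. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the function name `critical_values` and the tail labels are assumptions made for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha: float, tail: str) -> tuple:
    """Critical value(s) of the standard normal distribution for a given tail."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":         # H_a: p < p0, rejection region (-inf, -z_alpha]
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":        # H_a: p > p0, rejection region [z_alpha, inf)
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":          # H_a: p != p0, both tails, alpha/2 in each
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")

print([round(c, 3) for c in critical_values(0.05, "right")])
print([round(c, 3) for c in critical_values(0.10, "two")])
```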

EXAMPLE 12

A soft drink maker claims that a majority of adults prefer its leading beverage over that of its main competitor. To test this claim 500 randomly selected people were given the two beverages in random order to taste. Among them, 270 preferred the soft drink maker's brand, 211 preferred the competitor's brand, and 19 could not make up their minds. Determine whether there is sufficient evidence, at the 5% level of significance, to support the soft drink maker's claim against the default that the population is evenly split in its preference.

Solution:

We will use the critical value approach to perform the test. The same test will be performed using the $p$ -value approach in Note 8.49 "Example 14".

We must check that the sample is sufficiently large to validly perform the test. Since $\hat{p}=270 / 500=0.54$ ,

$\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}=\sqrt{\frac{(0.54)(0.46)}{500}} \approx 0.02$

and

$\left[\hat{p}-3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \hat{p}+3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$

$=[0.54-(3)(0.02), 0.54+(3)(0.02)]$

$=[0.48,0.60] \subset[0,1]$

so the sample is sufficiently large.

Step 1. The relevant test is

$\begin{aligned} H_{0}: p &=0.50 \\ \text { vs. } H_{a}: p & > 0.50 @ \alpha=0.05\end{aligned}$

where $p$ denotes the proportion of all adults who prefer the company's beverage over that of its competitor.

Step 2. The test statistic is

$Z=\frac{\hat{p}-p_{0}}{\sqrt{\frac{p_{0} q_{0}}{n}}}$

and has the standard normal distribution.

Step 3. The value of the test statistic is

$Z=\frac{\hat{p}-p_{0}}{\sqrt{\frac{p_{0} q_{0}}{n}}}=\frac{0.54-0.50}{\sqrt{\frac{(0.50)(0.50)}{500}}}=1.789$

Step 4. Since the symbol in $H_{a}$ is " $>$ " this is a right-tailed test, so there is a single critical value, $z_{\alpha}=z_{0.05}$ . Reading from the last line in Figure 12.3 "Critical Values of" its value is 1.645. The rejection region is $[1.645, \infty)$ .
Step 5. As shown in Figure 8.15 "Rejection Region and Test Statistic for " the test statistic falls in the rejection region. The decision is to reject $H_{0}$ . In the context of the problem our conclusion is:

The data provide sufficient evidence, at the 5% level of significance, to conclude that a majority of adults prefer the company's beverage to that of its competitor.
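The steps of this example can be reproduced with a short script. This is a sketch under the assumption that Python 3.8 or later is available (for `statistics.NormalDist`); the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, preferred = 500, 270      # sample size and number preferring the brand
p0, alpha = 0.50, 0.05       # hypothesized proportion and significance level

p_hat = preferred / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Right-tailed test: a single critical value z_alpha
z_crit = NormalDist().inv_cdf(1 - alpha)
reject = z >= z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
# prints: z = 1.789, critical value = 1.645, reject H0: True
```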

Figure 8.15 Rejection Region and Test Statistic for Note 8.47 "Example 12"

EXAMPLE 13

Globally the long-term proportion of newborns who are male is 51.46%. A researcher believes that the proportion of boys at birth changes under severe economic conditions. To test this belief randomly selected birth records of 5,000 babies born during a period of economic recession were examined. It was found in the sample that 52.55% of the newborns were boys. Determine whether there is sufficient evidence, at the 10% level of significance, to support the researcher's belief.

Solution:

We will use the critical value approach to perform the test. The same test will be performed using the $p$ -value approach in Note 8.50 "Example 15".

The sample is sufficiently large to validly perform the test since

$\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}=\sqrt{\frac{(0.5255)(0.4745)}{5000}} \approx 0.01$

hence

$\begin{aligned} &{\left[\hat{p}-3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \hat{p}+3 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]} \\ &=[0.5255-0.03,0.5255+0.03] \\ &=[0.4955,0.5555] \subset[0,1] \end{aligned}$

Step 1. Let $p$ be the true proportion of boys among all newborns during the recession period. The burden of proof is to show that severe economic conditions change it from the historic long-term value of 0.5146 rather than to show that it stays the same, so the hypothesis test is

$\begin{aligned} H_{0}: p &=0.5146 \\ \text { vs. } H_{a}: p & \neq 0.5146 @ \alpha=0.10 \end{aligned}$

Step 2. The test statistic is

$Z=\frac{\hat{p}-p_{0}}{\sqrt{\frac{p_{0} q_{0}}{n}}}$

and has the standard normal distribution.

Step 3. The value of the test statistic is

$Z=\frac{\hat{p}-p_{0}}{\sqrt{\frac{p_{0} q_{0}}{n}}}=\frac{0.5255-0.5146}{\sqrt{\frac{(0.5146)(0.4854)}{5000}}}=1.542$

Step 4. Since the symbol in $H_{a}$ is " $\neq$ " this is a two-tailed test, so there is a pair of critical values, $\pm z_{\alpha / 2}=\pm z_{0.05}=\pm 1.645$ . The rejection region is $(-\infty,-1.645] \cup[1.645, \infty)$ .
Step 5. As shown in Figure 8.16 "Rejection Region and Test Statistic for " the test statistic does not fall in the rejection region. The decision is not to reject $H_{0}$ . In the context of the problem our conclusion is:

The data do not provide sufficient evidence, at the 10% level of significance, to conclude that the proportion of newborns who are male differs from the historic proportion in times of economic recession.
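A two-tailed version of the same computation is sketched below, again assuming Python 3.8 or later. The only changes from a one-tailed test are that $\alpha$ is split between the two tails and the absolute value of the test statistic is compared with the critical value; variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p_hat = 5000, 0.5255      # sample size and observed proportion of boys
p0, alpha = 0.5146, 0.10     # historic proportion and significance level

z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-tailed test: critical values are +/- z_{alpha/2}
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) >= z_crit

print(f"z = {z:.3f}, critical values = +/-{z_crit:.3f}, reject H0: {reject}")
```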

Figure 8.16 Rejection Region and Test Statistic for Note 8.48 "Example 13"

EXAMPLE 14

Perform the test of Note 8.47 "Example 12" using the $p$ -value approach.

Solution:

We already know that the sample size is sufficiently large to validly perform the test.

Steps 1–3 of the five-step procedure described in Section 8.3.2 "The " have already been done in Note 8.47 "Example 12" so we will not repeat them here, but only say that we know that the test is right-tailed and that the value of the test statistic is $Z = 1.789$ .
Step 4. Since the test is right-tailed the $p$ -value is the area of the right tail of the standard normal curve cut off by the observed test statistic, $z = 1.789$ , as illustrated in Figure 8.17. By Figure 12.2 "Cumulative Normal Probability" that area and therefore the $p$ -value is $1-0.9633=0.0367$ .
Step 5. Since the $p$ -value is less than $\alpha = 0.05$ the decision is to reject $H_{0}$ .
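The right-tail area can also be computed directly instead of read from a table. The sketch below assumes Python 3.8 or later; because it uses $z = 1.789$ without rounding to two decimal places, the result may differ from the table-based value in the fourth decimal place.

```python
from statistics import NormalDist

z = 1.789        # observed test statistic from Example 12
alpha = 0.05

# Right-tailed test: p-value is the area to the right of z
p_value = 1 - NormalDist().cdf(z)

print(f"p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```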

Figure 8.17 $P$ -Value for Note 8.49 "Example 14"

EXAMPLE 15

Perform the test of Note 8.48 "Example 13" using the $p$ -value approach.

Solution:

We already know that the sample size is sufficiently large to validly perform the test.

Steps 1–3 of the five-step procedure described in Section 8.3.2 "The " have already been done in Note 8.48 "Example 13". They tell us that the test is two-tailed and that the value of the test statistic is $Z = 1.542$ .
Step 4. Since the test is two-tailed the $p$ -value is double the area of the tail of the standard normal curve cut off by the observed test statistic, $z = 1.542$ . By Figure 12.2 "Cumulative Normal Probability" that area is $1-0.9382=0.0618$ , as illustrated in Figure 8.18, hence the $p$ -value is $2 \times 0.0618=0.1236$ .
Step 5. Since the $p$ -value is greater than $\alpha=0.10$ the decision is not to reject $H_{0}$ .
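The $p$ -value computation for all three forms of the alternative hypothesis can be collected into one helper. This is a sketch assuming Python 3.8 or later; the function name `p_value` and the tail labels are hypothetical, and the results differ slightly from table-based values because $z$ is not rounded to two decimal places.

```python
from statistics import NormalDist

def p_value(z: float, tail: str) -> float:
    """p-value of an observed standard normal test statistic z."""
    std_normal = NormalDist()
    if tail == "left":     # area to the left of z
        return std_normal.cdf(z)
    if tail == "right":    # area to the right of z
        return 1 - std_normal.cdf(z)
    if tail == "two":      # double the area of the tail beyond |z|
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

p = p_value(1.542, "two")   # Example 13, alpha = 0.10
print(f"p-value = {p:.3f}, reject H0: {p < 0.10}")
```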

Figure 8.18 $P$ -Value for Note 8.50 "Example 15"
