Sampling Distribution of p

Here, we introduce the mean and standard deviation of the sampling distribution of p and the relationship between the sampling distribution of p and the normal distribution.

Learning Objectives

  1. Compute the mean and standard deviation of the sampling distribution of p
  2. State the relationship between the sampling distribution of p and the normal distribution

Assume that in an election race between Candidate A and Candidate B, 0.60 of the voters prefer Candidate A. If a random sample of 10 voters were polled, it is unlikely that exactly 60% of them (6) would prefer Candidate A. By chance the proportion in the sample preferring Candidate A could easily be a little lower than 0.60 or a little higher than 0.60. The sampling distribution of p is the distribution that would result if you repeatedly sampled 10 voters and determined the proportion (p) that favored Candidate A.
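The repeated-sampling idea can be made concrete with a small simulation. The following is a minimal sketch, not part of the original example: the number of repetitions (REPS), the fixed seed, and the helper name sample_proportion are assumptions chosen for illustration.

```python
import random

PI = 0.60      # population proportion preferring Candidate A (from the text)
N = 10         # voters per sample (from the text)
REPS = 10_000  # number of repeated samples (assumed for illustration)

rng = random.Random(0)  # fixed seed (assumed) so reruns give the same draws


def sample_proportion(rng, n, pi):
    """Poll n voters and return the proportion who prefer Candidate A."""
    favor_a = sum(1 for _ in range(n) if rng.random() < pi)
    return favor_a / n


# Each entry is the p obtained from one sample of N voters.
proportions = [sample_proportion(rng, N, PI) for _ in range(REPS)]

share_exactly_60 = sum(1 for p in proportions if p == 6 / N) / REPS
mean_p = sum(proportions) / REPS

print(f"share of samples with p = 0.60: {share_exactly_60:.3f}")
print(f"mean of the simulated p values: {mean_p:.3f}")
```

The list proportions is an empirical stand-in for the sampling distribution of p: most samples miss 0.60 exactly, yet the simulated values average out close to 0.60.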

The sampling distribution of p is a special case of the sampling distribution of the mean. Table 1 shows a hypothetical random sample of 10 voters. Those who prefer Candidate A are given scores of 1 and those who prefer Candidate B are given scores of 0. Note that seven of the voters prefer Candidate A, so the sample proportion (p) is

p=7 / 10=0.70

As you can see, p is the mean of the 10 preference scores.

Table 1. Sample of voters.

Voter Preference
1 1
2 0
3 1
4 1
5 1
6 0
7 1
8 0
9 1
10 1
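As a quick check that p is simply the mean of the 0/1 scores, the preferences in Table 1 can be averaged directly. This is a small sketch; the variable names are chosen here for illustration.

```python
# Preference scores from Table 1: 1 = prefers Candidate A, 0 = prefers Candidate B.
preferences = [1, 0, 1, 1, 1, 0, 1, 0, 1, 1]

total = sum(preferences)      # number of voters preferring Candidate A
p = total / len(preferences)  # sample proportion = mean of the scores

print(f"total = {total}, p = {p:.2f}")
```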

The distribution of p is closely related to the binomial distribution. The binomial distribution is the distribution of the total number of successes (voters favoring Candidate A, for example), whereas the distribution of p is the distribution of the mean number of successes. The mean, of course, is the total divided by the sample size, N. Therefore, the sampling distribution of p and the binomial distribution differ in that p is the mean of the scores (0.70) and the binomial distribution deals with the total number of successes (7).

The binomial distribution has a mean of:

\mu=N \pi

Dividing by N to adjust for the fact that the sampling distribution of p is dealing with means instead of totals, we find that the mean of the sampling distribution of p is:

\mu_{p}=\pi

The standard deviation of the binomial distribution is:

\sqrt{N \pi(1-\pi)}

Dividing by N because p is a mean, not a total, we find the standard error of p:

\sigma_{p}=\frac{\sqrt{N \pi(1-\pi)}}{N}=\sqrt{\frac{\pi(1-\pi)}{N}}

Returning to the voter example, \pi=0.60 and N=10. (Don't confuse \pi=0.60, the population proportion, with p=0.70, the sample proportion.) Therefore, the mean of the sampling distribution of p is 0.60. The standard error is

\sigma_{p}=\sqrt{\frac{0.60(1-0.60)}{10}}=0.155
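The arithmetic above can be reproduced in a few lines. The sketch below follows the same route as the derivation (binomial mean and standard deviation, each divided by N); the function name mean_and_se_of_p is a hypothetical helper introduced for illustration.

```python
import math


def mean_and_se_of_p(pi, n):
    """Return (mean, standard error) of the sampling distribution of p."""
    binomial_mean = n * pi                      # mean of the total number of successes
    binomial_sd = math.sqrt(n * pi * (1 - pi))  # SD of the total number of successes
    # p is a mean rather than a total, so both quantities are divided by n.
    return binomial_mean / n, binomial_sd / n


mu_p, sigma_p = mean_and_se_of_p(pi=0.60, n=10)
print(f"mean of p = {mu_p:.2f}, standard error of p = {sigma_p:.3f}")
```

Dividing the binomial standard deviation by N gives the same value as evaluating sqrt(pi * (1 - pi) / N) directly, which is the simplified form of the formula.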

The sampling distribution of p is a discrete rather than a continuous distribution. For example, with an N of 10, it is possible to have a p of 0.50 or a p of 0.60 but not a p of 0.55.
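Because the number of voters favoring Candidate A must be a whole number k between 0 and N, p can only take the values k/N. The sketch below lists those values for the voter example together with their exact binomial probabilities; the dictionary name distribution is an illustrative choice.

```python
import math

N = 10     # sample size (from the text)
PI = 0.60  # population proportion (from the text)

# Map each attainable value of p = k/N to its exact binomial probability.
distribution = {
    k / N: math.comb(N, k) * PI**k * (1 - PI) ** (N - k)
    for k in range(N + 1)
}

for p_value, prob in distribution.items():
    print(f"p = {p_value:.2f}  probability = {prob:.4f}")

# 0.55 would require 5.5 voters, so it is not an attainable value of p.
print("0.55 attainable:", 0.55 in distribution)
```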

The sampling distribution of p is approximately normally distributed if N is fairly large and \pi is not close to 0 or 1. A rule of thumb is that the approximation is good if both N \pi and N(1-\pi) are greater than 10. The sampling distribution for the voter example is shown in Figure 1. Note that even though N \pi is only 6 and N(1-\pi) is only 4, the approximation is quite good.


Figure 1. The sampling distribution of p. Vertical bars are the probabilities; the smooth curve is the normal approximation.
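One way to put a number on the quality of the approximation is to compare each exact probability with the area under the normal curve over the matching interval. The following is a rough sketch under stated assumptions: the interval for each value k/N is taken to run from (k - 0.5)/N to (k + 0.5)/N, and normal_cdf is a helper written here from math.erf; neither choice comes from the original text.

```python
import math

N = 10     # sample size (from the text)
PI = 0.60  # population proportion (from the text)

mu_p = PI                               # mean of the sampling distribution of p
sigma_p = math.sqrt(PI * (1 - PI) / N)  # standard error of p


def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


rows = []
for k in range(N + 1):
    exact = math.comb(N, k) * PI**k * (1 - PI) ** (N - k)
    # Assumed interval: half a step (0.5/N) on either side of p = k/N.
    lower = (k - 0.5) / N
    upper = (k + 0.5) / N
    approx = normal_cdf(upper, mu_p, sigma_p) - normal_cdf(lower, mu_p, sigma_p)
    rows.append((k / N, exact, approx))

for p_value, exact, approx in rows:
    print(f"p = {p_value:.2f}  exact = {exact:.4f}  normal = {approx:.4f}")

max_gap = max(abs(exact - approx) for _, exact, approx in rows)
print(f"largest absolute difference: {max_gap:.4f}")
```

The printed table gives the same comparison that the figure shows graphically, with max_gap summarizing the worst disagreement between the bars and the curve.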


Source: David M. Lane, https://onlinestatbook.com/2/sampling_distributions/samp_dist_p.html
This work is in the Public Domain.