Sampling Distribution of p

Learning Objectives

  1. Compute the mean and standard deviation of the sampling distribution of \(p\)
  2. State the relationship between the sampling distribution of \(p\) and the normal distribution

Assume that in an election race between Candidate \(A\) and Candidate \(B\), \(0.60\) of the voters prefer Candidate \(A\). If a random sample of \(10\) voters were polled, it is unlikely that exactly \(60\%\) of them \((6)\) would prefer Candidate \(A\). By chance, the proportion in the sample preferring Candidate \(A\) could easily be a little lower or a little higher than \(0.60\). The sampling distribution of \(p\) is the distribution that would result if you repeatedly sampled \(10\) voters and determined the proportion \((p)\) that favored Candidate \(A\).

The sampling distribution of \(p\) is a special case of the sampling distribution of the mean. Table 1 shows a hypothetical random sample of \(10\) voters. Those who prefer Candidate \(A\) are given scores of \(1\) and those who prefer Candidate \(B\) are given scores of \(0\). Note that seven of the voters prefer Candidate \(A\), so the sample proportion \((p)\) is

\(p=7 / 10=0.70\)

As you can see, \(p\) is the mean of the \(10\) preference scores.

Table 1. Sample of voters.

Voter Preference
1 1
2 0
3 1
4 1
5 1
6 0
7 1
8 0
9 1
10 1
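
The claim that \(p\) is just the mean of the preference scores can be checked directly. The following is a minimal sketch using the scores from Table 1; the variable names are illustrative and not part of the original text.

```python
# Preference scores from Table 1: 1 = prefers Candidate A, 0 = prefers Candidate B
preferences = [1, 0, 1, 1, 1, 0, 1, 0, 1, 1]

total_successes = sum(preferences)      # number of sampled voters favoring A
sample_size = len(preferences)          # N
p = total_successes / sample_size       # sample proportion = mean of the scores

print(total_successes, sample_size, p)
```

The total (the quantity the binomial distribution describes) and the proportion differ only by the division by the sample size.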

The distribution of \(p\) is closely related to the binomial distribution. The binomial distribution is the distribution of the total number of successes (favoring Candidate \(A\), for example), whereas the distribution of \(p\) is the distribution of the mean number of successes. The mean, of course, is the total divided by the sample size, \(N\). Therefore, the sampling distribution of \(p\) and the binomial distribution differ in that \(p\) is the mean of the scores \((0.70)\), whereas the binomial distribution deals with the total number of successes \((7)\).

The binomial distribution has a mean of:

\(\mu=N \pi\)

Dividing by \(N\) to adjust for the fact that the sampling distribution of \(p\) is dealing with means instead of totals, we find that the mean of the sampling distribution of \(p\) is:

\(\mu_{p}=\pi\)

The standard deviation of the binomial distribution is:

\(\sqrt{N \pi(1-\pi)}\)

Dividing by \(N\) because \(p\) is a mean, not a total, we find the standard error of \(p\):

\(\sigma_{p}=\frac{\sqrt{N \pi(1-\pi)}}{N}=\sqrt{\frac{\pi(1-\pi)}{N}}\)
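
One way to convince yourself that dividing the binomial mean and standard deviation by \(N\) gives the right answer is to compute the moments of \(p\) by direct summation over the binomial probabilities. The sketch below does this; the function names and the values \(N=25\), \(\pi=0.30\) are assumptions chosen only for illustration and do not come from the text.

```python
import math

def binomial_pmf(k, n, pi):
    """Probability of exactly k successes in n trials with success probability pi."""
    return math.comb(n, k) * pi**k * (1 - pi) ** (n - k)

def moments_of_p(n, pi):
    """Mean and standard deviation of p = k / n, summed over all possible k."""
    probs = [binomial_pmf(k, n, pi) for k in range(n + 1)]
    mean = sum((k / n) * pr for k, pr in enumerate(probs))
    var = sum((k / n - mean) ** 2 * pr for k, pr in enumerate(probs))
    return mean, math.sqrt(var)

n, pi = 25, 0.30                              # illustrative values only
mean_p, sd_p = moments_of_p(n, pi)
formula_sd = math.sqrt(pi * (1 - pi) / n)     # standard error formula for p

print(mean_p, sd_p, formula_sd)
```

Up to floating-point rounding, the summed mean should match \(\pi\) and the summed standard deviation should match the formula.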

Returning to the voter example, \(\pi=0.60\) and \(N=10\). (Don't confuse \(\pi=0.60\), the population proportion, with \(p=0.70\), the sample proportion.) Therefore, the mean of the sampling distribution of \(p\) is \(0.60\). The standard error is

\(\sigma_{p}=\sqrt{\frac{0.60(1-0.60)}{10}} \approx 0.155\)
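
The voter-example arithmetic can also be checked by simulating the repeated polling described at the start of the section. This is a rough sketch, not a definitive implementation: the helper names, the seed, and the choice of 20,000 simulated polls are all assumptions made for illustration.

```python
import math
import random
import statistics

def standard_error_of_p(pi, n):
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)

def simulate_sample_proportions(pi, n, num_samples, seed=0):
    """Draw num_samples polls of n voters each and return the proportion favoring A."""
    rng = random.Random(seed)   # fixed seed (an arbitrary choice) for repeatability
    return [
        sum(rng.random() < pi for _ in range(n)) / n
        for _ in range(num_samples)
    ]

pi, n = 0.60, 10
se = standard_error_of_p(pi, n)
props = simulate_sample_proportions(pi, n, num_samples=20000)

print(round(se, 3))                                   # formula value
print(statistics.mean(props), statistics.pstdev(props))  # simulated mean and SD of p
```

With this many simulated polls, the simulated mean and standard deviation should land close to \(0.60\) and the formula value, though the exact figures depend on the random draws.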

The sampling distribution of \(p\) is a discrete rather than a continuous distribution. For example, with an \(N\) of \(10\), it is possible to have a \(p\) of \(0.50\) or a \(p\) of \(0.60\) but not a \(p\) of \(0.55\).
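
To make the discreteness concrete, the sketch below lists every value \(p\) can take when \(N=10\), together with its exact binomial probability for \(\pi=0.60\). The function name is hypothetical and introduced only for this example.

```python
import math

def sampling_distribution_of_p(pi, n):
    """Map each possible sample proportion k / n to its exact binomial probability."""
    return {
        k / n: math.comb(n, k) * pi**k * (1 - pi) ** (n - k)
        for k in range(n + 1)
    }

dist = sampling_distribution_of_p(0.60, 10)

for p, prob in sorted(dist.items()):
    print(f"p = {p:.1f}  probability = {prob:.4f}")

print(0.55 in dist)   # False: 0.55 is not an attainable proportion with N = 10
```

Only the eleven proportions \(0.0, 0.1, \ldots, 1.0\) appear, and their probabilities sum to \(1\).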

The sampling distribution of \(p\) is approximately normally distributed if \(N\) is fairly large and \(\pi\) is not close to \(0\) or \(1\). A rule of thumb is that the approximation is good if both \(N \pi\) and \(N(1-\pi)\) are greater than \(10\). The sampling distribution for the voter example is shown in Figure 1. Note that even though \(N(1-\pi)\) is only \(4\), the approximation is quite good.
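
The comparison between the exact probabilities and the normal approximation can be sketched numerically as well. The code below is one possible way to do it, under an assumption not stated in the text: each exact probability is compared with the normal probability of the interval from \((k-0.5)/N\) to \((k+0.5)/N\). The function names and the `threshold` parameter are illustrative.

```python
import math
from statistics import NormalDist

def exact_and_normal(pi, n):
    """Pair each exact probability of p = k / n with a normal-approximation probability.

    The normal probability is taken over the interval (k - 0.5) / n to (k + 0.5) / n,
    which is an assumption of this sketch rather than something the text specifies.
    """
    normal = NormalDist(mu=pi, sigma=math.sqrt(pi * (1 - pi) / n))
    rows = []
    for k in range(n + 1):
        exact = math.comb(n, k) * pi**k * (1 - pi) ** (n - k)
        approx = normal.cdf((k + 0.5) / n) - normal.cdf((k - 0.5) / n)
        rows.append((k / n, exact, approx))
    return rows

def rule_of_thumb_ok(pi, n, threshold=10):
    """True when both N * pi and N * (1 - pi) exceed the threshold."""
    return n * pi > threshold and n * (1 - pi) > threshold

rows = exact_and_normal(0.60, 10)
max_gap = max(abs(exact - approx) for _, exact, approx in rows)

for p, exact, approx in rows:
    print(f"p = {p:.1f}  exact = {exact:.3f}  normal = {approx:.3f}")
print(rule_of_thumb_ok(0.60, 10))   # False: the voter example does not meet the rule of thumb
```

The voter example fails the rule of thumb, yet the two columns stay fairly close, which is the point the figure illustrates.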


Figure 1. The sampling distribution of \(p\). Vertical bars are the probabilities; the smooth curve is the normal approximation.


Source: David M. Lane, https://onlinestatbook.com/2/sampling_distributions/samp_dist_p.html
This work is in the Public Domain.