# The Mean, Standard Deviation, and Sampling Distribution of the Sample Mean

 Site: Saylor Academy Course: MA121: Introduction to Statistics Book: The Mean, Standard Deviation, and Sampling Distribution of the Sample Mean

## Description

This section gives several concrete examples of calculating the exact distributions of the sample mean. The corresponding means and standard deviations are computed for demonstration based on these distributions. Next, it discusses sampling distributions of sample means when the sample size is large. It also considers the case when the population is normal. Finally, it uses the central limit theorem for large sample approximations.

## The Mean and Standard Deviation of the Sample Mean

### Learning Objectives

1. To become familiar with the concept of the probability distribution of the sample mean.
2. To understand the meaning of the formulas for the mean and standard deviation of the sample mean.

Suppose we wish to estimate the mean $\mu$ of a population. In actual practice we would typically take just one sample. Imagine, however, that we take sample after sample, all of the same size $n$, and compute the sample mean $\bar{x}$ of each one. We will likely get a different value of $\bar{x}$ each time. The sample mean $\bar{x}$ is a random variable: it varies from sample to sample in a way that cannot be predicted with certainty. We will write $\bar{X}$ when the sample mean is thought of as a random variable, and write $\bar{x}$ for the values that it takes. The random variable $\bar{X}$ has a mean, denoted $\mu_{\bar{X}}$, and a standard deviation, denoted $\sigma_{\bar{X}}$. Here is an example with such a small population and small sample size that we can actually write down every single sample.

### Example 1

A rowing team consists of four rowers who weigh $152,156,160$, and $164$ pounds. Find all possible random samples with replacement of size two and compute the sample mean for each one. Use them to find the probability distribution, the mean, and the standard deviation of the sample mean $\bar{X}$.

Solution

The following table shows all possible samples with replacement of size two, along with the mean of each:

$\begin{array}{|c|c|c|c|c|c|c|c|}\hline \text { Sample } & \text { Mean } & \text { Sample } & \text { Mean } & \text { Sample } & \text { Mean } & \text { Sample } & \text { Mean } \\\hline 152,152 & 152 & 156,152 & 154 & 160,152 & 156 & 164,152 & 158 \\\hline 152,156 & 154 & 156,156 & 156 & 160,156 & 158 & 164,156 & 160 \\\hline 152,160 & 156 & 156,160 & 158 & 160,160 & 160 & 164,160 & 162 \\\hline 152,164 & 158 & 156,164 & 160 & 160,164 & 162 & 164,164 & 164 \\\hline\end{array}$

The table shows that there are seven possible values of the sample mean $\bar{X}$. The value $\bar{x}=152$ happens only one way (the rower weighing $152$ pounds must be selected both times), as does the value $\bar{x}=164$, but the other values happen more than one way, and hence are more likely to be observed than $152$ and $164$ are. Since the $16$ samples are equally likely, we obtain the probability distribution of the sample mean just by counting:

$\begin{array}{c|ccccccc}\bar{x} & 152 & 154 & 156 & 158 & 160 & 162 & 164 \\\hline P(\bar{x}) & \frac{1}{16} & \frac{2}{16} & \frac{3}{16} & \frac{4}{16} & \frac{3}{16} & \frac{2}{16} & \frac{1}{16}\end{array}$

Now we apply the formulas from Section 4.2.2 "The Mean and Standard Deviation of a Discrete Random Variable" in Chapter 4 "Discrete Random Variables" for the mean and standard deviation of a discrete random variable to $\bar{X}$. For $\mu_{\bar{X}}$ we obtain

\begin{aligned}\mu_{\bar{X}} &=\Sigma \bar{x} P(\bar{x}) \\&=152\left(\frac{1}{16}\right)+154\left(\frac{2}{16}\right)+156\left(\frac{3}{16}\right)+158\left(\frac{4}{16}\right)+160\left(\frac{3}{16}\right)+162\left(\frac{2}{16}\right)+164\left(\frac{1}{16}\right) \\&=158\end{aligned}

For $\sigma_{\bar{X}}$ we first compute $\Sigma \bar{x}^{2} P(\bar{x})$:

$152^{2}\left(\frac{1}{16}\right)+154^{2}\left(\frac{2}{16}\right)+156^{2}\left(\frac{3}{16}\right)+158^{2}\left(\frac{4}{16}\right)+160^{2}\left(\frac{3}{16}\right)+162^{2}\left(\frac{2}{16}\right)+164^{2}\left(\frac{1}{16}\right)$

which is $24,974$, so that

$\sigma_{\bar{X}}=\sqrt{\Sigma \bar{x}^{2} P(\bar{x})-\mu_{\bar{X}}^{2}}=\sqrt{24,974-158^{2}}=\sqrt{10}$

The mean and standard deviation of the population $\{152,156,160,164\}$ in the example are $\mu=158$ and $\sigma=\sqrt{20}$. The mean of the sample mean $\bar{X}$ that we have just computed is exactly the mean of the population. The standard deviation of the sample mean $\bar{X}$ that we have just computed is the standard deviation of the population divided by the square root of the sample size: $\sqrt{10}=\sqrt{20} / \sqrt{2}$. These relationships are not coincidences, but are illustrations of the following formulas.
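The enumeration in Example 1 can also be checked by brute force. The following is a minimal sketch, not part of the original solution; the variable names are illustrative, and exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

weights = [152, 156, 160, 164]  # the four rowers from Example 1
n = 2                           # sample size

# Every ordered sample of size n drawn with replacement (4**2 = 16 of them).
samples = list(product(weights, repeat=n))

# Tally the sample means; each sample is equally likely (probability 1/16).
dist = {}
for s in samples:
    xbar = Fraction(sum(s), n)
    dist[xbar] = dist.get(xbar, 0) + Fraction(1, len(samples))

# Mean and variance of the sample mean from its probability distribution.
mu_xbar = sum(x * p for x, p in dist.items())
var_xbar = sum(x**2 * p for x, p in dist.items()) - mu_xbar**2

for x in sorted(dist):
    print(x, dist[x])           # fractions print in lowest terms, e.g. 1/8 for 2/16
print("mean of the sample mean:", mu_xbar)
print("variance of the sample mean:", var_xbar)
print("standard deviation of the sample mean:", sqrt(var_xbar))
```

The variance comes out as a whole number, so the standard deviation printed on the last line is the decimal value of its square root.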

Suppose random samples of size $n$ are drawn from a population with mean $\mu$ and standard deviation $\sigma$. The mean $\mu_{\bar{X}}$ and standard deviation $\sigma_{\bar{X}}$ of the sample mean $\bar{X}$ satisfy

$\mu_{\bar{X}}=\mu \quad \text { and } \quad \sigma_{\bar{X}}=\frac{\sigma}{\sqrt{n}}$

The first formula says that if we could take every possible sample from the population and compute the corresponding sample mean, then those numbers would center at the number we wish to estimate, the population mean $\mu$.

The second formula says that averages computed from samples vary less than individual measurements on the population do, and quantifies the relationship.
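A brief sketch of why these formulas hold, under the assumption (not stated explicitly in the text) that the $n$ observations $X_{1}, \ldots, X_{n}$ in a sample are independent draws from the population, as in sampling with replacement. Here $E$ and $\operatorname{Var}$ denote the mean and the variance of a random variable; this notation is introduced only for this sketch.

$\begin{aligned}\mu_{\bar{X}} &=E\left(\frac{X_{1}+\cdots+X_{n}}{n}\right)=\frac{1}{n}\left(E\left(X_{1}\right)+\cdots+E\left(X_{n}\right)\right)=\frac{n \mu}{n}=\mu \\ \sigma_{\bar{X}}^{2} &=\operatorname{Var}\left(\frac{X_{1}+\cdots+X_{n}}{n}\right)=\frac{1}{n^{2}}\left(\operatorname{Var}\left(X_{1}\right)+\cdots+\operatorname{Var}\left(X_{n}\right)\right)=\frac{n \sigma^{2}}{n^{2}}=\frac{\sigma^{2}}{n}\end{aligned}$

Taking the square root of the second line gives $\sigma_{\bar{X}}=\sigma / \sqrt{n}$. The step that splits the variance of the sum into a sum of variances is the one that relies on independence.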

### Example 2

The mean and standard deviation of the tax value of all vehicles registered in a certain state are $\mu=\$13,525$ and $\sigma=\$4,180$. Suppose random samples of size $100$ are drawn from the population of vehicles. What are the mean $\mu_{\bar{X}}$ and standard deviation $\sigma_{\bar{X}}$ of the sample mean $\bar{X}$?

Solution

Since $n=100$, the formulas yield

$\mu_{\bar{X}}=\mu=\$13,525 \quad \text { and } \quad \sigma_{\bar{X}}=\frac{\sigma}{\sqrt{n}}=\frac{\$4,180}{\sqrt{100}}=\$418$
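The same arithmetic can be wrapped in a small helper to see how the standard deviation of the sample mean shrinks as $n$ grows. This is an illustrative sketch: the function name `sample_mean_parameters` and the extra sample sizes $25$ and $400$ are assumptions added here, not part of the example.

```python
from math import sqrt

def sample_mean_parameters(mu, sigma, n):
    """Return (mean, standard deviation) of the sample mean for samples of size n."""
    return mu, sigma / sqrt(n)

mu, sigma = 13525, 4180  # population values from Example 2, in dollars

# n = 100 is the case worked in the example; 25 and 400 are added for comparison.
for n in (25, 100, 400):
    mean, sd = sample_mean_parameters(mu, sigma, n)
    print(f"n={n}: mean={mean}, standard deviation={sd:.2f}")
```

Quadrupling the sample size halves the standard deviation of the sample mean, because $n$ enters the formula through its square root.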

### Key Takeaways

• The sample mean is a random variable; as such it is written $\bar{X}$, and $\bar{x}$ stands for individual values it takes.
• As a random variable the sample mean has a probability distribution, a mean $\mu_{\bar{X}}$, and a standard deviation $\sigma_{\bar{X}}$.
• There are formulas that relate the mean and standard deviation of the sample mean to the mean and standard deviation of the population from which the sample is drawn.

### Exercises

1. Random samples of size $225$ are drawn from a population with mean $100$ and standard deviation $20$. Find the mean and standard deviation of the sample mean.

3. A population has mean $75$ and standard deviation $12$.

a. Random samples of size $121$ are taken. Find the mean and standard deviation of the sample mean.

b. How would the answers to part (a) change if the size of the samples were $400$ instead of $121$?

### Answers

1. $\mu_{\bar{X}}=100, \sigma_{\bar{X}}=1.33$

3. a. $\mu_{\bar{X}}=75, \sigma_{\bar{X}}=1.09$
b. $\mu_{\bar{X}}$ stays the same but $\sigma_{\bar{X}}$ decreases to $0.6$

## The Sampling Distribution of the Sample Mean

### Learning Objectives

1. To learn what the sampling distribution of $\bar{X}$ is when the sample size is large.
2. To learn what the sampling distribution of $\bar{X}$ is when the population is normal.

### The Central Limit Theorem

In Note 6.5 "Example 1" in Section 6.1 "The Mean and Standard Deviation of the Sample Mean" we constructed the probability distribution of the sample mean for samples of size two drawn from the population of four rowers. The probability distribution is:

$\begin{array}{c|ccccccc}\bar{x} & 152 & 154 & 156 & 158 & 160 & 162 & 164 \\\hline P(\bar{x}) & \frac{1}{16} & \frac{2}{16} & \frac{3}{16} & \frac{4}{16} & \frac{3}{16} & \frac{2}{16} & \frac{1}{16}\end{array}$

Figure 6.1 "Distribution of a Population and a Sample Mean" shows a side-by-side comparison of a histogram for the original population and a histogram for this distribution. Whereas the distribution of the population is uniform, the sampling distribution of the mean has a shape approaching the shape of the familiar bell curve. This phenomenon of the sampling distribution of the mean taking on a bell shape even though the population distribution is not bell-shaped happens in general. Here is a somewhat more realistic example.

Figure 6.1 Distribution of a Population and a Sample Mean

Suppose we take samples of size $1$, $5$, $10$, or $20$ from a population that consists entirely of the numbers $0$ and $1$, half the population $0$, half $1$, so that the population mean is $0.5$. The sampling distributions are:

$n=1$:

$\begin{array}{l|cc}\bar{x} & 0 & 1 \\\hline P(\bar{x}) & 0.5 & 0.5\end{array}$

$n = 5$:

$\begin{array}{l|cccccc}\bar{x} & 0 & 0.2 & 0.4 & 0.6 & 0.8 & 1 \\\hline P(\bar{x}) & 0.03 & 0.16 & 0.31 & 0.31 & 0.16 & 0.03\end{array}$

$n = 10$:

$\begin{array}{l|ccccccccccc}\bar{x} & 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\\hline P(\bar{x}) & 0.00 & 0.01 & 0.04 & 0.12 & 0.21 & 0.25 & 0.21 & 0.12 & 0.04 & 0.01 & 0.00\end{array}$

$n = 20$:

\begin{aligned}&\begin{array}{l|ccccccccccc}\bar{x} & 0 & 0.05 & 0.10 & 0.15 & 0.20 & 0.25 & 0.30 & 0.35 & 0.40 & 0.45 & 0.50 \\\hline P(\bar{x}) & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.01 & 0.04 & 0.07 & 0.12 & 0.16 & 0.18\end{array}\\&\begin{array}{l|llllllllll}\bar{x} & 0.55 & 0.60 & 0.65 & 0.70 & 0.75 & 0.80 & 0.85 & 0.90 & 0.95 & 1 \\\hline P(\bar{x}) & 0.16 & 0.12 & 0.07 & 0.04 & 0.01 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00\end{array}\end{aligned}

Histograms illustrating these distributions are shown in Figure 6.2 "Distributions of the Sample Mean".

Figure 6.2 Distributions of the Sample Mean

As $n$ increases the sampling distribution of $\bar{X}$ evolves in an interesting way: the probabilities on the lower and the upper ends shrink and the probabilities in the middle become larger in relation to them. If we were to continue to increase $n$ then the shape of the sampling distribution would become smoother and more bell-shaped.
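The probabilities in these tables can be computed directly. The sketch below assumes each of the $n$ observations is independently $0$ or $1$ with probability $0.5$, so that the number of $1$s in a sample follows a binomial distribution; the function name is illustrative.

```python
from math import comb

def sampling_distribution(n):
    """Exact distribution of the sample mean for n draws from a half-0, half-1 population."""
    # A sample contains k ones with probability C(n, k) / 2**n,
    # and its sample mean is then k / n.
    return {k / n: comb(n, k) / 2**n for k in range(n + 1)}

for n in (1, 5, 10, 20):
    print(f"n = {n}")
    for xbar, p in sampling_distribution(n).items():
        print(f"  x-bar = {xbar:.2f}   P = {p:.2f}")
```

Printed to two decimal places, these values should agree with the tables for $n=1$, $5$, $10$, and $20$.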

What we are seeing in these examples does not depend on the particular population distributions involved. In general, one may start with any distribution and the sampling distribution of the sample mean will increasingly resemble the bell-shaped normal curve as the sample size increases. This is the content of the Central Limit Theorem.

### The Central Limit Theorem

For samples of size $30$ or more, the sample mean is approximately normally distributed, with mean $\mu_{\bar{X}}=\mu$ and standard deviation $\sigma_{\bar{X}}=\sigma / \sqrt{n}$, where $n$ is the sample size. The larger the sample size, the better the approximation.

The Central Limit Theorem is illustrated for several common population distributions in Figure 6.3 "Distribution of Populations and Sample Means".

Figure 6.3 Distribution of Populations and Sample Means

The dashed vertical lines in the figures locate the population mean. Regardless of the distribution of the population, as the sample size is increased the shape of the sampling distribution of the sample mean becomes increasingly bell-shaped, centered on the population mean. Typically by the time the sample size is $30$ the distribution of the sample mean is practically the same as a normal distribution.
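A quick simulation can make this behavior concrete. The sketch below is an illustration under stated assumptions, not a reproduction of the figure: it uses an exponential population with mean $1$ and standard deviation $1$ (chosen here only because it is strongly skewed), and the function name, number of trials, and seed are arbitrary.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so that a run is repeatable

def simulate_sample_means(n, trials=10_000):
    """Draw `trials` samples of size n from an exponential population; return their means."""
    # random.expovariate(1.0) has population mean 1 and population standard deviation 1.
    return [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(trials)]

for n in (1, 5, 30):
    means = simulate_sample_means(n)
    print(f"n={n}: mean of sample means={mean(means):.3f}, "
          f"sd of sample means={stdev(means):.3f}, "
          f"sigma/sqrt(n)={1 / n**0.5:.3f}")
```

For each $n$ the simulated mean should be close to $1$ and the simulated standard deviation close to $\sigma / \sqrt{n}$; a histogram of `means` would show the skew fading as $n$ grows.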

The importance of the Central Limit Theorem is that it allows us to make probability statements about the sample mean, specifically in relation to its value in comparison to the population mean, as we will see in the examples. But to use the result properly we must first realize that there are two separate random variables (and therefore two probability distributions) at play:

1. $X$, the measurement of a single element selected at random from the population; the distribution of $X$ is the distribution of the population, with mean the population mean $\mu$ and standard deviation the population standard deviation $\sigma$;
2. $\bar{X}$, the mean of the measurements in a sample of size $n$; the distribution of $\bar{X}$ is its sampling distribution, with mean $\mu_{\bar{X}}=\mu$ and standard deviation $\sigma_{\bar{X}}=\sigma / \sqrt{n}$.

### Example 3

Let $\bar{X}$ be the mean of a random sample of size $50$ drawn from a population with mean $112$ and standard deviation $40$.

a. Find the mean and standard deviation of $\bar{X}$.
b. Find the probability that $\bar{X}$ assumes a value between $110$ and $114$.
c. Find the probability that $\bar{X}$ assumes a value greater than $113$.

Solution:

a. By the formulas in the previous section

$\mu_{\bar{X}}=\mu=112 \text { and } \sigma_{\bar{X}}=\frac{\sigma}{\sqrt{n}}=\frac{40}{\sqrt{50}}=5.65685$

b. Since the sample size is at least $30$, the Central Limit Theorem applies: $\bar{X}$ is approximately normally distributed. We compute probabilities using Figure 12.2 "Cumulative Normal Probability" in the usual way, just being careful to use $\sigma_{\bar{X}}$ and not $\sigma$ when we standardize:

\begin{aligned} P(110 < \bar{X} < 114) &=P\left(\frac{110-\mu_{\bar{X}}}{\sigma_{\bar{X}}} < Z < \frac{114-\mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) \\ &=P\left(\frac{110-112}{5.65685} < Z < \frac{114-112}{5.65685}\right) \\ &=P(-0.35 < Z < 0.35)=0.6368-0.3632=0.2736 \end{aligned}

c. Similarly

\begin{aligned}P(\bar{X} > 113) &=P\left(Z > \frac{113-\mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) \\&=P\left(Z > \frac{113-112}{5.65685}\right) \\&=P(Z > 0.18) \\&=1-P(Z < 0.18)=1-0.5714=0.4286\end{aligned}

Note that if in Note 6.11 "Example 3" we had been asked to compute the probability that the value of a single randomly selected element of the population exceeds $113$, that is, to compute the number $P(X > 113)$, we would not have been able to do so, since we do not know the distribution of $X$, but only that its mean is $112$ and its standard deviation is $40$. By contrast we could compute $P(\bar{X} > 113)$ even without complete knowledge of the distribution of $X$ because the Central Limit Theorem guarantees that $\bar{X}$ is approximately normal.
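The probabilities in Example 3 can also be computed without a table. The following sketch uses `NormalDist` from Python's standard `statistics` module as a stand-in for the cumulative normal table; the variable names are illustrative. Because it does not round $z$ to two decimal places first, its results can differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 112, 40, 50  # population mean, population sd, sample size

# Approximate sampling distribution of the sample mean (Central Limit Theorem).
xbar = NormalDist(mu, sigma / sqrt(n))

p_between = xbar.cdf(114) - xbar.cdf(110)  # P(110 < X-bar < 114)
p_above = 1 - xbar.cdf(113)                # P(X-bar > 113)

print(f"P(110 < X-bar < 114) is about {p_between:.4f}")
print(f"P(X-bar > 113) is about {p_above:.4f}")
```

Note that the standard deviation passed to `NormalDist` is $\sigma / \sqrt{n}$, not $\sigma$, which mirrors the caution in part (b).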

### Example 4

The numerical population of grade point averages at a college has mean $2.61$ and standard deviation $0.5$. If a random sample of size $100$ is taken from the population, what is the probability that the sample mean will be between $2.51$ and $2.71$?

Solution

The sample mean $\bar{X}$ has mean $\mu_{\bar{X}}=\mu=2.61$ and standard deviation $\sigma_{\bar{X}}=\sigma / \sqrt{n}=0.5 / 10=0.05$, so

\begin{aligned}P(2.51 < \bar{X} < 2.71) &=P\left(\frac{2.51-\mu_{\bar{X}}}{\sigma_{\bar{X}}} < Z < \frac{2.71-\mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) \\&=P\left(\frac{2.51-2.61}{0.05} < Z < \frac{2.71-2.61}{0.05}\right) \\&=P(-2 < Z < 2) \\&=P(Z < 2)-P(Z < -2) \\&=0.9772-0.0228=0.9544\end{aligned}

### Normally Distributed Populations

The Central Limit Theorem says that no matter what the distribution of the population is, as long as the sample is "large," meaning of size $30$ or more, the sample mean is approximately normally distributed. If the population is normal to begin with then the sample mean also has a normal distribution, regardless of the sample size.

For samples of any size drawn from a normally distributed population, the sample mean is normally distributed, with mean $\mu_{\bar{X}}=\mu$ and standard deviation $\sigma_{\bar{X}}=\sigma / \sqrt{n}$, where $n$ is the sample size.

The effect of increasing the sample size is shown in Figure 6.4 "Distribution of Sample Means for a Normal Population".

Figure 6.4 Distribution of Sample Means for a Normal Population

### Example 5

A prototype automotive tire has a design life of $38,500$ miles with a standard deviation of $2,500$ miles. Five such tires are manufactured and tested. On the assumption that the actual population mean is $38,500$ miles and the actual population standard deviation is $2,500$ miles, find the probability that the sample mean will be less than $36,000$ miles. Assume that the distribution of lifetimes of such tires is normal.

Solution:

For simplicity we use units of thousands of miles. Then the sample mean $\bar{X}$ has mean $\mu_{\bar{X}}=\mu=38.5$ and standard deviation $\sigma_{\bar{X}}=\sigma / \sqrt{n}=2.5 / \sqrt{5}=1.11803$. Since the population is normally distributed, so is $\bar{X}$, hence

\begin{aligned}P(\bar{X} < 36) &=P\left(Z < \frac{36-\mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) \\&=P\left(Z < \frac{36-38.5}{1.11803}\right) \\&=P(Z < -2.24)=0.0125\end{aligned}

That is, if the tires perform as designed, there is only about a $1.25 \%$ chance that the average of a sample of this size would be so low.

### Example 6

An automobile battery manufacturer claims that its midgrade battery has a mean life of $50$ months with a standard deviation of $6$ months. Suppose the distribution of battery lives of this particular brand is approximately normal.

a. On the assumption that the manufacturer's claims are true, find the probability that a randomly selected battery of this type will last less than $48$ months.
b. On the same assumption, find the probability that the mean of a random sample of $36$ such batteries will be less than $48$ months.

Solution

a. Since the population is known to have a normal distribution

\begin{aligned} P(X < 48) &=P\left(Z < \frac{48-\mu}{\sigma}\right)=P\left(Z < \frac{48-50}{6}\right) \\ &=P(Z < -0.33)=0.3707 \end{aligned}

b. The sample mean has mean $\mu_{\bar{X}}=\mu=50$ and standard deviation $\sigma_{\bar{X}}=\sigma / \sqrt{n}=6 / \sqrt{36}=1$. Thus

\begin{aligned}P(\bar{X} < 48) &=P\left(Z < \frac{48-\mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) \\&=P\left(Z < \frac{48-50}{1}\right) \\&=P(Z < -2)=0.0228\end{aligned}
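The contrast between the two parts of Example 6 is easy to see side by side. The sketch below again uses `NormalDist` from Python's standard `statistics` module in place of the table, with illustrative variable names; part (a) may differ slightly from the table-based answer because $z$ is not rounded to $-0.33$.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 6, 36  # claimed mean life, standard deviation, sample size

single_battery = NormalDist(mu, sigma)         # distribution of X
sample_mean = NormalDist(mu, sigma / sqrt(n))  # distribution of X-bar

p_single = single_battery.cdf(48)  # P(X < 48)
p_mean = sample_mean.cdf(48)       # P(X-bar < 48)

print(f"P(X < 48) is about {p_single:.4f}")
print(f"P(X-bar < 48) is about {p_mean:.4f}")
```

A single battery falling short of $48$ months is unremarkable, while an average that low across $36$ batteries would be rare if the manufacturer's claims were true.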

### Key Takeaways

• When the sample size is at least $30$ the sample mean is approximately normally distributed, whatever the distribution of the population.
• When the population is normal the sample mean is normally distributed regardless of the sample size.

### Basic

1. A population has mean $128$ and standard deviation $22$.

a. Find the mean and standard deviation of $\bar{X}$ for samples of size $36$.
b. Find the probability that the mean of a sample of size $36$ will be within $10$ units of the population mean, that is, between $118$ and $138$.

3. A population has mean $73.5$ and standard deviation $2.5$.

a. Find the mean and standard deviation of $\bar{X}$ for samples of size $30$.
b. Find the probability that the mean of a sample of size $30$ will be less than $72$.

5. A normally distributed population has mean $25.6$ and standard deviation $3.3$.

a. Find the probability that a single randomly selected element $X$ of the population exceeds $30$.
b. Find the mean and standard deviation of $\bar{X}$ for samples of size $9$.
c. Find the probability that the mean of a sample of size $9$ drawn from this population exceeds $30$.

7. A population has mean $557$ and standard deviation $35$.

a. Find the mean and standard deviation of $\bar{X}$ for samples of size $50$.
b. Find the probability that the mean of a sample of size $50$ will be more than $570$.

9. A normally distributed population has mean $1,214$ and standard deviation $122$.

a. Find the probability that a single randomly selected element $X$ of the population is between $1,100$ and $1,300$.
b. Find the mean and standard deviation of $\bar{X}$ for samples of size $25$.
c. Find the probability that the mean of a sample of size $25$ drawn from this population is between $1,100$ and $1,300$.

11. A population has mean $72$ and standard deviation $6$.

a. Find the mean and standard deviation of $\bar{X}$ for samples of size $45$.
b. Find the probability that the mean of a sample of size $45$ will differ from the population mean $72$ by at least $2$ units, that is, is either less than $70$ or more than $74$. (Hint: One way to solve the problem is to first find the probability of the complementary event.)

### Applications

13. Suppose the mean number of days to germination of a variety of seed is $22$, with standard deviation $2.3$ days. Find the probability that the mean germination time of a sample of $160$ seeds will be within $0.5$ day of the population mean.

15. Suppose the mean amount of cholesterol in eggs labeled "large" is $186$ milligrams, with standard deviation $7$ milligrams. Find the probability that the mean amount of cholesterol in a sample of $144$ eggs will be within $2$ milligrams of the population mean.

17. Suppose speeds of vehicles on a particular stretch of roadway are normally distributed with mean $36.6$ mph and standard deviation $1.7$ mph.

a. Find the probability that the speed $X$ of a randomly selected vehicle is between $35$ and $40$ mph.
b. Find the probability that the mean speed $\bar{X}$ of $20$ randomly selected vehicles is between $35$ and $40$ mph.

19. Suppose the mean cost across the country of a $30$-day supply of a generic drug is $\$46.58$, with standard deviation $\$4.84$. Find the probability that the mean of a sample of $100$ prices of $30$-day supplies of this drug will be between $\$45$ and $\$50$.

21. Scores on a common final exam in a large enrollment, multiple-section freshman course are normally distributed with mean $72.7$ and standard deviation $13.1$.

a. Find the probability that the score $X$ on a randomly selected exam paper is between $70$ and $80$.
b. Find the probability that the mean score $\bar{X}$ of $38$ randomly selected exam papers is between $70$ and $80$.

23. Suppose that in a certain region of the country the mean duration of first marriages that end in divorce is $7.8$ years, standard deviation $1.2$ years. Find the probability that in a sample of $75$ divorces, the mean age of the marriages is at most $8$ years.

25. A high-speed packing machine can be set to deliver between $11$ and $13$ ounces of a liquid. For any delivery setting in this range the amount delivered is normally distributed with mean some amount $\mu$ and with standard deviation $0.08$ ounce. To calibrate the machine it is set to deliver a particular amount, many containers are filled, and $25$ containers are randomly selected and the amount they contain is measured. Find the probability that the sample mean will be within $0.05$ ounce of the actual mean amount being delivered to all containers.

### Answers

1. a. $\mu_{\bar{X}}=128, \sigma_{\bar{X}}=3.67$

b. $0.9936$

3. a. $\mu_{\bar{X}}=73.5, \sigma_{\bar{X}}=0.456$

b. $0.0005$

5. a. $0.0918$
b. $\mu_{\bar{X}}=25.6, \sigma_{\bar{X}}=1.1$

c. $0.0000$

7. a. $\mu_{\bar{X}}=557, \sigma_{\bar{X}}=4.9497$

b. $0.0043$

9. a. $0.5818$
b. $\mu_{\bar{X}}=1214, \sigma_{\bar{X}}=24.4$

c. $0.9998$

11. a. $\mu_{\bar{X}}=72, \sigma_{\bar{X}}=0.8944$

b. $0.0250$

13. $0.9940$

15. $0.9994$

17. a. $0.8036$

b. $1.0000$

19. $0.9994$

21. a. $0.2955$
b. $0.8977$

23. $0.9251$

25. $0.9982$