Measures of Central Location

This section elaborates on mean, median, and mode at the population level and sample level. This section also contains many interesting examples of range, variance, and standard deviation. Complete the exercises and check your answers.

The Mean

The first measure of central location is the usual "average" that is familiar to everyone. In the formula in the following definition we introduce the standard summation notation \Sigma, where \Sigma is the capital Greek letter sigma. In general, the notation \Sigma followed by a second mathematical symbol means to add up all the values that the second symbol can take in the context of the problem. Here is an example to illustrate this.


EXAMPLE 1

Find \Sigma x, \Sigma x^{2}, and \Sigma(x-1)^{2} for the data set

 1 \quad 3 \quad 4 


Solution:

\begin{aligned} \Sigma x &=1+3+4=8 \\ \Sigma x^{2} &=1^{2}+3^{2}+4^{2}=1+9+16=26 \\ \Sigma(x-1)^{2} &=(1-1)^{2}+(3-1)^{2}+(4-1)^{2}=0^{2}+2^{2}+3^{2}=13 \end{aligned}
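
The three sums in Example 1 can also be checked numerically. The following is a minimal Python sketch, not part of the original text; the variable names are chosen only for illustration. Each line mirrors one use of the \Sigma notation: apply the expression to every value in the data set, then add the results.

```python
# Data set from Example 1
data = [1, 3, 4]

# Sigma x: add up the values themselves
sum_x = sum(x for x in data)

# Sigma x^2: square each value, then add
sum_x_squared = sum(x**2 for x in data)

# Sigma (x-1)^2: subtract 1 from each value, square, then add
sum_shifted_squared = sum((x - 1)**2 for x in data)

print(sum_x, sum_x_squared, sum_shifted_squared)  # prints: 8 26 13
```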

In the definition we follow the convention of using lowercase n to denote the number of measurements in a sample, which is called the sample size.


Definition

The sample mean of a set of n sample data is the number \bar{x} defined by the formula

\bar{x}=\frac{\Sigma x}{n}


EXAMPLE 2

Find the mean of the sample data

2 \quad -1 \quad 0 \quad 2


Solution:

\bar{x}=\frac{\Sigma x}{n}=\frac{2+(-1)+0+2}{4}=\frac{3}{4}=0.75


EXAMPLE 3

A random sample of ten students is taken from the student body of a college and their GPAs are recorded as follows.

\begin{array}{llllllllll}1.90 & 3.00 & 2.53 & 3.71 & 2.12 & 1.76 & 2.71 & 1.39 & 4.00 & 3.33\end{array}

Find the sample mean.


Solution:

\begin{aligned} \bar{x} &=\frac{\Sigma x}{n}=\frac{1.90+3.00+2.53+3.71+2.12+1.76+2.71+1.39+4.00+3.33}{10} \\ &=\frac{26.45}{10}=2.645 \end{aligned}
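
As a rough check of Examples 2 and 3, the formula \bar{x}=\Sigma x / n can be written as a short Python function. This is a sketch under the assumption that the data are given as a non-empty list of numbers; the name `sample_mean` is hypothetical and not from the original text. Because decimal values such as 1.90 are stored as binary floating-point numbers, the GPA result is rounded to three places before printing.

```python
def sample_mean(data):
    """Return the sum of the values divided by how many there are."""
    n = len(data)          # sample size
    return sum(data) / n   # Sigma x divided by n

# Example 2
example_2 = [2, -1, 0, 2]
print(sample_mean(example_2))  # prints: 0.75

# Example 3: GPAs of ten students
gpas = [1.90, 3.00, 2.53, 3.71, 2.12, 1.76, 2.71, 1.39, 4.00, 3.33]
gpa_mean = sample_mean(gpas)
print(round(gpa_mean, 3))
```

Python's standard library also provides `statistics.mean`, which performs the same calculation.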



EXAMPLE 4

A random sample of 19 women beyond child-bearing age gave the following data, where x is the number of children and f is the frequency of that value, the number of times it occurred in the data set.

 \begin{array}{l|lllllll} x & 0 & 1 & 2 & 3 & 4 \\ \hline f & 3 & 6 & 6 & 3 & 1 \end{array} 

Find the sample mean.


Solution:

In this example the data are presented by means of a data frequency table. Each number in the first line of the table is a number that appears in the data set; the number below it is how many times it occurs. Thus the value 0 is observed three times, that is, three of the measurements in the data set are 0, the value 1 is observed six times, and so on. In the context of the problem this means that three women in the sample have had no children, six have had exactly one child, and so on. The explicit list of all the observations in this data set is therefore

\begin{array}{lllllllllllllllllll}0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 2 & 2 & 2 & 2 & 2 & 2 & 3 & 3 & 3 & 4\end{array}

The sample size can be read directly from the table, without first listing the entire data set, as the sum of the frequencies: n=3+6+6+3+1=19. The sample mean can be computed directly from the table as well:

\bar{x}=\frac{\Sigma x}{n}=\frac{0 \times 3+1 \times 6+2 \times 6+3 \times 3+4 \times 1}{19}=\frac{31}{19} \approx 1.6316
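
The calculation from the frequency table in Example 4 can be sketched in Python as follows. The dictionary `freq` and the other names are illustrative assumptions, not part of the original text. The sketch computes the mean directly from the table and then confirms that expanding the table into the explicit list of observations gives the same sample size and the same total.

```python
# Frequency table from Example 4: number of children x -> frequency f
freq = {0: 3, 1: 6, 2: 6, 3: 3, 4: 1}

# Sample size: the sum of the frequencies
n = sum(freq.values())

# Sum of all observations: each value x counted f times
total = sum(x * f for x, f in freq.items())

mean = total / n
print(n, total, round(mean, 4))  # prints: 19 31 1.6316

# Cross-check: rebuild the explicit list of observations from the table
expanded = [x for x, f in freq.items() for _ in range(f)]
print(len(expanded) == n and sum(expanded) == total)  # prints: True
```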

In the examples above the data sets were described as samples. Therefore the means were sample means, denoted by \bar{x}. If the data come from a census, so that there is a measurement for every element of the population, then the mean is calculated by exactly the same process of summing all the measurements and dividing by how many of them there are, but it is now the population mean and is denoted by \mu, the lower case Greek letter mu.


Definition

The population mean of a set of N population data is the number \mu defined by the formula

\mu=\frac{\Sigma x}{N}

The mean of two numbers is the number that is halfway between them. For example, the average of the numbers 5 and 17 is (5+17) / 2=11, which is 6 units above 5 and 6 units below 17. In this sense the average 11 is the "center" of the data set \{5,17\}. For larger data sets the mean can similarly be regarded as the "center" of the data.
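
One way to make the idea of "center" concrete for larger data sets is to look at the deviations x-\bar{x} of each measurement from the mean: the distances above the mean and the distances below it always balance, so the deviations add up to zero. The Python sketch below illustrates this for the set \{5,17\} and for the data of Example 2. It is only an illustration with hypothetical names, and it uses exact fractions so that rounding does not obscure the result.

```python
from fractions import Fraction

def mean(data):
    """Exact mean of a list of integers, returned as a Fraction."""
    return Fraction(sum(data), len(data))

# The two-number data set from the text
pair = [5, 17]
pair_mean = mean(pair)
pair_deviations = [x - pair_mean for x in pair]
print(pair_mean, [int(d) for d in pair_deviations])  # prints: 11 [-6, 6]

# The data set from Example 2
sample = [2, -1, 0, 2]
sample_mean_value = mean(sample)
deviation_sum = sum(x - sample_mean_value for x in sample)
print(sample_mean_value, deviation_sum)  # prints: 3/4 0
```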