## Numerical Measures of Central Tendency and Variability

Read these sections and complete the questions at the end of each section. First, we will define central tendency and introduce the mean, median, and mode. We will then elaborate on the median and mean and discuss their strengths and weaknesses in measuring central tendency. Finally, we'll address variability, range, interquartile range, variance, and the standard deviation.

### Measures of Variability

#### Variance

Variability can also be defined in terms of how close the scores in the distribution are to the middle of the distribution. Using the mean as the measure of the middle of the distribution, the variance is defined as the average squared difference of the scores from the mean. The data from Quiz 1 are shown in Table 1. The mean score is $\mathrm{7.0}$. Therefore, the column "Deviation from Mean" contains the score minus $\mathrm{7}$. The column "Squared Deviation" is simply the previous column squared.

Table 1. Calculation of Variance for Quiz 1 scores.

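| Scores | Deviation from Mean | Squared Deviation |
|---|---|---|
| 9 | 2 | 4 |
| 9 | 2 | 4 |
| 9 | 2 | 4 |
| 8 | 1 | 1 |
| 8 | 1 | 1 |
| 8 | 1 | 1 |
| 8 | 1 | 1 |
| 7 | 0 | 0 |
| 7 | 0 | 0 |
| 7 | 0 | 0 |
| 7 | 0 | 0 |
| 7 | 0 | 0 |
| 6 | -1 | 1 |
| 6 | -1 | 1 |
| 6 | -1 | 1 |
| 6 | -1 | 1 |
| 6 | -1 | 1 |
| 6 | -1 | 1 |
| 5 | -2 | 4 |
| 5 | -2 | 4 |
| **Means** | | |
| 7 | 0 | 1.5 |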

Notice that the mean deviation from the mean is 0. This will always be the case, because the deviations above the mean exactly cancel the deviations below it. The mean of the squared deviations is $\mathrm{1.5}$. Therefore, the variance is $\mathrm{1.5}$. Analogous calculations with Quiz 2 show that its variance is $\mathrm{6.7}$. The formula for the variance is:

$\sigma^{2}=\frac{\sum(X-\mu)^{2}}{N}$

where $\sigma^{2}$ is the variance, $\mu$ is the mean, and $N$ is the number of scores. For Quiz 1, $\mu=7$ and $N=20$.
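The calculation in Table 1 can be reproduced in a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the original text.

```python
from statistics import fmean, pvariance

# Quiz 1 scores, as listed in Table 1
scores = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]

mu = fmean(scores)                       # mean of the scores
deviations = [x - mu for x in scores]    # "Deviation from Mean" column
squared = [d ** 2 for d in deviations]   # "Squared Deviation" column

mean_deviation = fmean(deviations)       # mean of the deviations
variance = fmean(squared)                # mean of the squared deviations

print(mu, mean_deviation, variance)
print(pvariance(scores))                 # standard-library cross-check (divides by N)
```

The three printed means should match the last row of Table 1, and `pvariance` should agree with the hand-built value because it also divides by $N$.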

If the variance in a sample is used to estimate the variance in a population, then the previous formula underestimates the variance and the following formula should be used:

$s^{2}=\frac{\sum(X-M)^{2}}{N-1}$

where $s^{2}$ is the estimate of the variance and $M$ is the sample mean. Note that $M$ is the mean of a sample taken from a population with a mean of $\mu$. Since, in practice, the variance is usually computed in a sample, this formula is most often used. The simulation "estimating variance" illustrates the bias in the formula with $N$ in the denominator.
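The bias can also be checked without a random simulation. The sketch below is not the simulation mentioned above; it assumes a small hypothetical population (the values 1 through 6, chosen only for illustration), lists every possible sample of size 2 drawn with replacement, and averages the two estimates over all of them.

```python
from itertools import product
from statistics import fmean

# Hypothetical population, chosen only for illustration
population = [1, 2, 3, 4, 5, 6]
n = 2  # sample size


def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = fmean(values)
    return sum((x - m) ** 2 for x in values)


# True population variance (divide by the population size)
pop_variance = sum_sq_dev(population) / len(population)

# Every possible ordered sample of size n, drawn with replacement
samples = list(product(population, repeat=n))

# Average of each estimator over all possible samples
avg_div_n = fmean([sum_sq_dev(s) / n for s in samples])
avg_div_n_minus_1 = fmean([sum_sq_dev(s) / (n - 1) for s in samples])

print(pop_variance, avg_div_n, avg_div_n_minus_1)
```

Under these assumptions, the estimate that divides by $N$ averages out below the population variance, while the estimate that divides by $N-1$ averages out to the population variance itself.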

Let's take a concrete example. Assume the scores 1, 2, 4, and 5 were sampled from a larger population. To estimate the variance in the population you would compute $s^{2}$ as follows:

$$\begin{aligned} M &=(1+2+4+5) / 4=12 / 4=3 \\ s^{2} &=\left[(1-3)^{2}+(2-3)^{2}+(4-3)^{2}+(5-3)^{2}\right] /(4-1) \\ &=(4+1+1+4) / 3=10 / 3=3.333 \end{aligned}$$
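The same estimate can be checked in Python. In the standard library's `statistics` module, `variance` divides by $N-1$ and `pvariance` divides by $N$; the variable names below are illustrative.

```python
from statistics import mean, pvariance, variance

sample = [1, 2, 4, 5]

m = mean(sample)                          # sample mean M
ss = sum((x - m) ** 2 for x in sample)    # sum of squared deviations from M
s2_by_hand = ss / (len(sample) - 1)       # divide by N - 1

s2 = variance(sample)        # standard library: divides by N - 1
sigma2 = pvariance(sample)   # standard library: divides by N

print(m, ss, round(s2_by_hand, 3), round(s2, 3), sigma2)
```

Comparing `s2` with `sigma2` shows how much the choice of denominator matters for a sample this small.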

There are alternate formulas that can be easier to use if you are doing your calculations with a hand calculator. You should note that these formulas are subject to rounding error if your values are very large and/or you have an extremely large number of observations.

$\sigma^{2}=\frac{\sum X^{2}-\frac{\left(\sum X\right)^{2}}{N}}{N}$

and

$s^{2}=\frac{\sum X^{2}-\frac{\left(\sum X\right)^{2}}{N}}{N-1}$
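One way to see where these alternate formulas come from is to expand the squared deviation and substitute $\mu=\sum X / N$:

$$\begin{aligned} \sum(X-\mu)^{2} &=\sum X^{2}-2 \mu \sum X+N \mu^{2} \\ &=\sum X^{2}-\frac{2\left(\sum X\right)^{2}}{N}+\frac{\left(\sum X\right)^{2}}{N} \\ &=\sum X^{2}-\frac{\left(\sum X\right)^{2}}{N} \end{aligned}$$

Dividing this sum by $N$ or by $N-1$ gives the two formulas above. The same algebra holds with the sample mean $M$ in place of $\mu$.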

For this example,

$$\begin{aligned} &\sum X^{2}=1^{2}+2^{2}+4^{2}+5^{2}=46 \\ &\frac{\left(\sum X\right)^{2}}{N}=\frac{(1+2+4+5)^{2}}{4}=\frac{144}{4}=36 \\ &\sigma^{2}=\frac{(46-36)}{4}=2.5 \\ &s^{2}=\frac{(46-36)}{3}=3.333 \text { as with the other formula } \end{aligned}$$
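The rounding-error caveat can be illustrated with a short Python sketch. The function names are hypothetical, and the shifted data set (the same four scores plus one billion) is an assumption made only to produce very large values; shifting every score by a constant leaves the true variance unchanged.

```python
def variance_two_pass(xs):
    """Population variance from the definition: mean squared deviation."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def variance_shortcut(xs):
    """Population variance from the alternate (computational) formula."""
    n = len(xs)
    total = sum(xs)
    sum_sq = sum(x * x for x in xs)
    return (sum_sq - total * total / n) / n


small = [1.0, 2.0, 4.0, 5.0]
# Same scores shifted by one billion; the true variance is still 2.5
large = [x + 1e9 for x in small]

print(variance_two_pass(small), variance_shortcut(small))
print(variance_two_pass(large), variance_shortcut(large))
```

For the small scores the two functions agree. For the shifted scores, the alternate formula subtracts two nearly equal, very large floating-point numbers, so its result can be far from 2.5, while the definition-based version is much less affected.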