# The Difference between Two Means

## Computations for Unequal Sample Sizes (optional)

The calculations are somewhat more complicated when the sample sizes are not equal. One consideration is that MSE, the estimate of variance, weights the group with the larger sample size more heavily than the group with the smaller sample size. Computationally, this is done by pooling the squared deviations of each score from its own group mean into the sum of squares error (SSE):

$\mathrm{SSE}=\sum\left(X-M_{1}\right)^{2}+\sum\left(X-M_{2}\right)^{2}$

where $M_{1}$ is the mean for group 1 and $M_{2}$ is the mean for group 2. Consider the following small example:

Table 4. Unequal $n$.

| Group 1 | Group 2 |
|---------|---------|
| 3       | 2       |
| 4       | 4       |
| 5       |         |

$M_{1}=4 \text { and } M_{2}=3$

$\mathrm{SSE}=(3-4)^{2}+(4-4)^{2}+(5-4)^{2}+(2-3)^{2}+(4-3)^{2}=4$

Then, MSE is computed by:

$\mathrm{MSE}=\mathrm{SSE} / \mathrm{df}$

where the degrees of freedom (df) is computed as before:

$\mathrm{df}=\left(n_{1}-1\right)+\left(n_{2}-1\right)=(3-1)+(2-1)=3$

$\mathrm{MSE}=\mathrm{SSE} / \mathrm{df}=4 / 3=1.333$
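The SSE, df, and MSE steps above can be reproduced in a few lines of R. This is a minimal sketch using the Table 4 scores; the variable names (`group1`, `group2`, `sse`, `df_error`, `mse`) are illustrative choices, not names taken from the text.

```r
# Scores from Table 4 (unequal n)
group1 <- c(3, 4, 5)
group2 <- c(2, 4)

# SSE: squared deviations of each score from its own group mean, pooled
sse <- sum((group1 - mean(group1))^2) + sum((group2 - mean(group2))^2)

# Degrees of freedom: (n1 - 1) + (n2 - 1)
df_error <- (length(group1) - 1) + (length(group2) - 1)

# MSE: pooled estimate of the population variance
mse <- sse / df_error

print(c(SSE = sse, df = df_error, MSE = mse))
```

Because group 1 contributes three squared deviations and group 2 only two, the larger group has more influence on the pooled estimate.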

The formula

$s_{M_{1}-M_{2}}=\sqrt{\dfrac{2\,\mathrm{MSE}}{n}}$

is replaced by

$s_{M_{1}-M_{2}}=\sqrt{\dfrac{2\,\mathrm{MSE}}{n_{h}}}$

where $n_{h}$ is the harmonic mean of the sample sizes and is computed as follows:

$n_{h}=\dfrac{2}{1 / n_{1}+1 / n_{2}}=\dfrac{2}{1 / 3+1 / 2}=2.4$

and

$s_{M_{1}-M_{2}}=\sqrt{\dfrac{(2)(1.333)}{2.4}}=1.054$

Therefore,

$t=(4-3) / 1.054=0.949$

and, with 3 degrees of freedom, the two-tailed $p=0.413$.
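The full calculation can be checked in R as a rough sketch. The variable names below are illustrative assumptions. The final lines compare the hand computation with R's built-in `t.test()` using `var.equal = TRUE`, which pools the variances in the same way.

```r
# Scores from Table 4 (unequal n)
group1 <- c(3, 4, 5)
group2 <- c(2, 4)
n1 <- length(group1)
n2 <- length(group2)
df_error <- (n1 - 1) + (n2 - 1)

# Pooled variance estimate (MSE)
mse <- (sum((group1 - mean(group1))^2) +
        sum((group2 - mean(group2))^2)) / df_error

# Harmonic mean of the two sample sizes
n_h <- 2 / (1 / n1 + 1 / n2)

# Standard error of the difference between means
se_diff <- sqrt(2 * mse / n_h)

# t statistic and two-tailed p value
t_stat <- (mean(group1) - mean(group2)) / se_diff
p_two_tailed <- 2 * pt(-abs(t_stat), df = df_error)

print(round(c(n_h = n_h, se = se_diff, t = t_stat, p = p_two_tailed), 3))

# Cross-check against the built-in pooled-variance t test
check <- t.test(group1, group2, var.equal = TRUE)
print(check)
```

The harmonic-mean form of the standard error is algebraically the same as the pooled form $\sqrt{\mathrm{MSE}\left(1/n_{1}+1/n_{2}\right)}$, so the two approaches should agree.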

##### R code

Data file
t.test(data$WRONG ~ data$GENDER, var.equal = TRUE)
data: data$WRONG by data$GENDER