# The Difference between Two Means

## Computations for Unequal Sample Sizes (optional)

The calculations are somewhat more complicated when the sample sizes are not equal. One consideration is that MSE, the estimate of variance, counts the group with the larger sample size more than the group with the smaller sample size. Computationally, this is done by computing the sum of squares error (SSE) as follows:

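$$SSE = \sum (X - M_1)^2 + \sum (X - M_2)^2$$

where $M_1$ is the mean for group 1 and $M_2$ is the mean for group 2. The first sum runs over the scores in group 1 and the second over the scores in group 2. Consider the following small example: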

Group 1 | Group 2 |
---|---|
3 | 2 |
4 | 4 |
5 | |

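For these data, $M_1 = 4$ and $M_2 = 3$, so

$$SSE = (3-4)^2 + (4-4)^2 + (5-4)^2 + (2-3)^2 + (4-3)^2 = 4$$

Then, MSE is computed by

$$MSE = \frac{SSE}{df}$$

where the degrees of freedom (df) is computed as before:

$$df = (n_1 - 1) + (n_2 - 1) = (3 - 1) + (2 - 1) = 3$$

$$MSE = \frac{SSE}{df} = \frac{4}{3} = 1.333$$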
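The SSE and MSE for the small example can be checked with a few lines of R. This is only a sketch: the vector names `g1` and `g2` are illustrative and do not come from the text.

```r
# Scores from the small example table (names g1 and g2 are illustrative)
g1 <- c(3, 4, 5)
g2 <- c(2, 4)

# SSE: squared deviations of each score from its own group mean, summed
sse <- sum((g1 - mean(g1))^2) + sum((g2 - mean(g2))^2)

# Degrees of freedom: (n1 - 1) + (n2 - 1)
df <- (length(g1) - 1) + (length(g2) - 1)

# MSE: the pooled estimate of variance
mse <- sse / df

print(c(SSE = sse, df = df, MSE = mse))
```

Because each group contributes one squared deviation per score, the group with more scores automatically counts more toward SSE, which is the weighting described above.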

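The formula

$$s_{M_1 - M_2} = \sqrt{\frac{2\,MSE}{n}}$$

is replaced by

$$s_{M_1 - M_2} = \sqrt{\frac{2\,MSE}{n_h}}$$

where $n_h$ is the harmonic mean of the sample sizes and is computed as follows:

$$n_h = \frac{2}{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = \frac{2}{\dfrac{1}{3} + \dfrac{1}{2}} = 2.4$$

and

$$s_{M_1 - M_2} = \sqrt{\frac{(2)(1.333)}{2.4}} = 1.054$$

Therefore,

$$t = \frac{M_1 - M_2}{s_{M_1 - M_2}} = \frac{4 - 3}{1.054} = 0.949$$

and the two-tailed p = 0.413.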
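As a rough check, the harmonic-mean calculation for the small example can be reproduced in R and compared with the built-in pooled-variance test. The variable names (`g1`, `g2`, `n_h`, `se_diff`, and so on) are illustrative assumptions, not names used in the text.

```r
# Scores from the small example (names are illustrative)
g1 <- c(3, 4, 5)
g2 <- c(2, 4)
n1 <- length(g1)
n2 <- length(g2)
df <- n1 + n2 - 2

# Pooled variance estimate: MSE = SSE / df
mse <- (sum((g1 - mean(g1))^2) + sum((g2 - mean(g2))^2)) / df

# Harmonic mean of the two sample sizes
n_h <- 2 / (1 / n1 + 1 / n2)

# Standard error of the difference between means, using n_h in place of n
se_diff <- sqrt(2 * mse / n_h)

# t statistic and two-tailed p value
t_stat <- (mean(g1) - mean(g2)) / se_diff
p_two <- 2 * pt(-abs(t_stat), df = df)

# Cross-check against R's equal-variance two-sample t test
check <- t.test(g1, g2, var.equal = TRUE)

print(round(c(n_h = n_h, se = se_diff, t = t_stat, p = p_two), 3))
print(unname(check$statistic))
```

Since $2/n_h = 1/n_1 + 1/n_2$, the hand calculation and `t.test(..., var.equal = TRUE)` should give the same t statistic.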

##### R code

Data file: `animal.csv`

    # Read the data and run a two-sample t test assuming equal variances
    data <- read.csv(file = "animal.csv")
    t.test(data$WRONG ~ data$GENDER, var.equal = TRUE)

Output:

        Two Sample t-test

    data:  data$WRONG by data$GENDER
    t = 2.5335, df = 32, p-value = 0.01639
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.2882231 2.6529534
    sample estimates:
    mean in group 1 mean in group 2
           5.352941        3.882353