Unit 3: Sampling Distributions
The concept of a sampling distribution lies at the very foundation of statistical inference, and it is easiest to introduce with an example. Suppose you want to estimate a parameter of a population, say the population mean. There are two natural estimators: (1) the sample mean, which is the average of the values in the data set; and (2) the sample median, which is the middle value when the measurements are arranged in ascending (or descending) order. For a sample of even size n, the median is the mean of the two middle values. Which estimator is better, and in what sense? Different samples generally give different sample means and medians, and some of these values will be closer to the true parameter than others. Consequently, you cannot compare these two sample statistics, or any two sample statistics in general, on the basis of their performance with a single sample. Instead, you have to think in terms of repeated sampling and choose the estimator that does better on average. The key is to recognize that sample statistics are themselves random variables: each one has its own distribution over all possible samples of a given size, and this distribution is called its sampling distribution. In this unit, you will study the sampling distributions of several sample statistics. You will also see how the central limit theorem can be used to approximate sampling distributions in general.
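The idea of repeated sampling can be made concrete with a small simulation. The sketch below is only an illustration, not part of the unit's materials: it assumes a normal population with mean 10 and standard deviation 2, a sample size of 25, and 5,000 repeated samples, all of which are arbitrary choices. The function name `simulate_sampling_distributions` and its parameters are hypothetical. For each sample it records the sample mean and the sample median, then summarizes the center and spread of the two resulting collections, which approximate the two sampling distributions.

```python
import random
import statistics


def simulate_sampling_distributions(n=25, reps=5000, mu=10.0, sigma=2.0, seed=0):
    """Draw `reps` samples of size `n` from an assumed normal population
    and return the sample mean and sample median of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return means, medians


means, medians = simulate_sampling_distributions()

# Each list approximates the sampling distribution of one statistic.
# The center shows what the estimator gives on average; the spread
# shows how much it varies from one sample to the next.
for name, values in (("sample mean", means), ("sample median", medians)):
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    print(f"{name:>13}: center = {center:.3f}, spread = {spread:.3f}")
```

Under these assumptions, both statistics should center near the population mean of 10, and the sample mean should show the smaller spread, close to the theoretical value sigma / sqrt(n) = 2 / 5 = 0.4. That smaller spread is the sense in which the sample mean does better on average for a normal population. For other populations, such as heavy-tailed ones, the comparison can come out differently.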
Completing this unit should take you approximately 15 hours.
End of Unit Assessment