Unit 3: Sampling Distributions
The concept of a sampling distribution lies at the very foundation of statistical inference, and it is best introduced with an example. Suppose you want to estimate a population parameter, say the population mean. There are two natural estimators: 1. the sample mean, which is the average value of the data set; and 2. the sample median, which is the middle number when the measurements are arranged in ascending (or descending) order. In particular, for a sample of even size n, the median is the mean of the middle two numbers. But which one is better, and in what sense? Answering this question requires thinking about repeated sampling: you want to choose the estimator that would do better on average across many samples.
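The idea of judging an estimator by its average performance over repeated samples can be sketched with a small simulation. This is only an illustration under stated assumptions: the population is taken to be normal with mean 10 and standard deviation 2, "better" is measured by mean squared error, and the function name compare_estimators and all parameter values are hypothetical choices, not part of the course material.

```python
import random
import statistics

def compare_estimators(n=25, repetitions=5000, mu=10.0, sigma=2.0, seed=0):
    """Estimate the mean squared error of the sample mean and the sample median.

    Assumption for illustration: the population is normal(mu, sigma).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    mean_sq_err = 0.0
    median_sq_err = 0.0
    for _ in range(repetitions):
        # Draw one sample of size n from the assumed population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # Record how far each estimator lands from the true mean.
        mean_sq_err += (statistics.fmean(sample) - mu) ** 2
        median_sq_err += (statistics.median(sample) - mu) ** 2
    return mean_sq_err / repetitions, median_sq_err / repetitions

mse_mean, mse_median = compare_estimators()
print(f"MSE of sample mean:   {mse_mean:.4f}")
print(f"MSE of sample median: {mse_median:.4f}")
```

For a normal population, the sample mean should show the smaller mean squared error. The ranking can change for other populations (for example, heavy-tailed ones), which is why the comparison has to be made over many samples rather than a single one.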
Different samples may give different sample means and medians, and some will be closer to the truth than others. Consequently, you cannot compare these two sample statistics, or any others, based on their performance with a single sample. Instead, you should recognize that sample statistics are random variables; each one therefore has its own distribution, obtained by considering all possible samples of a given size. This distribution is called the sampling distribution of the statistic. In this unit, you will study the sampling distributions of several sample statistics and see how the central limit theorem can help to approximate sampling distributions in general.
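As a rough illustration of how the central limit theorem approximates a sampling distribution, the sketch below draws many samples from a skewed population and looks at the resulting sample means. The exponential population with mean 1, the sample size of 40, and the number of repetitions are assumptions chosen for the example; they do not come from the course material.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is reproducible
n = 40                   # size of each sample (assumed)
repetitions = 4000       # number of repeated samples (assumed)
pop_mean = 1.0           # mean of the exponential(1) population
pop_sd = 1.0             # standard deviation of the exponential(1) population

# Build an empirical sampling distribution of the sample mean.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

# The central limit theorem suggests the sample mean is approximately
# normal with mean pop_mean and standard deviation pop_sd / sqrt(n).
std_error = pop_sd / math.sqrt(n)
center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

# Under the normal approximation, about 95% of sample means should fall
# within 1.96 standard errors of the population mean.
coverage = sum(abs(m - pop_mean) <= 1.96 * std_error for m in sample_means) / repetitions

print(f"mean of sample means: {center:.3f} (theory: {pop_mean:.3f})")
print(f"sd of sample means:   {spread:.3f} (theory: {std_error:.3f})")
print(f"share within 1.96 standard errors: {coverage:.3f}")
```

Even though the population here is strongly skewed, the simulated center, spread, and coverage should land close to the values predicted by the normal approximation.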
Completing this unit should take you approximately 2 hours.