Concept#
Sample Average#
Definition 168 (Sample Average)
The law of large numbers is a probabilistic statement about the sample average. Suppose that we have a collection of i.i.d. random variables \(X_1, \ldots, X_N\). The sample average of these \(N\) random variables is defined as
\[
\bar{X} = \frac{1}{N} \sum_{n=1}^{N} X_n .
\]
Theorem 68 (Expectation of Sample Average)
In itself, the sample average \(\bar{X}\) is a random variable. Therefore, we can compute its expectation.
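If the random variables \(X_1, \ldots, X_N\) are i.i.d. so that they have the same population mean \(\mathbb{E}\left[X_n\right]=\mu\) (for \(n=1, \ldots, N\)), then by the linearity of the expectation,
\[
\mathbb{E}\left[\bar{X}\right] = \frac{1}{N} \sum_{n=1}^{N} \mathbb{E}\left[X_n\right] = \frac{1}{N} \cdot N \mu = \mu .
\]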
Thus, the expectation of the sample average is the same as the expectation of the population if the random variables are i.i.d.
Theorem 69 (Variance of Sample Average)
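As with any random variable, we can quantify the uncertainty of the sample average by computing its variance.
If \(X_{1}, \ldots, X_{N}\) are i.i.d. random variables with the same variance \(\operatorname{Var}\left[X_{n}\right]=\sigma^{2}\) (for \(n=1, \ldots, N\)), then
\[
\operatorname{Var}\left[\bar{X}\right] = \frac{1}{N^{2}} \sum_{n=1}^{N} \operatorname{Var}\left[X_{n}\right] = \frac{1}{N^{2}} \cdot N \sigma^{2} = \frac{\sigma^{2}}{N},
\]
where the first equality uses the independence of the \(X_n\).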
We easily see that as \(N \rightarrow \infty\), the variance of the sample average goes to zero.
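The two results above can be checked numerically. The following is a minimal sketch, assuming normally distributed draws and illustrative values for \(\mu\), \(\sigma\), and \(N\); the function name `sample_average_moments` is hypothetical and not part of the original text.

```python
import numpy as np


def sample_average_moments(mu, sigma, n, trials, seed=0):
    """Empirical mean and variance of the sample average of n i.i.d. normal draws."""
    rng = np.random.default_rng(seed)
    # Each row is one realization of (X_1, ..., X_n).
    draws = rng.normal(loc=mu, scale=sigma, size=(trials, n))
    # One sample average per row.
    xbar = draws.mean(axis=1)
    return float(xbar.mean()), float(xbar.var())


# Illustrative values (assumptions, not taken from the text).
mu, sigma, n = 2.0, 3.0, 50
emp_mean, emp_var = sample_average_moments(mu, sigma, n, trials=20_000)
print(f"empirical E[X_bar]   = {emp_mean:.4f} (theory: {mu})")
print(f"empirical Var[X_bar] = {emp_var:.4f} (theory: {sigma**2 / n:.4f})")
```

With these settings the empirical mean should sit close to \(\mu\) and the empirical variance close to \(\sigma^2 / N\), up to simulation noise.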
Example of Convergence#
In the previous section, we saw that as \(N\) grows, the variance of the sample average goes to zero. In other words, as \(N\) increases, the sample average deviates less and less from the population mean. The accompanying notebook demonstrates this.
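A minimal sketch of such a demonstration, assuming Bernoulli(\(p\)) draws with \(p = 0.5\) and an arbitrary choice of sample sizes, is given below. It measures the spread of the sample average across many repeated experiments for each \(N\).

```python
import numpy as np

rng = np.random.default_rng(42)
p = 0.5          # population mean of a Bernoulli(p) variable (assumed value)
trials = 5_000   # number of independent sample averages per N

spreads = {}
for n in (10, 100, 1_000, 10_000):
    # The sum of n i.i.d. Bernoulli(p) draws is Binomial(n, p),
    # so dividing by n gives one realization of the sample average.
    xbar = rng.binomial(n, p, size=trials) / n
    spreads[n] = float(xbar.std())
    theory = np.sqrt(p * (1 - p) / n)
    print(f"N={n:>6}: std of sample average = {spreads[n]:.4f} (theory: {theory:.4f})")
```

The printed standard deviations should shrink roughly like \(1/\sqrt{N}\), which is the square root of the variance \(\sigma^2 / N\).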
Weak Law of Large Numbers#
Theorem 70 (Weak Law of Large Numbers)
Let \(X_1, \ldots, X_N\) be i.i.d. random variables with common mean \(\mu\) and variance \(\sigma^2\); that is, each \(X_n\) follows the same probability distribution \(\mathbb{P}\).
Let \(\bar{X}\) be the sample average defined in (184), and assume that \(\mathbb{E}[X_n^2] < \infty\).
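Then, for any \(\epsilon > 0\), we have
\[
\lim_{N \rightarrow \infty} \mathbb{P}\left[\left|\bar{X} - \mu\right| > \epsilon\right] = 0 .
\]
This means that
\[
\bar{X} \overset{p}{\longrightarrow} \mu \quad \text{as } N \rightarrow \infty ,
\]
that is, the sample average converges in probability to the population mean.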
In other words, as the sample size \(N\) grows, the probability that the sample average \(\bar{X}\) differs from the population mean \(\mu\) by more than \(\epsilon\) approaches zero. Note that the limit applies to the probability of a large deviation, not to the deviation itself: the theorem does not say that \(|\bar{X} - \mu| \leq \epsilon\) holds with certainty for any finite \(N\). In layman's terms, as \(N\) grows, it becomes increasingly unlikely that the sample average is more than \(\epsilon\) away from the population mean. This seems strong since \(\epsilon\) can be arbitrarily small, but it remains a statement about probabilities.
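To make the statement concrete, the sketch below estimates \(\mathbb{P}[|\bar{X} - \mu| > \epsilon]\) by simulation and compares it with the Chebyshev bound \(\sigma^2 / (N \epsilon^2)\), which is the standard inequality used to prove the weak law. The Uniform(0, 1) distribution, the value of \(\epsilon\), and the helper name `deviation_probability` are assumptions chosen for illustration.

```python
import numpy as np


def deviation_probability(n, eps, trials=5_000, seed=0):
    """Estimate P(|X_bar - mu| > eps) for n i.i.d. Uniform(0, 1) draws."""
    rng = np.random.default_rng(seed)
    mu = 0.5  # mean of Uniform(0, 1)
    # Each row is one experiment; its mean is one sample average.
    xbar = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
    # Fraction of experiments whose sample average is more than eps from mu.
    return float(np.mean(np.abs(xbar - mu) > eps))


eps = 0.05
sigma2 = 1.0 / 12.0  # variance of Uniform(0, 1)
probs, bounds = {}, {}
for n in (10, 100, 1_000):
    probs[n] = deviation_probability(n, eps)
    # Chebyshev bound, capped at 1 because it bounds a probability.
    bounds[n] = min(1.0, sigma2 / (n * eps**2))
    print(f"N={n:>5}: estimated probability = {probs[n]:.4f}, "
          f"Chebyshev bound = {bounds[n]:.4f}")
```

The estimated probability falls toward zero as \(N\) increases, and it stays below the Chebyshev bound, which is typically loose.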
Strong Law of Large Numbers#
Theorem 71 (Strong Law of Large Numbers)
Let \(X_1, \ldots, X_N\) be i.i.d. random variables with common mean \(\mu\) and variance \(\sigma^2\).
Let \(\bar{X}\) be the sample average defined in (184), and assume that \(\mathbb{E}[X_n^4] < \infty\).
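Then, we have
\[
\mathbb{P}\left[\lim_{N \rightarrow \infty} \bar{X} = \mu\right] = 1 .
\]
This means that
\[
\bar{X} \overset{a.s.}{\longrightarrow} \mu \quad \text{as } N \rightarrow \infty ,
\]
that is, the sample average converges almost surely to the population mean.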
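Almost sure convergence is a statement about individual sample paths, so one way to build intuition is to follow the running average along a single long sequence of draws. The sketch below assumes rolls of a fair six-sided die (\(\mu = 3.5\)). A finite simulation can only illustrate the theorem, not verify it.

```python
import numpy as np

rng = np.random.default_rng(7)
n_max = 100_000

# One sample path: n_max rolls of a fair six-sided die (population mean 3.5).
rolls = rng.integers(1, 7, size=n_max)

# running_avg[k - 1] is the sample average of the first k rolls.
running_avg = np.cumsum(rolls) / np.arange(1, n_max + 1)

for n in (10, 1_000, 100_000):
    print(f"N={n:>7}: running average = {running_avg[n - 1]:.4f}")
```

Along this one path, the running average should settle near 3.5 as more rolls are included.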
Further Readings#
- Chan, Stanley H. “Chapter 6.3. Law of Large Numbers.” In Introduction to Probability for Data Science. Ann Arbor, Michigan: Michigan Publishing Services, 2021. 
- Pishro-Nik, Hossein. “Chapter 7.1.1. Law of Large Numbers.” In Introduction to Probability, Statistics, and Random Processes. Kappa Research, 2014. 
