Cumulative Distribution Function

The PMF is one way to describe the distribution of a discrete random variable. As we will see later on, a PMF cannot be defined for continuous random variables. The cumulative distribution function (CDF) is another way to describe the distribution of a random variable. Its advantage is that it can be defined for any kind of random variable (discrete, continuous, and mixed) [Pishro-Nik, 2014].

The takeaway is that the CDF describes a distribution just as the PMF does, but more generally. In particular, a PMF is not available for continuous random variables, so we use the CDF instead.

Definition#

Definition 81 (Cumulative Distribution Function)

Let \(X\) be a discrete random variable with \(\S = \lset \xi_1, \xi_2, \ldots \rset\) where \(\xi_i \in \R\) for all \(i\). Note that \(X(\xi_i) = x_i\) for all \(i\) where \(x_i\) is a state of \(X\), and assume the states are indexed in increasing order, \(x_1 < x_2 < \cdots\).

Then the cumulative distribution function \(\cdf\) is defined as

(176)#\[ \cdf(x_k) \overset{\text{def}}{=} \P \lsq X \leq x_k \rsq = \sum_{\ell=1}^k \P \lsq X = x_{\ell} \rsq = \sum_{\ell=1}^k \pmf(x_{\ell}) \]

Since \(\P \lsq X = x_{\ell} \rsq\) is, by definition, the probability mass function evaluated at \(x_{\ell}\), the last equality simply rewrites each term as \(\pmf(x_{\ell})\).
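The definition can be turned into a few lines of code. The sketch below is a minimal illustration, not a reference implementation: the helper name `cdf_from_pmf` and the fair-die PMF are assumptions introduced here, and the PMF is stored as a plain dictionary from states to probabilities. It sums the PMF over every state that does not exceed the query point, exactly as in the defining sum.

```python
from fractions import Fraction


def cdf_from_pmf(pmf, x):
    """Return P[X <= x] by summing the PMF over all states not exceeding x."""
    return sum((p for state, p in pmf.items() if state <= x), Fraction(0))


# Hypothetical PMF of a fair six-sided die, used only for illustration.
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

print(cdf_from_pmf(die_pmf, 0))    # 0   (no state is <= 0)
print(cdf_from_pmf(die_pmf, 3))    # 1/2
print(cdf_from_pmf(die_pmf, 3.5))  # 1/2 (same as at 3: no state lies in (3, 3.5])
print(cdf_from_pmf(die_pmf, 6))    # 1
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to see that the function works for any real query point and not only at the states themselves.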

Example 29 (CDF)

Consider a random variable \(X\) with the following probability mass function:

\[\begin{split} \pmf(x) = \begin{cases} \frac{1}{4} & \text{if } x = 0 \\ \frac{1}{2} & \text{if } x = 1 \\ \frac{1}{4} & \text{if } x = 4 \\ \end{cases} \end{split}\]

Then, by Definition 81, the CDF of \(X\) at each state is computed as:

\[\begin{split} \begin{align} \cdf(0) & = \P \lsq X \leq 0 \rsq = \P \lsq X = 0 \rsq = \frac{1}{4} \\ \cdf(1) & = \P \lsq X \leq 1 \rsq = \P \lsq X = 0 \rsq + \P \lsq X = 1 \rsq = \frac{1}{4} + \frac{1}{2} = \frac{3}{4} \\ \cdf(4) & = \P \lsq X \leq 4 \rsq = \P \lsq X = 0 \rsq + \P \lsq X = 1 \rsq + \P \lsq X = 4 \rsq = \frac{1}{4} + \frac{1}{2} + \frac{1}{4} = 1 \end{align} \end{split}\]

Thus, our CDF is given by:

\[\begin{split} \cdf(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{4} & \text{if } 0 \leq x < 1 \\ \frac{3}{4} & \text{if } 1 \leq x < 4 \\ 1 & \text{if } x \geq 4 \end{cases} \end{split}\]
import numpy as np
import matplotlib.pyplot as plt

# PMF of X from Example 29 and the CDF values at each state
x = np.array([0, 1, 4])
p = np.array([0.25, 0.5, 0.25])
F = np.cumsum(p)

# Plot the PMF and the CDF side by side
fig, ax = plt.subplots(1, 2, sharex=False, sharey=False, figsize=(10, 5))

ax[0].stem(x, p)
ax[0].set_ylim(0, 1.05)
ax[0].set_title("PMF")
ax[0].set_xlabel("x")
ax[0].set_ylabel("Probability")
ax[0].grid(False)

# Pad with F = 0 left of the first state and F = 1 right of the last state.
# where="post" holds each value until the next state, so the steps are
# right-continuous (the default where="pre" would draw the jumps one state late).
x_pad = np.concatenate(([x[0] - 1], x, [x[-1] + 1]))
F_pad = np.concatenate(([0.0], F, [F[-1]]))
ax[1].step(x_pad, F_pad, where="post")
ax[1].set_ylim(0, 1.05)
ax[1].set_title("CDF")
ax[1].set_xlabel("x")
ax[1].set_ylabel("Probability")
ax[1].grid(False)

plt.show()
Figure: PMF (left panel) and CDF (right panel) of \(X\) from Example 29.
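To evaluate the CDF of Example 29 at points that are not states, one option is a lookup into the cumulative sums. The sketch below assumes the states are sorted in increasing order; the names `states`, `probs`, `cum`, and `cdf` are illustrative choices, not part of the original example.

```python
import numpy as np

states = np.array([0, 1, 4])           # support of X in Example 29
probs = np.array([0.25, 0.5, 0.25])    # PMF values at those states

# cum[i] is the total probability of the first i states, with cum[0] = 0.
cum = np.concatenate(([0.0], np.cumsum(probs)))


def cdf(x):
    # Count the states <= x; side="right" makes a state count at x itself.
    idx = np.searchsorted(states, x, side="right")
    return cum[idx]


query = np.array([-1, 0, 0.5, 1, 3, 4, 10])
print(cdf(query))  # values: 0, 0.25, 0.25, 0.75, 0.75, 1, 1
```

The CDF is 0 to the left of the smallest state, stays flat between states, and jumps by the PMF value at each state.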

Properties#

Theorem 18 (Properties of CDF)

Let \(X\) be a discrete random variable with \(\S = \lset \xi_1, \xi_2, \ldots \rset\) where \(\xi_i \in \R\) for all \(i\). Then, the CDF \(\cdf\) of \(X\) satisfies the following properties:

  1. The CDF is a staircase function and is non-decreasing. That is, for any real numbers \(a \leq b\), we have

    \[ \cdf(a) \leq \cdf(b) \]
  2. The CDF is a probability, so it is bounded between 0 and 1.

    \[ 0 \leq \cdf(x) \leq 1 \]

    In particular, \(\cdf(x) \to 0\) as \(x \to -\infty\) and \(\cdf(x) \to 1\) as \(x \to \infty\).

  3. The CDF is right continuous. That is, \(\lim_{h \to 0^+} \cdf(x + h) = \cdf(x)\) for every \(x\).
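These properties can be checked numerically on the PMF from Example 29. The snippet below is a rough sanity check under stated assumptions, not a proof: the grid, the offset `eps`, and the helper `cdf` are choices made here for illustration.

```python
import numpy as np

states = np.array([0, 1, 4])
probs = np.array([0.25, 0.5, 0.25])
cum = np.concatenate(([0.0], np.cumsum(probs)))


def cdf(x):
    # P[X <= x] via the number of states that do not exceed x.
    return cum[np.searchsorted(states, x, side="right")]


grid = np.linspace(-5, 10, 1501)
values = cdf(grid)

# Property 1: non-decreasing along the grid.
is_nondecreasing = bool(np.all(np.diff(values) >= 0))

# Property 2: bounded in [0, 1], with 0 and 1 reached far out in the tails.
is_bounded = bool(np.all((values >= 0) & (values <= 1)))
tails_ok = bool(cdf(-1e9) == 0.0 and cdf(1e9) == 1.0)

# Property 3: approaching a state from the right leaves the value unchanged,
# while approaching from the left misses the jump at that state.
eps = 1e-9
right_continuous = bool(np.allclose(cdf(states + eps), cdf(states)))
jumps_from_left = cdf(states) - cdf(states - eps)

print(is_nondecreasing, is_bounded, tails_ok, right_continuous)
print(jumps_from_left)  # equals the PMF values 0.25, 0.5, 0.25
```

The left-hand jumps equal the PMF values, which is the same relationship that the conversion between PMF and CDF relies on.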

PMF and CDF Conversion#

Theorem 19 (PMF and CDF Conversion)

Let \(X\) be a discrete random variable with \(\S = \lset \xi_1, \xi_2, \ldots \rset\) where \(\xi_i \in \R\) for all \(i\). Note that \(X(\xi_i) = x_i\) for all \(i\) where \(x_i\) is the state of \(X\). Then, the PMF of \(X\) can be obtained from the CDF by

(177)#\[ \pmf(x_k) = \cdf(x_k) - \cdf(x_{k-1}) \]

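where the states are indexed in increasing order, \(x_1 < x_2 < \cdots\), and we use the convention \(\cdf(x_0) = 0\) so that the formula also covers \(k = 1\).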
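As a quick check of the conversion, the sketch below rebuilds the PMF of Example 29 from its CDF values by taking successive differences. It assumes the states are sorted in increasing order and treats the CDF value before the first state as 0, which is needed for the case \(k = 1\); the variable names are illustrative.

```python
import numpy as np

probs = np.array([0.25, 0.5, 0.25])  # PMF of Example 29 at states 0, 1, 4

# PMF -> CDF: running sums give the CDF at each state.
F = np.cumsum(probs)

# CDF -> PMF: successive differences, with 0 prepended as the value
# before the first state.
recovered = np.diff(F, prepend=0.0)

print(F)          # values: 0.25, 0.75, 1
print(recovered)  # values: 0.25, 0.5, 0.25
```

The round trip returns the original PMF, so `np.cumsum` and `np.diff` act as inverse operations here.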