Convergence in distribution: what does it mean to take the limit of a sequence of CDFs?

Question

I'm somewhat confused about the definition of convergence of random variables in distribution.

The sequence of random variables $X_n$ converges in distribution to $X$ if $\lim\limits_{n \to \infty} F_n(x)=F(x)$ for every $x$ at which $F$ is continuous, where $F_n$ and $F$ are the CDFs of $X_n$ and $X$.

I don't understand what this limit means. What does it mean to take a limit of a large number of distributions? In the case of the law of large numbers it makes sense: you take the result of a large number of independent random variables, sum them, and then divide by $n$. But what does it mean to take the limit of CDFs like this? Any explanation would really be appreciated.

You have a bunch of CDFs, you're looking at all of them at some particular point $x$, and you ask that the values at $x$ converge to some value. Then you ask that this happens at every $x$. This could be the setting of the law of large numbers or some other setting like the central limit theorem. — Ian, Nov 08 '21 at 18:37
As a concrete example, if $G_n$ is the CDF of a binomial($n,1/2$) random variable $B_n$ and $F_n(x)=G_n \left ( \frac{n}{2} + x\sqrt{n/4} \right )$ is the CDF of the standardized variable $\frac{B_n-n/2}{\sqrt{n/4}}$, then $F_n(x)$ converges to the standard normal CDF $\Phi(x)$. — Ian, Nov 08 '21 at 18:39
@Ian "the values at x converge to some value", meaning, for example, at $x=1$ the average of all the values of the CDFs is equal to the value of $F(1)$? — fmtcs, Nov 08 '21 at 18:43
Not the average, just the limit of the values themselves. Each $F_n(x)$ is just a number. — Ian, Nov 08 '21 at 18:44
@Ian Alright, so at $F_1(x)$ there is a value for each $x$ that makes a CDF. Then $F_2(x)$ may be a little more like $F(x)$. By the time you get to $F_n$, if $F_n$ is the same as $F(x)$, then the sequence of random variables converges to $F(x)$? I guess I am just confused about when you would have a sequence of random variables that would converge like that, but I think an example would be the sequence of Poissons for each $\lambda$, and as lambda grows the CDF becomes that of the normal, am I right about that? — fmtcs, Nov 08 '21 at 19:03
They aren't necessarily going to be the same at any finite $n$, but that's the general idea, the values of the CDFs approach the values of the "target" CDF. However your Poisson case does not work; the Poisson approaches the normal for large $\lambda$ only after shifting and rescaling, similar to my binomial example. This is a really subtle thing about the central limit theorem. — Ian, Nov 08 '21 at 19:05
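
The pointwise picture described in these comments can be checked numerically. The following is a minimal Python sketch, not part of the original thread. It assumes the standardized variable $(B_n - n/2)/\sqrt{n/4}$ with $B_n \sim$ binomial($n,1/2$), evaluates its CDF at one fixed $x$ for growing $n$, and prints it next to $\Phi(x)$. All function names are illustrative.

```python
import math


def phi(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binom_cdf(k: int, n: int) -> float:
    """P(B <= k) for B ~ Binomial(n, 1/2), summed with exact integers."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(math.comb(n, j) for j in range(k + 1)) / 2 ** n


def standardized_binom_cdf(x: float, n: int) -> float:
    """F_n(x) = P((B_n - n/2) / sqrt(n/4) <= x) = P(B_n <= n/2 + x*sqrt(n/4))."""
    return binom_cdf(math.floor(n / 2 + x * math.sqrt(n / 4)), n)


if __name__ == "__main__":
    x = 1.0  # the fixed point at which all the CDFs are evaluated
    for n in (10, 100, 1000, 10000):
        # Each F_n(x) is just a number; the sequence of numbers approaches phi(x).
        print(n, standardized_binom_cdf(x, n), phi(x))
```

Each $F_n$ is a step function, so the gap to $\Phi(x)$ at a fixed $x$ need not shrink at every step, but it does tend to zero as $n$ grows.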

score 0 · Answer 1 · answered Nov 08 '21 at 21:12

By definition, it means that for every $x\in \mathbb{R}$ at which $F(x)=\mathbb{P}(X \leq x)$ is continuous, $$\mathbb{P}(X_n \leq x) \rightarrow \mathbb{P}(X \leq x) \quad \text{ as $n\rightarrow \infty $}.$$ More concretely, for each such $x$ and any $\epsilon >0$ there exists $N\in \mathbb{N}$ such that $$|\mathbb{P}(X_n \leq x) - \mathbb{P}(X \leq x)| < \epsilon \quad \text{ for all $n\geq N$.} \tag{1}$$
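
Strictly speaking, the convergence is usually required only at points where $F$ is continuous. A standard example, not taken from the original answer, shows why. Take the deterministic variables $X_n = 1/n$ and $X = 0$, so that $$F_n(x) = \begin{cases} 0, & x < 1/n \\ 1, & x \geq 1/n \end{cases} \qquad F(x) = \begin{cases} 0, & x < 0 \\ 1, & x \geq 0. \end{cases}$$ For every $x \neq 0$ we have $F_n(x) \rightarrow F(x)$: if $x<0$ both sides are $0$, and if $x>0$ then $F_n(x)=1$ as soon as $n \geq 1/x$. At $x=0$, however, $F_n(0)=0$ for every $n$ while $F(0)=1$. Since $X_n$ plainly approaches $X$, the definition excludes the single point $x=0$, which is the only discontinuity of $F$.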

This is very useful. For instance, if we want to compute $\mathbb{P}(a<X_n \leq b)$ for some large $n$, then we can note that \begin{align*} \mathbb{P}(a<X_n \leq b) &=F_n(b)-F_n(a) \\ &= F(b)-F(a) +\underbrace{(F_n(b)-F(b)) +(F(a)-F_n(a))}_{\text{error term}} \end{align*} where the error term can be made arbitrarily small because of $(1)$, provided $F$ is continuous at $a$ and $b$. For example, consider $X_n \sim \operatorname{Poisson}(n)$. It is well known that $\frac{X_n - n}{\sqrt{n}}$ converges in distribution towards $N(0,1)$, so for large $n$ we have that $$\mathbb{P}(a < \frac{X_n - n}{\sqrt{n}} \leq b) \approx \Phi(b)-\Phi(a) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{x^2}{2}} \: dx$$ Because the limit CDF $\Phi$ is continuous, this convergence is in fact uniform in $x$, so the approximation error is small uniformly in $a$ and $b$ and we can even conclude \begin{align*}\mathbb{P}(a < X_n \leq b) &= \mathbb{P}(\frac{a-n}{\sqrt{n}} < \frac{X_n - n}{\sqrt{n}}\leq \frac{b- n}{\sqrt{n}}) \\ &\approx \Phi(\frac{b-n}{\sqrt{n}}) - \Phi(\frac{a-n}{\sqrt{n}}) \\ &= \frac{1}{\sqrt{2\pi n}}\int_a^b e^{-\frac{(x-n)^2}{2n}} \: dx, \end{align*} which does mean that the $\operatorname{Poisson}(n)$ distribution is close to the $N(n,n)$ distribution for large $n$.
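
The approximation $\mathbb{P}(a < X_n \leq b) \approx \Phi(\frac{b-n}{\sqrt{n}}) - \Phi(\frac{a-n}{\sqrt{n}})$ can be tried out numerically. Below is a minimal Python sketch using only the standard library. The values of $n$, $a$ and $b$ are arbitrary choices for illustration, and the helper names are not from the answer.

```python
import math


def phi(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam); each pmf term is computed in log space."""
    if k < 0:
        return 0.0
    total = sum(
        math.exp(j * math.log(lam) - lam - math.lgamma(j + 1))
        for j in range(k + 1)
    )
    return min(1.0, total)  # guard against rounding slightly above 1


def poisson_interval(a: float, b: float, n: int) -> float:
    """Exact P(a < X_n <= b) for X_n ~ Poisson(n)."""
    return poisson_cdf(math.floor(b), n) - poisson_cdf(math.floor(a), n)


def normal_interval(a: float, b: float, n: int) -> float:
    """Normal approximation: Phi((b - n)/sqrt(n)) - Phi((a - n)/sqrt(n))."""
    s = math.sqrt(n)
    return phi((b - n) / s) - phi((a - n) / s)


if __name__ == "__main__":
    n, a, b = 400, 380, 420  # illustrative values: one standard deviation either side of n
    print("exact Poisson :", poisson_interval(a, b, n))
    print("N(n, n) approx:", normal_interval(a, b, n))
```

Note that $a$ and $b$ here sit near $n$. For fixed $a$ and $b$ and growing $n$, both numbers simply tend to zero.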

Without shifting and rescaling you don't really have the result you think you have, because without shifting and rescaling you basically just have $0 \approx 0$. Indeed without shifting and rescaling what you really get is just $\overline{X} \approx \mu$, and you need rescaling to expose the normal distribution as a correction to this approximation. — Ian, Nov 10 '21 at 04:03
@Ian I get what you are saying, and can see the hole in my logic, since clearly $\mathbb{P}(X_n \leq x) \rightarrow 0$ for $n\rightarrow \infty$ for all $x$. The fact that $N(n,n)$ is a good approximation for $\operatorname{Poisson}(n)$ does hold, however. To make sense of the claim you need to use the fact that the CDF of $\frac{X_n - n}{\sqrt{n}}$ converges not only pointwise but uniformly, which can be used to show that $$\sup_{a,b \in \mathbb{R}} | \mathbb{P}(a < X_n \leq b) - \frac{1}{\sqrt{2\pi n}} \int_a^b \exp(-\frac{(x-n)^2}{2n}) \: dx| \rightarrow 0 $$ for $n\rightarrow \infty$. — Leander Tilsted Kristensen, Nov 10 '21 at 10:01
see: https://math.stackexchange.com/questions/1670030/convergence-in-law-implies-uniform-convergence-of-cdfs — Leander Tilsted Kristensen, Nov 10 '21 at 10:06
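
Both points in this exchange can be illustrated numerically. The Python sketch below is an illustration only, with hypothetical helper names. It first estimates $\sup_x |\mathbb{P}(X_n \leq x) - \Phi(\frac{x-n}{\sqrt{n}})|$ for $X_n \sim \operatorname{Poisson}(n)$ by checking both sides of every jump up to $n + 10\sqrt{n}$ (a truncation assumed to capture essentially all of the mass), and then prints the unshifted value $\mathbb{P}(X_n \leq 5)$ for growing $n$. The supremum over intervals $(a,b]$ in the comment is at most twice this CDF distance.

```python
import math


def phi(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1)) if k >= 0 else 0.0


def kolmogorov_distance(n: int) -> float:
    """Approximate sup_x |F_n(x) - Phi((x - n)/sqrt(n))| for Poisson(n)."""
    s = math.sqrt(n)
    upper = int(n + 10 * s) + 1  # truncation point (an assumption of this sketch)
    cdf, worst = 0.0, 0.0
    for k in range(upper + 1):
        g = phi((k - n) / s)
        worst = max(worst, abs(cdf - g))  # just below the jump at k
        cdf += poisson_pmf(k, n)
        worst = max(worst, abs(cdf - g))  # at the jump itself
    return worst


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print("sup distance, n =", n, ":", kolmogorov_distance(n))
    for n in (1, 10, 100):
        # Without shifting and rescaling, the CDF at a fixed x just tends to 0.
        print("P(X_n <= 5), n =", n, ":", poisson_cdf(5, n))
```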
