Stationary distribution defined using Riemann-Stieltjes integrals

Question

Apologies for a long post ahead. I encountered a theorem from a 1975 paper (Theorem 1) on the existence of a unique stationary limiting distribution, defined using a sequence of monotone non-decreasing random maps on the unit interval:

$$ F_0(x) = \int_0^1 H_y(x)\,dF_0(y). $$

If my understanding is correct, the sequence of distribution functions $H_y^n$ converges in distribution to $F_0$ (weak convergence). The distribution function is defined as $H_y^n(x) = P(X_n(y) \leq x)$ for $x,y \in [0,1]$, where $X_n(y) := f_{\alpha_1}(f_{\alpha_2}(\dots f_{\alpha_n}(y)))$, each $f_{\alpha_i} : [0,1] \rightarrow [0,1]$ is monotone non-decreasing, and $\{\alpha_i\}$ is an i.i.d. sequence of indices from an arbitrary index set. The integrand $H_y$ (without superscript $n$) is just the distribution function of $f_{\alpha}(y)$ for a single monotone non-decreasing map $f_{\alpha}:[0,1]\rightarrow [0,1]$, where $\alpha$ is drawn from the same index set. The limiting distribution $F_0(x)$ is defined as $\lim_{n\rightarrow \infty}P(X_n(0) \leq x)$.
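To make the definitions above concrete, here is a minimal simulation sketch. The two maps $y \mapsto y/2$ and $y \mapsto (y+1)/2$, each picked with probability 1/2, are a hypothetical example family chosen for illustration and are not taken from the paper; the function names are likewise my own. For this particular family the limit $F_0$ is the uniform distribution on $[0,1]$, so both printed estimates should land near $x$.

```python
import random

# Hypothetical example family (not from the paper): two monotone
# non-decreasing maps on [0, 1], each chosen with probability 1/2.
MAPS = [lambda y: y / 2.0, lambda y: (y + 1.0) / 2.0]


def sample_X_n(y, n, rng):
    """Draw one sample of X_n(y) = f_{a_1}(f_{a_2}(...f_{a_n}(y)))."""
    indices = [rng.randrange(len(MAPS)) for _ in range(n)]  # a_1..a_n, i.i.d.
    value = y
    for a in reversed(indices):  # the innermost map f_{a_n} is applied first
        value = MAPS[a](value)
    return value


def empirical_cdf(samples, x):
    """Fraction of samples <= x, an estimate of P(X <= x)."""
    return sum(1 for s in samples if s <= x) / len(samples)


def one_step_cdf(samples, x, rng):
    """Monte Carlo estimate of the integral of H_y(x) dF_0(y):
    push each sample y (approximately F_0-distributed) through one
    fresh random map and take the empirical CDF of the results."""
    pushed = [MAPS[rng.randrange(len(MAPS))](y) for y in samples]
    return empirical_cdf(pushed, x)


rng = random.Random(0)
samples = [sample_X_n(0.0, 30, rng) for _ in range(20000)]
for x in (0.25, 0.5, 0.75):
    lhs = empirical_cdf(samples, x)      # estimate of F_0(x)
    rhs = one_step_cdf(samples, x, rng)  # estimate of the integral
    print(x, round(lhs, 3), round(rhs, 3))
```

The two columns agreeing (up to Monte Carlo noise) is what the integral identity says: applying one more independent random map to a draw from $F_0$ gives back a draw from $F_0$.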

This theorem is proved using the Helly-Bray Lemma and the integral above is a Riemann-Stieltjes integral, but I think I lack the fundamental understanding of how this "definition" of a stationary distribution even works, since most of the sources I can find on stationary distribution are in the context of Markov chains, and those discussions seem completely different from this. Could anyone direct me to a good source on studying this? I found some nice material on Riemann-Stieltjes integrals but can't connect the dots to understand this theorem.

dandar · Answer 1 · 2021-04-26T23:24:01.960

Tentative answer

@user594147: I am not at all sure, but in the absence of any answers I will put forward a few thoughts and, as you ask, a reference which might help. I would be absolutely fine with somebody more expert correcting my "answer".

Page 363, Patrick Billingsley, Probability and Measure (section "Dependent variables") looks close to a good start. He says if the law for $(X_{n},X_{n+1},...,X_{n+j})$ does not depend on $n$ then the sequence of random variables is defined to be stationary. Thus the law only depends on $j$ (the number of random variables in the vector) and not where in the sequence the $j$ variables are taken from. But perhaps you knew this and your question pertained to the inner workings of the maths?

The $\alpha$'s in your question appear in Billingsley as follows. For $k\geq1$ and $n\geq 1$ let $A\in\sigma(X_{1},...,X_{k}):=\mathscr{X}_{k}$ and $B\in\sigma(X_{k+n},X_{k+n+1},...):=\mathscr{X}_{k+n}$ (these are the sigma fields generated by the two random vectors and I have added my own notation for them). Let $\alpha_{n}$ be a number such that

$$\left|P(A\cap B)-P(A)P(B)\right|\leq\alpha_{n}$$

If $\alpha_{n}\rightarrow 0$ then the above satisfies the definition of independent $\sigma$-fields, that is $\mathscr{X}_{k}$ is independent of $\mathscr{X}_{k+n}$ for large enough $n$. In this case the distribution of the vector $(X_{k+n},X_{k+n+1},...)$ sort of "forgets" where it started from in the sense of being independent from the vector $(X_{1},...,X_{k})$. So here is perhaps a more measure-theoretic definition of stationary sequences, i.e. asymptotic independence?

If the asymptotic independence holds (I will continue to call it this) Billingsley describes the sequence $\{X_{1},X_{2},...\}$ as $\alpha$-mixing. I am guessing that your functions $f_{\alpha_{1}},f_{\alpha_{2}},..$ etc. might relate to monotonically increasing $\sigma$-fields since $\mathscr{X}_{k}\subseteq\mathscr{X}_{k+1}\subseteq\mathscr{X}_{k+2},...$ and $\mathscr{X}_{k+n}\subseteq\mathscr{X}_{k+n+1}\subseteq\mathscr{X}_{k+n+2},...$ satisfy this.

Update

It is a shame nobody more knowledgeable than me has stepped forward to answer your question, so I will give my final thoughts on this (for now), and for what this is worth just in case this helps you do further research.

The link between stationary processes (stochastic processes with stationary distributions in the sense discussed as per the Billingsley definition) and ergodic processes (stochastic processes whose averages over a sample path converge to the expectation of the random path if the observed path is long enough) seems to be relevant; see https://dsp.stackexchange.com/questions/1167/what-is-the-distinction-between-ergodic-and-stationary for an excellent discussion. I say this because ergodic processes and measure-preserving maps seem to be linked (see https://link.springer.com/chapter/10.1007%2F978-1-4899-0018-0_6), and the notation used on page 606 of the paper just "looks like" the recursive definition of repeated applications of a measure-preserving map, although this is not mentioned. Furthermore, the reference at http://www.columbia.edu/~ks20/6712-14/6712-14-Notes-Ergodic.pdf about ergodic processes and the link to stationary processes is good. It also mentions mixtures of two distributions and how this represents (or not) processes that are both stationary and ergodic (this type of thing seemed to come up in Choquet's theorem). So... lots of interesting things that all look linked to me and are candidates for being relevant to your problem, but I lack the knowledge to pull this together for you. Maybe I am over-complicating this! Do post any updates if you find out more.
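For what it is worth, here is one way to read the integral identity from the question, offered as a sketch under stated assumptions rather than as the paper's own argument. Write $X'_n(y)$ (my notation, not the paper's) for the composition built from $\alpha_2,\dots,\alpha_{n+1}$, so that $X_{n+1}(y) = f_{\alpha_1}(X'_n(y))$, where $\alpha_1$ is independent of $X'_n(y)$ and $X'_n(y)$ has the same law as $X_n(y)$. Conditioning on $X'_n(y) = z$ gives

$$ H^{n+1}_y(x) = P\big(f_{\alpha_1}(X'_n(y)) \leq x\big) = \int_0^1 H_z(x)\,dH^n_y(z). $$

Setting $y = 0$ and letting $n \rightarrow \infty$ formally (presumably this passage to the limit is what the Helly-Bray Lemma is used to justify, under the paper's hypotheses) yields

$$ F_0(x) = \int_0^1 H_z(x)\,dF_0(z). $$

Read this way, the identity says: if $Y$ has distribution $F_0$ and $\alpha$ is drawn independently of $Y$, then $f_{\alpha}(Y)$ again has distribution $F_0$. In Markov chain language, $K(y,[0,x]) := H_y(x)$ (again my notation) plays the role of the transition kernel, and the identity is the distribution-function form of the usual invariance condition $\pi = \pi K$.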

Thanks very much for your reply, but I don't think this case, especially the interpretations of the $\alpha_n$, is relevant to the context of my question. The $\alpha_i$'s in my question are indices, which form a sequence of i.i.d. elements taken from this arbitrary index set $\mathcal{A}$. (continued below) — user594147, Apr 22 '21 at 04:16
So you can think of the sequence $\{X_n\}$ as a sequence of i.i.d. random non-decreasing maps on the unit interval, sort of an autonomous stochastic process if it helps (my background is in systems control so that's the language I'd use) — user594147, Apr 22 '21 at 04:18
What you said at the beginning about "the law for $(X_n,X_{n+1},...,X_{n+j})$ does not depend on $n$" reminds me of a related question: a particular case is if $j=0$, then we can say that every random variable in the sequence of random variables $\{X_n\}$ has the same distribution function $F_X$, thus $F_X$ is a stationary distribution? — user594147, Apr 22 '21 at 04:19
Ahh I see I was wrong about the sigma-fields (I have now looked at the paper you gave). In the same Billingsley reference (p 336) he has something called Helly's Theorem, and in another Billingsley reference (Convergence of Probability Measures) he refers to that same theorem as Helly's selection theorem. When I searched online for this I came across https://math.stackexchange.com/questions/397931/hellys-selection-theorem, which appears vaguely relevant to the proof in your paper? — dandar, Apr 22 '21 at 08:05
The Helly-Bray Lemma comes from Probability Theory I by Loeve, p. 182 Section 11.3. I think that part isn't giving me so much trouble, but if you saw the paper I referenced, the question I'd like to have answered is what exactly is the integral in Theorem 1, where it said "we would like to show that $F_0(x) = \int_0^1 H_y(x)dF_0(y)$". Why does it mean that $F_0$ is a stationary distribution if this integral holds? This is a very different "definition" (if it can be treated as one) from the stationary distributions you mentioned above. — user594147, Apr 22 '21 at 22:01
@user594147 I see your point. It is difficult to connect this to the standard definitions. I did some searching for stationarity and probability measures and, as you say, I keep getting Markov processes. I did find a mention here https://mathoverflow.net/questions/142617/reference-request-stationary-measures-as-convex-combinations-of-ergodic-measure that invariant measures are stationary distributions in some settings. But what caught my eye is the question posed: are stationary distributions representable in some way only by their extreme points? — dandar, Apr 23 '21 at 17:49
The Theorem you mention is only concerned with extreme points $y=0$ and $y=1$ - i.e. the integrals represent eq 2.1 and 2.2. Maybe it is this kind of theory the paper is rooted in? The aforementioned web page mentions Choquet's theorem, and this reference looks to be a good introduction (although the maths looks hard, the commentary makes interesting reading) — dandar, Apr 23 '21 at 17:54
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwizsaOl9ZTwAhUHTRUIHYkdB0gQFjABegQIBhAD&url=https%3A%2F%2Fwww.stat.berkeley.edu%2F~yassine.el-maazouz%2Fcontent%2Fdocuments%2FChoquet_Theorem.pdf&usg=AOvVaw0pT6krq8b21X9KS5Ue4rNO — dandar, Apr 23 '21 at 17:54
Thanks so much for the followups, and sorry it took me a while to reply. The discussions and the document you linked indeed seem beyond my very limited knowledge, but I'll skim through them and see if I can understand them on the big picture level. On a related note, if you have access to Handbook of Stochastic Analysis and Applications, in the first chapter by R. Bhattacharya, Equation 1.1.2 in the introduction regarding the invariant probability seems similar to what I'm looking for here... — user594147, Apr 25 '21 at 06:00
