Probability that the sum of $n$ unit vectors (2D dimention) has length less than one

Question

It is easy to show that the probability that the length of the sum of two random unit vector in a 2D-plane is less than one is $\frac13$. As the picture above, assume the first vector is $\bar{AB}$ (by symmetricity, the direction of first vector could be any), and the second vector is any radius of circle B. So only when the head of the 2nd vector is in the red arc, the length of sum vector is less than 1. So the probability is $\frac13$.

Using a computer, it seems that the probability of the length of the sum of n random unit vectors being less than one is $\frac1{n+1}$.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
int n=5;
int m=10000;
int main(int argc, char argv[]){
        if(argc>=2){
            n=atoi(argv[1]);
        }
        if(argc>=3){
                m=atoi(argv[2]);
        }
        srand(time(NULL));
        printf("Test %d jump (total %d times)\n",n,m);
        int i,j;
        int c=0;
        for(i=0;i<m;i++){
                double sx=0.0,sy=0.0;
                for(j=0;j<n;j++){
                        double t=rand();
                        sx+=cos(t);
                        sy+=sin(t);
                }
                if(sxsx+sy*sy<=1.0){
                        c++;
                }
        }
        printf("%d times in circle (ratio %f)\n", c, (double)c/m);
        return 0;
}
$ ./sa 5 1000000
Test 5 jump (total 1000000 times)
166728 times in circle (ratio 0.166728)
$ ./sa 5 10000000
Test 5 jump (total 10000000 times)
1667138 times in circle (ratio 0.166714)
$ ./sa 7 10000000
Test 7 jump (total 10000000 times)
1248881 times in circle (ratio 0.124888)
./sa 7 10000000
Test 7 jump (total 10000000 times)
1249432 times in circle (ratio 0.124943)
./sa 9 10000000
Test 9 jump (total 10000000 times)
1001222 times in circle (ratio 0.100122)
./sa 9 10000000
Test 9 jump (total 10000000 times)
999595 times in circle (ratio 0.099960)

That's, given n i.i.d random variables $x_k$ which are uniform distributed in [0,1), could we prove that $P(\left|\sum_{k=1}^n \exp(2\pi i x_k)\right|<1)=\frac1{n+1}$.

Please, edit your post to show how you proved the case $n=2,$ and present your attempt(s) for the general case. — Anne Bauval, Jan 09 '23 at 14:13
It seems you forgot to mention that your unit vectors are two-dimensional – this is merely implicit in the complex reformulation you provide at the end of the question. — joriki, Jan 09 '23 at 21:01
https://math.stackexchange.com/questions/3085442/challenging-probability-problem-amc-12b-problem-18-are-the-aops-solutions-in — user51547, Jan 10 '23 at 02:37
Writing an answer, but it is long and might take more than a day, hopefully will be up soon. — Sarvesh Ravichandran Iyer, Jan 10 '23 at 09:17

achille hui · Answer 1 · 2023-01-15T09:38:53.230

I will assume we are taking about unit vectors in $2$-dim. In that case, the statement ~~seems to be~~ is true (see alternate approach for a proof)

Original approach using Bessel function.

My argument is sort of hand waving as it involves Fourier transform of distribution which I'm not familiar with. Please take whatever I say here with a grain of salt.

For any $\rho > 0$, let

$D_\rho = \{ x \in \mathbb{R}^2 : |x| \le \rho \}$ be the closed unit disk centered at origin.
$f_\rho = \mathbb{1}_{D_\rho}$ be identity function of $D_\rho$.
$g_\rho$ be the probability density function (as a distribution) of picking a point from circle $\partial D_\rho$.

The probability we seek, let's call it $P$, is the value of convolution of $f_1$ with $n$ copies of $g_1$ evaluated at origin.

$$P = (f_1 * (\underbrace{g_1 * \cdots * g_1}_{n\text{ copies}}))(0)$$

With help of convolution theorem, this can be expressed as a integral of $f_1$ and $g_1$'s Fourier transform:

$$P = \frac1{(2\pi)^2}\int_{\mathbb{R}^2} \hat{f}_1(k)\hat{g}_1(k)^n dk \tag{*1} $$ Using polar coordinates in $x$ ($r = |x|$ here), we have

$$\hat{f}_\rho(k) = \int_0^\rho \left[\int_0^{2\pi} e^{-i|k| r\cos\theta} d\theta\right] r dr = 2\pi \int_0^\rho J_0(|k|r) r dr\\ = \frac{2\pi}{|k|^2} \int_0^{|k|\rho} tJ_0(t) dt = \frac{2\pi\rho}{|k|} J_1(|k|\rho) $$ In certain sense, one can view $g_\rho$ as a limit of differences of $f_\rho$:

$$g_\rho = \lim_{\epsilon\to 0+} \frac{f_{\rho + \epsilon} - f_{\rho - \epsilon}}{\pi(\rho+\epsilon)^2-\pi(\rho-\epsilon)^2} = \frac{1}{2\pi\rho} \frac{\partial f_\rho}{\partial \rho}$$ This leads to $$\hat{g}_\rho = \frac{1}{|k|\rho}\frac{\partial}{\partial\rho}\left[\rho J_1(|k|\rho)\right] = J_0(|k|\rho)$$ Plug these into $(*1)$ and evaluate in polar coordinates for $k$ ($s = |k|$ here), we get

$$P = \frac{1}{(2\pi)^2}\int_0^{2\pi}\int_0^\infty \frac{2\pi}{s}J_1(s) J_0(s)^n s dsd\theta = \int_0^\infty J_1(s)J_0(s)^n ds$$ Notice $J_0(s)' = -J_1(s)$, this leads to

$$P = \frac1{n+1} \left[ -J_0(s)^{n+1} \right]_0^\infty = \frac{1}{n+1}J_0(0)^{n+1} = \frac1{n+1} $$

Alternate approach using complex analysis and DCT (dominated convergence theorem).

Identify points in $\mathbb{R}^2$ with complex plane $\mathbb{C}$. Let $D = D_1$ and $C = \partial D_1$ be the unit disk and circle.

For $\zeta \in \mathbb{C}$ with $|z| \ne 1$, Cauchy integral formula tell us $$\frac1{2\pi i}\int_{C}\frac{dz}{z - \zeta} = \begin{cases}1, & |\zeta| < 1\\ 0, & |\zeta| > 0\end{cases}\tag{*2}$$

Let $\zeta_1,\cdots,\zeta_n \in C$ be the complex numbers corresponding to the $n$ random movement of unit distances. Ignoring the issue about what happens when $z - \sum_{k=1}^n\zeta_k \sim 0$, the probability we seek can be expressed as an iterated contour integral over $(n+1)$ copies of $C$.

$$P = \frac{1}{(2\pi i)^{n+1}} \int_{C^{n+1}} \frac{dz}{z - \sum_{k=1}^n\zeta_k}\prod_{k=1}^n\frac{d\zeta_k}{\zeta_k}$$ By a change of variable $z \to -\zeta_0$, we get $$P = \frac{1}{(2\pi i)^{n+1}}\int_{C^{n+1}} \frac{\zeta_0}{\sum_{k=0}^n\zeta_k}\prod_{k=0}^n\frac{d\zeta_k}{\zeta_k}$$ Exchange variable $\zeta_0$ for other $\zeta_k$ and sum, we obtain:

$$\require{cancel} (n+1)P = \frac{1}{(2\pi i)^{n+1}}\int_{C^{n+1}} \color{red}{\cancelto{1}{\color{gray}{\frac{\sum_{k=0}^n \zeta_k}{\sum_{k=0}^n \zeta_k}}}} \prod_{k=0}^n\frac{d\zeta_k}{\zeta_k} = 1$$

On the surface, this implies $P = \frac1{n+1}$. Unfortunately, the integrand $\frac{1}{z - \sum_{k=0}^n \zeta_k}$ is singular when its denominator vanishes. It is not obvious how to control contribution there and justify various operation needed to make above evaluation works.

To resolve this, look back at $(*2)$ again, the expression there is sort of an integral representation of $f = f_1 = {\rm 1}_D$. To regulate its singularity, take any small $\epsilon > 0$ and consider following function:

$$\tilde{f}_\epsilon(\zeta) = \frac{1}{2\pi i}\int_C \frac{\bar{z}-\bar{\zeta}}{|z-\zeta|^2 + \epsilon^2} dz\tag{*3}$$

For any regular enough domain $\Omega \subset \mathbb{C}$, the complex plane version of Stokes' theorem tells us

$$\begin{align}\frac{1}{2\pi i}\int_{\partial \Omega} \frac{\bar{z}-\bar{\zeta}}{|z-\zeta|^2 + \epsilon^2} dz &= \frac{1}{2\pi i}\int_{\Omega} d\left(\frac{\bar{z}-\bar{\zeta}}{(z-\zeta)(\bar{z}-\bar{\zeta}) + \epsilon^2}\right) \wedge dz\\ &= \frac{1}{\pi}\int_{\Omega} \frac{\epsilon^2}{\left(|z-\zeta|^2 + \epsilon^2\right)^2} \frac{d\bar{z} \wedge dz}{2i}\\ &= \int_{\Omega}\delta_\epsilon(x'-x,y'-y) dxdy \end{align} $$ where $z = x + iy, \zeta = x'+ iy'$ and $$\delta_\epsilon(u,v) = \frac{1}{\pi} \frac{\epsilon^2}{(u^2+v^2+\epsilon^2)^2}$$ is a non-negative function with unit integral and its "weight" is concentrated around a circle of radius $O(\epsilon)$. In terms of $\delta_\epsilon$, $\tilde{f}_\epsilon = f * \delta_{\epsilon}$ is the result of convoluting ${\rm 1}_D$ with $\delta_\epsilon$. $\tilde{f}_\epsilon(\zeta)$ approximates $f(\zeta)$ well unless $\zeta$ is within a distance $O(\epsilon)$ from $C$.

Notice for any $\zeta \notin C$, $\tilde{f}_\epsilon(\zeta)$ converges pointwise to $f(\zeta)$ as $\epsilon \to 0+$.
For $\epsilon \in (0,1)$, we also have a bound: $$|\tilde{f}_\epsilon(\zeta)| \le \begin{cases} 1, &|\zeta| \le 2\\ \frac{1}{(|\zeta|-1)^4}, &|\zeta| > 2 \end{cases}$$ Since rightmost expression is Lebesgue integrable, we can use DCT to conclude

$$P = \int_{C^n} f\left(\sum_{k=1}^n \zeta_k\right) \prod_{k=1}^n \frac{d\zeta_k}{(2\pi i)\zeta_k} = { \lim_{\epsilon\to 0+}\int_{C^n} \tilde{f}_\epsilon\left(\sum_{k=1}^n \zeta_k\right) \prod_{k=1}^n \frac{d\zeta_k}{(2 \pi i)\zeta_k} } $$ Plug integral representation $(*3)$ into rightmost expression. Perform a change of variable $z \to -\zeta_0$ again, we find

$$P = \lim_{\epsilon\to 0+}P_\epsilon \quad\text{ where }\quad P_\epsilon = \int_{C^{n+1}} \frac{\zeta_0\sum_{k=0}^n \bar{\zeta}_k}{\left|\sum_{k=0}^n \zeta_k\right|^2+\epsilon^2}\prod_{k=0}^n\frac{d\zeta_k}{(2\pi i)\zeta_k}$$ Doing the same trick of exchanging $\zeta_0$ with other $\zeta_k$ and sum, we obtain:

$$(n+1)P_\epsilon = 1 - \int_{C^{n+1}} \frac{\epsilon^2}{\left|\sum_{k=0}^n \zeta_k\right|^2 + \epsilon^2} \prod_{k=0}^n \frac{d\zeta_k}{(2\pi i)\zeta_k}$$

Parametrize all those $\zeta_k$ as $e^{2\pi \theta_k i}$ for $\theta_k \in [0,1]$, we can rewrite above to the form:

$$P_\epsilon = \frac{1}{n+1}\left(1 - \int_{[0,1]^{n+1}} h_\epsilon(\theta) \prod_{k=0}^n d\theta_k\right) $$ where $\;\;\quad h_\epsilon(\theta) = \frac{\epsilon^2}{|\zeta(\theta)|^2 + \epsilon^2}\;\;$ and $\;\;\zeta(\theta) = \sum_{k=0}^n e^{2\pi \theta_k i}$.

For any $S \subset \mathbb{R}^d$, let $\mu_d(S)$ be its $d$-dim measure. Let $X = \{ \theta \in [0,1]^{n+1} : \zeta(\theta) = 0 \}$.

Since $[0,1]^{n+1}$ is finite, the family of function $h_\epsilon$ is bounded by the Lebesgue integrable function $1$. Furthermore, for any fixed $\theta$, $h_\epsilon(\theta)$ converges pointwise to ${\rm 1}_X$ as $\epsilon \to 0+$. By DCT again, we find

$$P = \lim_{\epsilon\to 0+} P_\epsilon = \frac{1}{n+1} \left(1 - \int_{X} \prod_{k=0}^n d\theta_k\right) = \frac{1}{n+1}\left(1 - \mu_{n+1}(X)\right) $$

We claim that $\mu_{n+1}(X) = 0$. If that is the case, $P = \frac1{n+1}$ follows immediately.

To see $\mu_{n+1}(X) = 0$, let $\theta' = (\theta_1,\ldots,\theta_n)$ and $\zeta'(\theta') = \sum\limits_{k=1}^n e^{2\pi \theta_k i}$.

Decompose $\zeta(\theta)$ as $e^{2\pi \theta_0 i} + \zeta'(\theta')$. Since $\zeta'(\theta')$ is continuous over compact $[0,1]^n$, it is uniformly continuous there.

For any $\eta \in (0,\frac12)$, let $\eta' = \sin(\pi \eta)$. One can subdivide $[0,1]^n$ into a finite disjoint union of cells $[0,1]^n = \bigcup_i K_i$ with each $\zeta'(K_i)$ is contained in some disk of radius $\eta'$ in $\mathbb{C}$.

For those $K_i$ where $[0,1] \times K_i$ intersects with $X$, one can always find an $I_i \subset [0,1]$ ($I_i$ is either an interval or union of two intervals) with $\mu_1(I_i) \le \eta$ to make $X \cap ([0,1] \times K_i) \subset I_i \times K_i$. For other $K_i$, just set $I_i$ to $\emptyset$.

In this way, we have $X \subset \bigcup_i I_i \times K_i$ and hence $$\mu_{n+1}(X) \le \sum_i \mu_1(I_i) \mu_{n}(K_i) \le \eta\sum_i\mu_n(K_i) = \eta$$

Since $\eta$ can be arbitrary small, we get $\mu_{n+1}(X) = 0$.

Sarvesh Ravichandran Iyer · Answer 2 · 2023-01-11T05:06:04.213

On the history of this problem

The problem mentioned in this question is , in general, known as the problem of random flights. It was first proposed by Karl Pearson (of the Pearson correlation coefficient fame) in $1905$. His question was not proposed with the same generality that future mathematicians assumed. However, he was able to derive, for instance, the expected distance from origin and its variance after $n$ steps.

Various forms of this question, which chose to tweak things slightly (or not at all) were asked in other disciplines of science including biology and physics). Note that the dimension of these walks was restricted to $2$ , because the distribution of a random angle is far easier to decompose mathematically than the distribution of a uniformly random point in a $d$-dimensional sphere, $d>2$.

A very general version of the problem was solved using something called the "Weber's discontinuous factor" (it actually goes back to the days of Dirichlet and Markov, but is known by Weber's name). It provided explicitly the distribution of the point after $n$ steps, without using the theory of characteristic functions (indeed, using the idea of characteristic functions, one can easily decompose the problem into $n$ iid steps, each having a uniform distribution on the circle, and the rest is an easy exercise). This solution is due to Kluyver, but uses tools from Watson's famous, but forgotten book.

A different proof given by Polya in 1919 used characteristic functions. On the other hand, Von Smoluchowski solved related problems using the same technique as well.

Following his developments, Watson himself generalized the problem to $K$ dimensions ($k>2$ arbitrary), and found an explicit formula for the density, which also uses the method of discontinuous factors.

Since the above formulas are somewhat difficult, it made more sense to look at the asymptotic decay of the probability density function for the distance after $n$ rounds. These were also computed by Watson (the asymptotic in dimension $2$ was computed by Rayleigh).

Finally, a solution entirely from undergraduate analysis, not using tools beyond basic integration and geometry (and avoiding special functions, for example), was found in 2019 by Pedro Barata de Tovar Sà and Alexander Kovačec. A solution purely for dimension $3$ which uses transformations exclusive to that dimension was also mentioned, that of Treolar.

With that, let us state the problem we are going to give a general solution to : a vast, vast generalization of the problem above.

Let , in two dimension, $\theta_1,\theta_2,\theta_3,\ldots$ be a sequence of independent identically distributed random variables which are taking uniform values in $[0,2\pi)$. Let $l_1,l_2,l_3,\ldots$ be an infinite fixed sequence of positive real numbers, to be seen as jump lengths. Let $S_0=0$, and let $S_{n+1} = S_n + l_{n+1}e_{\theta_1}$, where $e_{\theta_1}$ is the unit vector which has angle $\theta_1$ with the positive $x$ axis. Fix an $M>0, R>0$. What is the probability $$ \mathbb P(|S_m| \geq R) $$ at least somewhat explicitly?

That is, we are allowing the step lengths to vary , but the length is fixed at each step. For example, you can probably take a step of size $l_1=1$, then a step of size $l_2=2$, then a step of size $l_3=3/2$, and so on.

Obviously, if all the $l_i=1$ and $R=1$ then we get back to our problem (where $M$ is allowed to vary).

The "obvious" way to attack it

Well, what's the "obvious" way to attack it? One can easily write the answer as a multiple integral over various choices of angles at each stage. The limits of the angles will be conditioned on the walk not stepping out of a ball of radius $R$.

Let's see. All you need to do is ensure that $|S_M| \leq R$ at the end : it doesn't matter if , for some $n<M$, it happens that $|S_n| > R$ (although insisting that this be the case is also an interesting question that can be attempted using a completely different technique).

Therefore, let the angles $\theta_1,\theta_2,\ldots,\theta_{m-1}$ vary freely between $0$ and $2\pi$ (the latter not included). All you need to ensure is that $\theta_m$ is such that $|S_M| \leq R$. Knowing that the free integral $$\int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_{0}^{2\pi} d\theta_{M} \ldots d\theta_1 = (2\pi)^M$$ by separation of variables, we just get $$ \mathbb P(|S_M| \leq R) = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_{0}^{2\pi} 1_{|S_M| \leq R} d\theta_{M} \ldots d\theta_1 $$

This, this isn't very satisfactory , I'll assume.

We must find a way of somehow playing around the indicator. Enter the main protagonists of this story, the...

Bessel functions and the Weber discontinuous factor

Whether in the ODE or asymptotic analysis or probability theory, Bessel functions have multiple uses. While their definition is somewhat weird, it suffices to say that their origin comes when one attempts to solve a second order ODE using the method of power series.

The definition of the Bessel function of order $n$ is $$ J_n(z) = \sum_{m=0}^{\infty} \frac{(-1)^m \left(\frac{x}{2}\right)^{2m+n}}{m!(m+n)!} $$ (this can also be defined for $n$ not an integer, but we'll not worry about that).

There are multiple interpretations of the Bessel function, but they are most well known for arising as solutions of the differential equation $$ x^2\frac{d^2y}{dx^2}+ x\frac{dy}{dx} + y(x^2-n^2) = 0 $$

The Bessel functions $J_n(x)$ are defined for $x \in \mathbb R$ and decay fast enough to allow for multiple integral identities to surface. Some other places where they resurface can be found in Fourier analysis, probability theory and the likes.

However, the most important relations are those that are referred to as integral (some special ones are called orthogonality) relations. These results, while somewhat difficult to derive (the simplest proofs require contour integration from complex analysis), are outstanding.

What these integral relations tell us, in particular, is that there's a way of "denesting" the integrals given above, by replacing the indicator by an integral involving Bessel functions, and then using a "cascading" argument to kill off all integrals except one, that remains at the end.

Note that the final answer will definitely be in terms of Bessel functions : however, because the asymptotic nature of Bessel functions (and their products) is well known, one can use this to derive asymptotic estimates.

We state, without proof, the Weber discontinuous factor theorem : given $a\neq b$ positive real numbers, $$ \int_{0}^{\infty} aJ_1(at)J_0(bt)dt = 1_{b < a} $$

What if $b=a$? It turns out we don't need to analyze this. Going back to the original problem, we are trying to check this with $a = R$ and $b = S_M$. It turns out that if $M>1$ then $\mathbb P(|S_M| = R) = 0$ for any $R$ (not ). This follows using an induction argument from the base case $M=2$. Thus we don't need to worry about $|S_M| = R$.

You can see why this is called a "discontinuity" factor theorem : somehow, we are introducing a new integral to handle this discontinuous indicator very nicely.

So this allows us to write $$ \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_{0}^{2\pi} 1_{|S_M| \leq R} d\theta_{M} \ldots d\theta_1 \\ = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_0^{2\pi} \int_{0}^{\infty} RJ_1(Rt)J_0(|S_M|t)dt d\theta_{M} \ldots d\theta_1 $$

That's great. Now, we must find a way of doing the cascade. For this, note that by the cosine law for triangles $$ S_M= S_{M-1} + l_{M}e_{\theta_M} \implies |S_{M}|^2 = |S_{M-1}|^2 + l_M^2 - 2 |S_{M-1}|l_M\cos \theta_{M} $$ We exchange the integral $\int_0^\infty dt$ with the last $\int_0^{2\pi} d\theta_M$ (don't worry about the technicalities) and pull some things outside the innermost integral to write $$ \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_0^{2\pi} \int_{0}^{\infty} RJ_1(Rt)J_0(|S_M|t)dt d\theta_{M} \ldots d\theta_1 \\ = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M-1$ times}} \int_{0}^{\infty}RJ_1(Rt) \int_0^{2\pi} J_0(|S_{M-1}|^2t + l_M^2t - 2|S_{M-1}|l_Mt\cos\theta_M) d\theta_{M} dt \ldots d\theta_1 $$

We use an amazing radial property of the Bessel functions, called the Neumann property , which states that if $a,b$ are positive reals then $$ \int_{0}^{2\pi} J_0(\sqrt{a^2+b^2 - 2ab \cos \theta}) d \theta = 2 \pi J_0(a)J_0(b) $$

to write just the innermost part more carefully $$ \int_{0}^{\infty}RJ_1(Rt) \int_0^{2\pi} J_0(|S_{M-1}|^2t + l_M^2t - 2|S_{M-1}|l_Mt\cos\theta_M) d\theta_{M} dt \\ = \int_{0}^{\infty}2\pi RJ_1(Rt)J_0(|S_{M-1}|t)J_0(l_Mt)dt $$

But this is amazing, because we've removed $\theta_M$ from the equation! Now, if you look back, we just have $$ \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M-1$ times}} \int_{0}^{\infty}RJ_1(Rt) \int_0^{2\pi} J_0(|S_{M-1}|^2t + l_M^2t - 2|S_{M-1}|l_Mt\cos\theta_M) d\theta_{M} dt \ldots d\theta_1 \\ = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M-1$ times}} \int_{0}^{\infty}(2\pi)RJ_1(Rt) J_0(|S_{M-1}|t)J_0(l_Mt) dt \ldots d\theta_1 $$

However, we can just repeat everything we did before, simply because the new terms created in the previous step don't have any dependence on $\theta_{M-1}$. They can just come out of that integral, and then we can repeat the step using Neumann's property. The $2\pi$ terms which are created keep cancelling out with the $\frac{1}{(2\pi)^M}$ which is outside! At the end, I leave you to check the following formula $$ \boxed{\mathbb P(|S_M| \leq R) = \int_0^{\infty} RJ_1(Rt)\prod_{m=1}^{M}J_0(l_mt) dt} $$

which is a far, far better formula, and can be considered as a starting point for asymptotic theory, for instance.

An unbelievable probabilistic identity

We will now derive what is, frankly, an absolutely ridiculous identity which links various such probabilities together. Let's create a notation : when we write $\mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M)$, we mean the probability that $|S_M| \leq R$ when the first $M$ lengths are $l_1,l_2,\ldots,l_M$ in that order.

Let's write down the formula $$ \mathbb P(|S_M| \leq R ; l_1,\ldots,l_M) = \int_0^{\infty} RJ_1(Rt)J_0(l_1t) \ldots J_0(l_Mt) dt $$

from last time. We use the relation that $$ \frac{d}{dz}J_0(z) = -J_1(z) $$ at this point. Now, let's integrate by parts! Start with $$ \int_0^{\infty} RJ_1(Rt)\prod_{i=1}^{M} J_0(l_{i}t) dt $$ Then use integration by parts with the product as one part as the rest as the other to get $$ \int_0^{\infty} RJ_1(Rt)\prod_{i=1}^{M} J_0(l_{i}t) dt = \left[-J_0(Rt)\prod_{i=1}^{M} J_0(l_{i}t)\right]_0^{\infty} - \int_0^{\infty} J_0(Rt)\sum_{i=1}^{M}J_i(l_it)\prod_{j=1,j \neq i}^{M} J_0(l_jt) dt $$ This is great. Now, let's switch $R$ with say $l_1$, and consider $$ \mathbb P(|S_M| \leq l_1 ; R,l_2,\ldots,l_M) = \int_0^{\infty} l_1J_1(l_1t)J_0(Rt)\prod_{i=2}^{M} J_0(l_{i}t) dt $$ you can integrate by parts like above and get a similar identity. However, what happens when you actually look at the quantity $$ \mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_1 ;R,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_2 ; l_1,R,\ldots,l_M) + \ldots + \mathbb P(|S_M| \leq l_M ; l_1,l_2,\ldots,R) $$ is that, by adding up the right hand sides (let's call $l_{M+1} = R$ right now to make notation easier) of all the identities created after we replace $R$ with each of $l_1,l_2,\ldots l_M$ and integrate by parts like above, we get that the sum of the probabilities above equals $$ \left[-(M+1)\prod_{i=1}^{M+1}J_0(l_it)\right]_0^{\infty} + M\int_{0}^{\infty} \sum_{i=1}^{M+1} J_i(l_it)\prod_{j=1, j \neq i}^{M+1} J_0(l_it) dt = \left[\prod_{i=1}^{M+1}J_0(l_it)\right]_0^{\infty} =1 $$

Thus, we obtain the rather weird relation that for any $R,l_1,\ldots,l_m$, $$ \mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_1 ;R,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_2 ; l_1,R,\ldots,l_M) + \ldots + \mathbb P(|S_M| \leq l_M ; l_1,l_2,\ldots,R) = 1 $$

I don't quite know how to combinatorially interpret or prove this, but it roughly says that if you replace $R$ by each of $l_1,l_2,\ldots,l_m$, then those probabilities add up to $1$. Note that the events aren't disjoint or anything, so it's not clear why they correlate so well!

However, once you put $R = l_1=\ldots=l_n$, it's obvious that $$ (M+1)\mathbb P(|S_M| \leq R ; R,R,R,\ldots,R) = 1 \implies \mathbb P(|S_M| \leq R ; R,R,R,\ldots,R) = \frac{1}{M+1} $$ as desired.

On the multi-dimensional question

Let's now ask the same question in multiple dimensions.

A random walker starts at the point $(0,0,\ldots,0) \in \mathbb R^K$. Given positive distances $l_1,l_2,\ldots$, at time $n>0$ he picks a unit vector $e_m$ uniformly at random from the unit sphere $\{v \in \mathbb R^K : |v| = 1\}$, and his displacement equals $l_me_m$. If we call $S_m$ the sequence of points he gets to, then what is the probability that $|S_M| \leq R$ for some $M,R$?

The calculations are just a little more tedious, but basically : you consider, instead of just $\theta$ varying from $0$ to $2\pi$, a variant of the "solid" angle. That along with the formula for the volume of an $n$-sphere, gives us a "preliminary" starting point $$ \mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) \\= \left(\frac{\Gamma(K/2)}{\Gamma((K-1)/2)\Gamma(1/2)}\right)\int_0^{2\pi}\ldots_{\text{$M$ times}} \int_0^{2\pi} 1_{|S_M| \leq R}\prod_{m=1}^{M-1}(\sin \theta_M)^{K-2}d\theta_M\ldots d\theta_1 $$

We now need an appropriate discontinuous factor, and an appropriate "Neumann property" to help us through. Thankfully, both are available and involve higher order Bessel functions, and can be found in the resources I will attach at the end. We will get $$ \mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) \\= R\Gamma(K/2)^{M-1}\int_{0}^{\infty}\left(\frac{Rt}{2}\right)^{(k-2)/2}J_{k/2}\prod_{n=1}^{M}\frac{J_{(k-2)/2}(Rt)(l_nt)2^{(k-2)/2}}{(l_nt)^{(k-2)/2}} dt $$

This formula simplifies to the usual one when $k=2$. Using Laplace's method, Weber's exponential integral and hypergeometric functions, Watson proved that if all $l_i=L$, if $R$ is fixed, and we try to study the asymptotic nature of $\mathbb P(|S_N| \leq R ; L,L,\ldots,L)$ in dimension $K$, then $$ \mathbb P(|S_N| \leq R ; L,L,\ldots,L) \sim \frac{1}{\Gamma\left(\frac{K+2}{2}\right)}\left(\frac{R^2K}{2NL^2}\right)^\frac{k}{2} \ _1F_1\left(\frac{K}{2},\frac{K+2}{2};-\frac{R^2K}{2NL^2}\right) $$ where $_1F_1$ is a generalized hypergeometric function. One can use the definition of such a function to see what happens if $K=2$.

Briefly, on the methods of Pedro Barata de Tovar Sà and Alexander Kovačec

The method used by the above authors is interesting. They restrict themselves to all the jump lengths being equal to $1$ (and to dimension $3$), but allow $R$ to vary. Suppose we confine ourselves to just $n=2$ i.e. we want to jump just twice from the origin and see whether we lie within $R$ distance of the origin or not.

First, the authors prove that if $S_1,S_2$ are the first and second jump point above then $$ \mathbb P(|S_2| \leq R) = \frac{1}{2}\left|\{z \in [-1,1] : \sqrt{2+2z} \leq R\}\right| $$

Where $|A|$ is the volume of a set $A$ (Lebesgue measure, if you like)

The proof uses nothing more than a simple volume invariance principle in $2$ dimensions, and the cosine law. Once they do this, they attempt to illustrate a recursive procedure by taking another point $S_3$ as well, and show that $$ \mathbb P(|S_3| \leq R) = \frac{1}{4}\left|\{(z_2,z_3) \in [-1,1]^2 : \sqrt{3+2z_2+2z_3\sqrt{2+2z_2}} \leq R\}\right| $$

They then show that, by induction, $\mathbb P(|S_N|\leq R)$ can be obtained using a recursive procedure, as an integral of some function involving nested square roots. At this point, the problem looks intractable.

However, using a natural, but tractable diffeomorphism between the set whose Lebesgue measure is wanted above and a polytope of the same dimension, they reduce the problem to an integral over a polytope. The integral over the polytope is found to be an easier problem.

However, perhaps the best part of their exposition is that get an explicit formula not involving the Bessel function in the case $d=3$. That is, the formula $\mathbb P(|S_N| \leq R)$ , in their case, is actually a far more explicit function of $N$ and $R$.

Their formula , in dimensions $3$ reads (note : I took the complement, so just subtract this from $1$) $$ \boxed{\mathbb P(|S_N| > R) \\= \frac{2^{-N-1}(-1)^N}{N!}\sum_{\nu=0}^{\left\lceil\frac{N-2-R}{2}\right\rceil}(-1)^{\nu}\binom{N}{\nu}\left(-N+2\nu+R\right)^{N-1}(N-2\nu+(N-1)R)} $$ where the sum is empty if $\frac{N-2-R}{2} < 0$.

This solution, along with Treolar's which also uses similar ideas involving the Lebesgue measure and polytope integrals, shows that the problem is accessible to everybody, without really needing to use special functions. While the authors do use some properties of three-dimensional behavior, I hold the belief that their results may be sparklingly converted to convenient results in higher dimensions. It can be phrased as a daunting, but tractable undergraduate exercise.

References

[1] A treatise on the theory of Bessel functions by G.N.Watson is a forgotten gem of a book that goes into nearly everything you'd want to know about Bessel functions, from origins to applications to trivia. Note that this book's chapter on "infinite integrals" contains rigorous proofs of all the theorems we cited and used in the first section.

[2] On the problem of random flights by J.Dutka contains details on the origin of the problem, and on some related problems as well, including the methods of Lord Rayleigh and Von Smoluchowski.

[3] A local probability problem by Prof. Justin Kluyver, where the proof presented was given, along with the amazing probabilistic relation between different lengths, which I still cannot wrap my head around.

[4] Generalized Hypergeometric functions For reference to the asymptotics. Try the case $K=2$ and see what you get.

[5] On the probability to be after $n$ random jumps of unit lengthin space within a distance of radius $r$ from the start : the problem of random flights is the outstanding contribution of Pedro and Alexander, along with some notes on Treolar's proof. Check out the references of this paper to get access to the papers of Treolar and Rayleigh.

@achillehui Thank you very much, I have just expanded on the answer after waking up today. — Sarvesh Ravichandran Iyer, Jan 11 '23 at 05:06