On the history of this problem
The problem mentioned in this question is , in general, known as the problem of random flights. It was first proposed by Karl Pearson (of the Pearson correlation coefficient fame) in $1905$. His question was not proposed with the same generality that future mathematicians assumed. However, he was able to derive, for instance, the expected distance from origin and its variance after $n$ steps.
Various forms of this question, which chose to tweak things slightly (or not at all) were asked in other disciplines of science including biology and physics). Note that the dimension of these walks was restricted to $2$ , because the distribution of a random angle is far easier to decompose mathematically than the distribution of a uniformly random point in a $d$-dimensional sphere, $d>2$.
A very general version of the problem was solved using something called the "Weber's discontinuous factor" (it actually goes back to the days of Dirichlet and Markov, but is known by Weber's name). It provided explicitly the distribution of the point after $n$ steps, without using the theory of characteristic functions (indeed, using the idea of characteristic functions, one can easily decompose the problem into $n$ iid steps, each having a uniform distribution on the circle, and the rest is an easy exercise). This solution is due to Kluyver, but uses tools from Watson's famous, but forgotten book.
A different proof given by Polya in 1919 used characteristic functions. On the other hand, Von Smoluchowski solved related problems using the same technique as well.
Following his developments, Watson himself generalized the problem to $K$ dimensions ($k>2$ arbitrary), and found an explicit formula for the density, which also uses the method of discontinuous factors.
Since the above formulas are somewhat difficult, it made more sense to look at the asymptotic decay of the probability density function for the distance after $n$ rounds. These were also computed by Watson (the asymptotic in dimension $2$ was computed by Rayleigh).
Finally, a solution entirely from undergraduate analysis, not using tools beyond basic integration and geometry (and avoiding special functions, for example), was found in 2019 by Pedro Barata de Tovar Sà and Alexander Kovačec. A solution purely for dimension $3$ which uses transformations exclusive to that dimension was also mentioned, that of Treolar.
With that, let us state the problem we are going to give a general solution to : a vast, vast generalization of the problem above.
Let , in two dimension, $\theta_1,\theta_2,\theta_3,\ldots$ be a sequence of independent identically distributed random variables which are taking uniform values in $[0,2\pi)$. Let $l_1,l_2,l_3,\ldots$ be an infinite fixed sequence of positive real numbers, to be seen as jump lengths. Let $S_0=0$, and let $S_{n+1} = S_n + l_{n+1}e_{\theta_1}$, where $e_{\theta_1}$ is the unit vector which has angle $\theta_1$ with the positive $x$ axis. Fix an $M>0, R>0$. What is the probability $$
\mathbb P(|S_m| \geq R)
$$
at least somewhat explicitly?
That is, we are allowing the step lengths to vary , but the length is fixed at each step. For example, you can probably take a step of size $l_1=1$, then a step of size $l_2=2$, then a step of size $l_3=3/2$, and so on.
Obviously, if all the $l_i=1$ and $R=1$ then we get back to our problem (where $M$ is allowed to vary).
The "obvious" way to attack it
Well, what's the "obvious" way to attack it? One can easily write the answer as a multiple integral over various choices of angles at each stage. The limits of the angles will be conditioned on the walk not stepping out of a ball of radius $R$.
Let's see. All you need to do is ensure that $|S_M| \leq R$ at the end : it doesn't matter if , for some $n<M$, it happens that $|S_n| > R$ (although insisting that this be the case is also an interesting question that can be attempted using a completely different technique).
Therefore, let the angles $\theta_1,\theta_2,\ldots,\theta_{m-1}$ vary freely between $0$ and $2\pi$ (the latter not included). All you need to ensure is that $\theta_m$ is such that $|S_M| \leq R$. Knowing that the free integral $$\int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_{0}^{2\pi} d\theta_{M} \ldots d\theta_1 = (2\pi)^M$$
by separation of variables, we just get $$
\mathbb P(|S_M| \leq R) = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_{0}^{2\pi} 1_{|S_M| \leq R} d\theta_{M} \ldots d\theta_1
$$
This, this isn't very satisfactory , I'll assume.
We must find a way of somehow playing around the indicator. Enter the main protagonists of this story, the...
Bessel functions and the Weber discontinuous factor
Whether in the ODE or asymptotic analysis or probability theory, Bessel functions have multiple uses. While their definition is somewhat weird, it suffices to say that their origin comes when one attempts to solve a second order ODE using the method of power series.
The definition of the Bessel function of order $n$ is $$
J_n(z) = \sum_{m=0}^{\infty} \frac{(-1)^m \left(\frac{x}{2}\right)^{2m+n}}{m!(m+n)!}
$$
(this can also be defined for $n$ not an integer, but we'll not worry about that).
There are multiple interpretations of the Bessel function, but they are most well known for arising as solutions of the differential equation $$
x^2\frac{d^2y}{dx^2}+ x\frac{dy}{dx} + y(x^2-n^2) = 0
$$
The Bessel functions $J_n(x)$ are defined for $x \in \mathbb R$ and decay fast enough to allow for multiple integral identities to surface. Some other places where they resurface can be found in Fourier analysis, probability theory and the likes.
However, the most important relations are those that are referred to as integral (some special ones are called orthogonality) relations. These results, while somewhat difficult to derive (the simplest proofs require contour integration from complex analysis), are outstanding.
What these integral relations tell us, in particular, is that there's a way of "denesting" the integrals given above, by replacing the indicator by an integral involving Bessel functions, and then using a "cascading" argument to kill off all integrals except one, that remains at the end.
Note that the final answer will definitely be in terms of Bessel functions : however, because the asymptotic nature of Bessel functions (and their products) is well known, one can use this to derive asymptotic estimates.
We state, without proof, the Weber discontinuous factor theorem : given $a\neq b$ positive real numbers, $$
\int_{0}^{\infty} aJ_1(at)J_0(bt)dt = 1_{b < a}
$$
What if $b=a$? It turns out we don't need to analyze this. Going back to the original problem, we are trying to check this with $a = R$ and $b = S_M$. It turns out that if $M>1$ then $\mathbb P(|S_M| = R) = 0$ for any $R$ (not ). This follows using an induction argument from the base case $M=2$. Thus we don't need to worry about $|S_M| = R$.
You can see why this is called a "discontinuity" factor theorem : somehow, we are introducing a new integral to handle this discontinuous indicator very nicely.
So this allows us to write $$
\frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_{0}^{2\pi} 1_{|S_M| \leq R} d\theta_{M} \ldots d\theta_1 \\ = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_0^{2\pi} \int_{0}^{\infty} RJ_1(Rt)J_0(|S_M|t)dt d\theta_{M} \ldots d\theta_1
$$
That's great. Now, we must find a way of doing the cascade. For this, note that by the cosine law for triangles $$
S_M= S_{M-1} + l_{M}e_{\theta_M} \implies |S_{M}|^2 = |S_{M-1}|^2 + l_M^2 - 2 |S_{M-1}|l_M\cos \theta_{M}
$$
We exchange the integral $\int_0^\infty dt$ with the last $\int_0^{2\pi} d\theta_M$ (don't worry about the technicalities) and pull some things outside the innermost integral to write $$
\frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M$ times}} \int_0^{2\pi} \int_{0}^{\infty} RJ_1(Rt)J_0(|S_M|t)dt d\theta_{M} \ldots d\theta_1 \\ = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M-1$ times}} \int_{0}^{\infty}RJ_1(Rt) \int_0^{2\pi} J_0(|S_{M-1}|^2t + l_M^2t - 2|S_{M-1}|l_Mt\cos\theta_M) d\theta_{M} dt \ldots d\theta_1
$$
We use an amazing radial property of the Bessel functions, called the Neumann property , which states that if $a,b$ are positive reals then $$
\int_{0}^{2\pi} J_0(\sqrt{a^2+b^2 - 2ab \cos \theta}) d \theta = 2 \pi J_0(a)J_0(b)
$$
to write just the innermost part more carefully $$
\int_{0}^{\infty}RJ_1(Rt) \int_0^{2\pi} J_0(|S_{M-1}|^2t + l_M^2t - 2|S_{M-1}|l_Mt\cos\theta_M) d\theta_{M} dt \\ = \int_{0}^{\infty}2\pi RJ_1(Rt)J_0(|S_{M-1}|t)J_0(l_Mt)dt
$$
But this is amazing, because we've removed $\theta_M$ from the equation! Now, if you look back, we just have $$
\frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M-1$ times}} \int_{0}^{\infty}RJ_1(Rt) \int_0^{2\pi} J_0(|S_{M-1}|^2t + l_M^2t - 2|S_{M-1}|l_Mt\cos\theta_M) d\theta_{M} dt \ldots d\theta_1 \\ = \frac{1}{(2\pi)^M} \int_{0}^{2 \pi} \int_{0}^{2\pi} \ldots_{\text{$M-1$ times}} \int_{0}^{\infty}(2\pi)RJ_1(Rt) J_0(|S_{M-1}|t)J_0(l_Mt) dt \ldots d\theta_1
$$
However, we can just repeat everything we did before, simply because the new terms created in the previous step don't have any dependence on $\theta_{M-1}$. They can just come out of that integral, and then we can repeat the step using Neumann's property. The $2\pi$ terms which are created keep cancelling out with the $\frac{1}{(2\pi)^M}$ which is outside! At the end, I leave you to check the following formula $$
\boxed{\mathbb P(|S_M| \leq R) = \int_0^{\infty} RJ_1(Rt)\prod_{m=1}^{M}J_0(l_mt) dt}
$$
which is a far, far better formula, and can be considered as a starting point for asymptotic theory, for instance.
An unbelievable probabilistic identity
We will now derive what is, frankly, an absolutely ridiculous identity which links various such probabilities together. Let's create a notation : when we write $\mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M)$, we mean the probability that $|S_M| \leq R$ when the first $M$ lengths are $l_1,l_2,\ldots,l_M$ in that order.
Let's write down the formula $$
\mathbb P(|S_M| \leq R ; l_1,\ldots,l_M) = \int_0^{\infty} RJ_1(Rt)J_0(l_1t) \ldots J_0(l_Mt) dt
$$
from last time. We use the relation that $$
\frac{d}{dz}J_0(z) = -J_1(z)
$$
at this point. Now, let's integrate by parts! Start with $$
\int_0^{\infty} RJ_1(Rt)\prod_{i=1}^{M} J_0(l_{i}t) dt
$$
Then use integration by parts with the product as one part as the rest as the other to get $$
\int_0^{\infty} RJ_1(Rt)\prod_{i=1}^{M} J_0(l_{i}t) dt = \left[-J_0(Rt)\prod_{i=1}^{M} J_0(l_{i}t)\right]_0^{\infty} - \int_0^{\infty} J_0(Rt)\sum_{i=1}^{M}J_i(l_it)\prod_{j=1,j \neq i}^{M} J_0(l_jt) dt
$$
This is great. Now, let's switch $R$ with say $l_1$, and consider $$
\mathbb P(|S_M| \leq l_1 ; R,l_2,\ldots,l_M) = \int_0^{\infty} l_1J_1(l_1t)J_0(Rt)\prod_{i=2}^{M} J_0(l_{i}t) dt
$$
you can integrate by parts like above and get a similar identity. However, what happens when you actually look at the quantity $$
\mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_1 ;R,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_2 ; l_1,R,\ldots,l_M) + \ldots + \mathbb P(|S_M| \leq l_M ; l_1,l_2,\ldots,R)
$$
is that, by adding up the right hand sides (let's call $l_{M+1} = R$ right now to make notation easier) of all the identities created after we replace $R$ with each of $l_1,l_2,\ldots l_M$ and integrate by parts like above, we get that the sum of the probabilities above equals $$
\left[-(M+1)\prod_{i=1}^{M+1}J_0(l_it)\right]_0^{\infty} + M\int_{0}^{\infty} \sum_{i=1}^{M+1} J_i(l_it)\prod_{j=1, j \neq i}^{M+1} J_0(l_it) dt = \left[\prod_{i=1}^{M+1}J_0(l_it)\right]_0^{\infty} =1
$$
Thus, we obtain the rather weird relation that for any $R,l_1,\ldots,l_m$, $$
\mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_1 ;R,l_2,\ldots,l_M) + \mathbb P(|S_M| \leq l_2 ; l_1,R,\ldots,l_M) + \ldots + \mathbb P(|S_M| \leq l_M ; l_1,l_2,\ldots,R) = 1
$$
I don't quite know how to combinatorially interpret or prove this, but it roughly says that if you replace $R$ by each of $l_1,l_2,\ldots,l_m$, then those probabilities add up to $1$. Note that the events aren't disjoint or anything, so it's not clear why they correlate so well!
However, once you put $R = l_1=\ldots=l_n$, it's obvious that $$
(M+1)\mathbb P(|S_M| \leq R ; R,R,R,\ldots,R) = 1 \implies \mathbb P(|S_M| \leq R ; R,R,R,\ldots,R) = \frac{1}{M+1}
$$
as desired.
On the multi-dimensional question
Let's now ask the same question in multiple dimensions.
A random walker starts at the point $(0,0,\ldots,0) \in \mathbb R^K$. Given positive distances $l_1,l_2,\ldots$, at time $n>0$ he picks a unit vector $e_m$ uniformly at random from the unit sphere $\{v \in \mathbb R^K : |v| = 1\}$, and his displacement equals $l_me_m$. If we call $S_m$ the sequence of points he gets to, then what is the probability that $|S_M| \leq R$ for some $M,R$?
The calculations are just a little more tedious, but basically : you consider, instead of just $\theta$ varying from $0$ to $2\pi$, a variant of the "solid" angle. That along with the formula for the volume of an $n$-sphere, gives us a "preliminary" starting point $$
\mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) \\= \left(\frac{\Gamma(K/2)}{\Gamma((K-1)/2)\Gamma(1/2)}\right)\int_0^{2\pi}\ldots_{\text{$M$ times}} \int_0^{2\pi} 1_{|S_M| \leq R}\prod_{m=1}^{M-1}(\sin \theta_M)^{K-2}d\theta_M\ldots d\theta_1
$$
We now need an appropriate discontinuous factor, and an appropriate "Neumann property" to help us through. Thankfully, both are available and involve higher order Bessel functions, and can be found in the resources I will attach at the end. We will get
$$
\mathbb P(|S_M| \leq R ; l_1,l_2,\ldots,l_M) \\= R\Gamma(K/2)^{M-1}\int_{0}^{\infty}\left(\frac{Rt}{2}\right)^{(k-2)/2}J_{k/2}\prod_{n=1}^{M}\frac{J_{(k-2)/2}(Rt)(l_nt)2^{(k-2)/2}}{(l_nt)^{(k-2)/2}} dt
$$
This formula simplifies to the usual one when $k=2$. Using Laplace's method, Weber's exponential integral and hypergeometric functions, Watson proved that if all $l_i=L$, if $R$ is fixed, and we try to study the asymptotic nature of $\mathbb P(|S_N| \leq R ; L,L,\ldots,L)$ in dimension $K$, then
$$
\mathbb P(|S_N| \leq R ; L,L,\ldots,L) \sim \frac{1}{\Gamma\left(\frac{K+2}{2}\right)}\left(\frac{R^2K}{2NL^2}\right)^\frac{k}{2} \ _1F_1\left(\frac{K}{2},\frac{K+2}{2};-\frac{R^2K}{2NL^2}\right)
$$
where $_1F_1$ is a generalized hypergeometric function. One can use the definition of such a function to see what happens if $K=2$.
Briefly, on the methods of Pedro Barata de Tovar Sà and Alexander Kovačec
The method used by the above authors is interesting. They restrict themselves to all the jump lengths being equal to $1$ (and to dimension $3$), but allow $R$ to vary. Suppose we confine ourselves to just $n=2$ i.e. we want to jump just twice from the origin and see whether we lie within $R$ distance of the origin or not.
First, the authors prove that if $S_1,S_2$ are the first and second jump point above then $$
\mathbb P(|S_2| \leq R) = \frac{1}{2}\left|\{z \in [-1,1] : \sqrt{2+2z} \leq R\}\right|
$$
Where $|A|$ is the volume of a set $A$ (Lebesgue measure, if you like)
The proof uses nothing more than a simple volume invariance principle in $2$ dimensions, and the cosine law. Once they do this, they attempt to illustrate a recursive procedure by taking another point $S_3$ as well, and show that $$
\mathbb P(|S_3| \leq R) = \frac{1}{4}\left|\{(z_2,z_3) \in [-1,1]^2 : \sqrt{3+2z_2+2z_3\sqrt{2+2z_2}} \leq R\}\right|
$$
They then show that, by induction, $\mathbb P(|S_N|\leq R)$ can be obtained using a recursive procedure, as an integral of some function involving nested square roots. At this point, the problem looks intractable.
However, using a natural, but tractable diffeomorphism between the set whose Lebesgue measure is wanted above and a polytope of the same dimension, they reduce the problem to an integral over a polytope. The integral over the polytope is found to be an easier problem.
However, perhaps the best part of their exposition is that get an explicit formula not involving the Bessel function in the case $d=3$. That is, the formula $\mathbb P(|S_N| \leq R)$ , in their case, is actually a far more explicit function of $N$ and $R$.
Their formula , in dimensions $3$ reads (note : I took the complement, so just subtract this from $1$) $$
\boxed{\mathbb P(|S_N| > R) \\= \frac{2^{-N-1}(-1)^N}{N!}\sum_{\nu=0}^{\left\lceil\frac{N-2-R}{2}\right\rceil}(-1)^{\nu}\binom{N}{\nu}\left(-N+2\nu+R\right)^{N-1}(N-2\nu+(N-1)R)}
$$
where the sum is empty if $\frac{N-2-R}{2} < 0$.
This solution, along with Treolar's which also uses similar ideas involving the Lebesgue measure and polytope integrals, shows that the problem is accessible to everybody, without really needing to use special functions. While the authors do use some properties of three-dimensional behavior, I hold the belief that their results may be sparklingly converted to convenient results in higher dimensions. It can be phrased as a daunting, but tractable undergraduate exercise.
References
[1] A treatise on the theory of Bessel functions by G.N.Watson is a forgotten gem of a book that goes into nearly everything you'd want to know about Bessel functions, from origins to applications to trivia. Note that this book's chapter on "infinite integrals" contains rigorous proofs of all the theorems we cited and used in the first section.
[2] On the problem of random flights by J.Dutka contains details on the origin of the problem, and on some related problems as well, including the methods of Lord Rayleigh and Von Smoluchowski.
[3] A local probability problem by Prof. Justin Kluyver, where the proof presented was given, along with the amazing probabilistic relation between different lengths, which I still cannot wrap my head around.
[4] Generalized Hypergeometric functions For reference to the asymptotics. Try the case $K=2$ and see what you get.
[5] On the probability to be after $n$ random jumps of unit lengthin space within a distance of radius $r$ from the start : the problem of random flights is the outstanding contribution of Pedro and Alexander, along with some notes on Treolar's proof. Check out the references of this paper to get access to the papers of Treolar and Rayleigh.