
Suppose that I have a random variable $X = \sin(T)$ where $T$ was drawn from the uniform distribution on $[0,2\pi)$.

When generating samples of this random variable, the usual practice is to generate a pseudorandom, unsigned, $b$-bit integer and multiply it by $\frac{2\pi}{2^b}$ to get $T$. Then I'd pass $T$ to $\sin$ to get $X$.

Is there any compelling reason why I couldn't simply pass the pseudorandom integer directly to $\sin$? The function $\sin(n)$ is completely aperiodic for integer $n$, so it seems like this should have the same effect.

1 Answer


Well, generating it the usual way ensures that the distribution is "as close" to a uniform distribution as it possibly could be, so you can be sure of that method, assuming your pseudorandom number generator is behaving itself. If we wanted to be really formal about it, we could note that, as $b$ goes to $\infty$, the probability of the generated angle lying in any specified interval tends to the probability that the uniform distribution on the circle assigns to that interval.
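As a minimal sketch of the usual method described in the question, the following Python draws a $b$-bit integer and scales it by $\frac{2\pi}{2^b}$. The function names `sample_angle` and `sample_x` are hypothetical, and Python's built-in `random` module stands in for whatever generator you actually use; the sketch assumes $b \le 32$ so that the integer converts to a float exactly.

```python
import math
import random

def sample_angle(b=32, rng=None):
    """Return T = k * 2*pi / 2**b for a pseudorandom b-bit integer k.

    Hypothetical helper; assumes b <= 32 so k is exactly representable.
    """
    if rng is None:
        rng = random.Random()
    k = rng.getrandbits(b)            # uniform on {0, 1, ..., 2**b - 1}
    return k * (2.0 * math.pi / 2**b)  # lands on an evenly spaced grid in [0, 2*pi)

def sample_x(b=32, rng=None):
    """Return X = sin(T) with T drawn by sample_angle."""
    return math.sin(sample_angle(b, rng))
```

Every possible value of $T$ here sits on a grid with spacing $\frac{2\pi}{2^b}$, which is what makes the distribution invariant under rotations by that step.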

In fact, as you have noticed, the same limiting property holds if we instead generate a random integer and use it directly as the angle. Fix an interval (an arc, really) $I$ on the circle and consider the distribution obtained by choosing a random integer in $[0,n]$, in the limit as $n$ goes to infinity. The set of points on the circle that can be written as an integer angle is dense in the circle. Moreover, if $x$ is drawn from the limiting distribution, then $P(x\in I)=P(x\in I+m)$, where $I+m$ is the shift of $I$ by an integer angle $m$; that is, shifting an arc by an integer angle does not change its probability. This translation invariance, together with density, tells us that the limiting distribution must be the uniform distribution on the circle.
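The limiting claim can be checked numerically, at least informally. The sketch below counts what fraction of the integers $0, 1, \dots, n$ land in an arc $[lo, hi)$ after reduction mod $2\pi$; the helper name `fraction_in_arc` and the choice of arc are assumptions made for illustration. For large $n$ the fraction should be close to $\frac{hi - lo}{2\pi}$.

```python
import math

def fraction_in_arc(n_max, lo, hi):
    """Fraction of integers 0..n_max whose angle mod 2*pi lies in [lo, hi).

    Hypothetical helper; assumes 0 <= lo < hi <= 2*pi.
    """
    two_pi = 2.0 * math.pi
    hits = sum(1 for n in range(n_max + 1)
               if lo <= math.fmod(n, two_pi) < hi)
    return hits / (n_max + 1)

# Compare the empirical fraction with the uniform value for an arc of length 1.
empirical = fraction_in_arc(100_000, 1.0, 2.0)
uniform = 1.0 / (2.0 * math.pi)
print(empirical, uniform)
```

This only illustrates the limit; it says nothing about how evenly the points are spread for small $n$.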

So, in theory, we might think that either method is justified. The problem is that the first method has nice properties at every $b$ leading up to the limit: the probability is invariant at least under rotations by multiples of $\frac{2\pi}{2^b}$. The second method doesn't have this property. In particular, we would expect a bias towards certain values, because the points from a second trip around the circle are not unrelated to those from the first. For example, if we set $n=11$, we get about two rotations around the circle, and since $2\pi-6 \approx 0.28$, the last $6$ points are the same as the first $6$, but shifted backwards by $2\pi-6$. This extends to higher $n$, but not as clearly. Below is a picture of the integer angles in $[0,11]$, with the first 6 points in blue and the last 6 in red:

[Figure: Distribution on 12 points]

These points are clearly not evenly spaced, as we would want. As $n$ goes to $\infty$, this approaches the same distribution as the typical method, but the typical method is "closer" to the uniform distribution for any finite $n$.
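The uneven spacing for $n=11$ can be made concrete by computing the gaps between neighbouring points on the circle. In this sketch, `circular_gaps` is a hypothetical helper, and 12 evenly spaced angles stand in for the scaled-integer method (an assumption for a like-for-like comparison, since 12 is not a power of two).

```python
import math

def circular_gaps(angles):
    """Gaps between consecutive points on the circle, including the wrap-around gap."""
    two_pi = 2.0 * math.pi
    pts = sorted(math.fmod(a, two_pi) for a in angles)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(two_pi - pts[-1] + pts[0])  # gap from the last point back to the first
    return gaps

# Integer angles 0..11 versus 12 evenly spaced angles.
integer_gaps = circular_gaps(range(12))
scaled_gaps = circular_gaps(k * 2.0 * math.pi / 12 for k in range(12))

print(min(integer_gaps), max(integer_gaps))
print(min(scaled_gaps), max(scaled_gaps))
```

For the integer angles the smallest gap is $2\pi-6$ and the largest is $1$ (between the points $5$ and $6$), while the evenly spaced angles all have the same gap of $\frac{2\pi}{12}$.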

(The above assumes you only ever measure the probability of $x$ being in an open interval. If you're doing computation, this is a good enough assumption.)

Milo Brandt