4

Imagine we repeat the following loop some thousands of times:

$$ \begin{align} & \text{array} = []\\ & \text{for n} = 1: 10 000 \\ & k = 0 \\ & \text{while unifrnd}(0,1) < 0.3 \\ & k = k + 1 \\ & \text{end} \\ & \text{if k} \neq 0 \\ & \text{array} = [\text{array,k}] \\ & \text{end} \\ \end{align} $$

whereas "unifrnd(0,1)" means a random number drawn from the uniform distribution on the unit interval.

My question is then: What is then the value of k, which is the most often observed - except for k = 0? And is that the expectation of k?

Thanks very much

  • 1
  • Not sure what $k$ starts from - you didn't initialize it. Also if you repeat it infinite amount of times, $k \to \infty$. Amount of times you see each number has a geometric distribution with parameter 0.3. – gt6989b May 29 '13 at 12:58
  • thanks edited now: k is initalized with 0, the loop would be repeated e.g. 10 000 times. –  May 29 '13 at 12:59
  • Is the highest bin in the geometric distribution also its expected value? It seems to me that that's not true –  May 29 '13 at 13:29
  • This is not a geometric distribution but a random walk. Notice that the value of $k$ stays at the same value if the draw of the unifrnd is less than 1/3. The geometric distribution would be relevant if the code were to be written such that $k$ does not get updated (either stays at same value or moves to next value) for the next iteration until and unless we draw a uniform variate less than 1/3. – response May 29 '13 at 14:04
  • 1
    When you draw a random greater than $0.3$ do you just avoid incrementing $k$ that time (in which case you need a termination criterion for your loop) or do you exit the loop. The other answers, probably keying on your comment that the loop would be repeated 10,000 times assumed the first. I assumed the second. Please clarify. – Ross Millikan May 29 '13 at 14:17
  • When I dras a random greater then 0.3, I avoid to increment k that time and exit the loop. The k I had is stored in the an the array, the while loop gets repeated until the until the next random number is greater then 0.3, whereas the next k would get stored in the next cell of the array. Is this a clear enough explanation? –  May 29 '13 at 14:19
  • I've edit the loop, hope it is clearer now –  May 29 '13 at 14:27

4 Answers4

3

It appears you exit the loop the first time the random is greater than $0.3$. In that case, the most probable value for $k$ is $0$. It occurs with probability $0.7$. The next most probable is $1$, which occurs with probability $0.3 \cdot 0.7$, because you need the first random to be less than $0.3$ and the second to be $0.7$. In general, the probability of a value $k$ is $0.3^k\cdot 0.7$

Ross Millikan
  • 374,822
  • Absolutely do agree (as I have also written below in a comment), but then it seems to me to be indeed the geometric distribution (one response says no), don't you think so? –  May 29 '13 at 14:16
  • 1
    @TestGuest: Yes, it is a geometric distribution. It depends on the behavior of your loop, as I put in the comment to your question. The way I read it, you draw $k+1$ randoms, not $10,000$ – Ross Millikan May 29 '13 at 14:19
  • that is correct –  May 29 '13 at 14:20
0

Since the events "unifrand(0,1) > 0.3" (which lead to a repetition of the same value of $k$) can come in any order and all these orders are equally likely, there is no special value of $k$ that occurs most often.

Edit: With the code as it reads now, the value $k = 1$ will be the most frequent one, and $k$ has indeed a geometric distribution.

Hans Engler
  • 15,439
  • I am not sure I agree. If we would consider k = 0 e.g., the probability would be 0.7. If I consider k =1, the probability would be (0.3)0.7, for k = 2 it would be 0.7(0,3)^2 etc. –  May 29 '13 at 13:24
  • @ TestGuest - see my edit. – Hans Engler May 29 '13 at 16:23
0

You can model your process as a random walk with unit steps in the positive direction. In other words, define independent random variables $Z_i$, $i \in 1, 2, n$ such that:

$P(Z_i = 1) = \frac{1}{3}$ and

$P(Z_i = 0) = \frac{2}{3}$

Set the initial position of the walk as $S_0 = 0$ and define:

$S_i = S_0 + \sum_{j=1}^iZ_j$

Your question then reduces to investigating the behavior of the random variable $S_n$. Specifically, you are asking for the mode of $S_n$ and $E(S_n)$.

By linearity of expectations, it can be seen that:

$E(S_n) = \frac{n}{3}$

My intuition suggests that the mode will coincide with the mean.

response
  • 5,071
  • I still think it's the geometric distribution. –  May 29 '13 at 14:21
  • Given your comments to Ross, I agree. It was not entirely clear from the code what you were doing. Perhaps, posting the actual code snippet would have helped avoid the confusion. – response May 29 '13 at 14:23
  • I'm sorry, edited. And thanks nevertheless –  May 29 '13 at 14:30
0

if you want the number that is most likely to be made out of a sum like this. you need combinations to get that sum * probability of each combination. So,

$${n\choose m}*.3^m*.7^{n-m}$$

should be maximized. according to wolfram alpha this is not exactly .3*n http://www.wolframalpha.com/input/?i=maximise+%28choose%28200%2Cn%29.3^n.7^%28200-n%29%29

that's the best I can do.

Edit: You changed your question, and I anticipate one more change at least. I think you are looking for the expected value of k, rather than the most often observed value. So here is how you calculate that:

$$\sum_{n\to \infty }{n*Pr(n)}$$ where $$Pr(n) = .3^n$$

let $r=.3$ and we can psuedoexpand our series (to make it clearer what we are doing) to: $$ r+2r^2+3r^3+...$$ or $$r+r^2+r^3+...$$ $$r^2+r^3+...$$ $$r^3+...$$ then we have ${\frac{r}{1-r}}+r{\frac{r}{1-r}}+r^2{\frac{r}{1-r}}+... = {\frac{r}{(1-r)^2}}=30/49$ which is probably the answer you want...

ZKe
  • 800