10

I roll 6-sided dice until the sum exceeds 50. What is the expected value of the final roll?

I am not sure how to set this one up. This one is not homework, by the way, but a question I am making up that is inspired by one. I'm hoping this will help me understand what's going on better.

Hayaku
  • 135
  • 4
    Presumably the rolls are independent, in which case the answer is 3.5, the same as the expected value of any roll. – copper.hat May 03 '13 at 05:44
  • 2
    @copper.hat It's not 3.5 – Hayaku May 03 '13 at 05:50
  • Things are more complicated, in that there are six ways the final roll can "go over" 50, one each from previous sums 45,46,47,48,49,50. Each of these has a different probability of being hit exactly as the value of the second last roll, and the setup is not symmetric. – coffeemath May 03 '13 at 05:51
  • @coffeemath Is it easier if the variables are continuous instead of discrete like dice? – Hayaku May 03 '13 at 05:52
  • I would suggest putting that continuous version into your question. Maybe something like: I sample a uniform $[0,1]$ variable repeatedly and keep track of the sum so far until it exceeds some number $a>1$. What is the expected value of the last uniform variable sampled? This may be easier than the discrete problem you ask, but I don't know the answer to this continuous one either. [For the discrete one with dice it can be worked out using maple and recursive functions, but isn't nice.] – coffeemath May 03 '13 at 06:26
  • @Hayaku: Thanks, I see the issue now. Interesting problem. Entirely non-intuitive for me. – copper.hat May 03 '13 at 16:46
  • @coffeemath - '...there are six ways the final roll can "go over" 50, one each from...' this is not actually correct. From 45 there is one way to go over 50: rolling a 6. But from 49 there are 5 ways to go over 50: rolling 2-6. You are correct that the probability is different for each. – Kevin Fegan May 04 '13 at 16:24
  • @Hayaku - I'm looking for clarification about 2 points in your question: 1) you said 'I roll 6-sided dice until...'. I assume that means you will roll *one* 6-sided dice (die), or do you mean some other number of 6-sided dice? 2) you asked about '...expected value of the final roll'. I just want to make sure you are talking about expected value, and not the probability that the final roll will be any particular number. – Kevin Fegan May 04 '13 at 17:40
  • @KevinFegan : you're right, and I should have remembered that, since I was once programming a game where a player chose some number to exceed, and indeed had to add more ways from 50 to one of 51,52,etc. What made it worse was the question: how likely to say get to 47 in the first place, and then assume one doesn't roll a 1 or 2 or 3 next, etc. It was a nightmare... – coffeemath May 04 '13 at 23:47

4 Answers

13

Let $u(n)$ be the expected value of the first roll that makes the total $\ge n$. Thus $u(n) = 7/2$ for $n \le 1$. But for $2 \le n \le 6$, conditioning on the first roll we have $$u(n) = \left( \sum_{j=1}^{n-1} u(n-j) + \sum_{j=n}^{6} j \right)/6$$
That makes $$u(2)=\frac{47}{12},\quad u(3)=\frac{305}{72},\quad u(4)=\frac{1919}{432},\quad u(5)=\frac{11705}{2592},\quad u(6)=\frac{68975}{15552}$$ And then for $n > 6$, again conditioning on the first roll, $$u(n) = \frac{1}{6} \sum_{j=1}^6 u(n-j)$$ The result is $$u(51) = \frac {7005104219281602775658473799867927981609}{1616562554929528121286279200913072586752} \approx 4.333333219$$ It turns out that as $n \to \infty$, $u(n) \to 13/3$.
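The recurrence can be checked with exact rational arithmetic. What follows is a minimal sketch, not the computation used to produce the numbers above; the function name `u` mirrors the notation here, while `SIDES` and the use of `lru_cache` are choices made for illustration. Both cases of the recurrence are folded into one rule: a first roll $j \ge n$ is itself the final roll, and any smaller $j$ restarts the problem with target $n - j$.

```python
from fractions import Fraction
from functools import lru_cache

SIDES = 6  # assumed: one fair six-sided die


@lru_cache(maxsize=None)
def u(n):
    """Expected value of the first roll that brings the running total to >= n."""
    total = Fraction(0)
    for j in range(1, SIDES + 1):
        # A first roll j >= n ends the game with final roll j;
        # otherwise the target shrinks to n - j and we recurse.
        total += j if j >= n else u(n - j)
    return total / SIDES


print(u(2), u(3), u(4), u(5), u(6))
print(float(u(51)))  # compare with the decimal value quoted above
```

Because the arithmetic is exact, the small cases can be compared directly with the fractions listed above.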

EDIT: Note that the general solution to the recurrence $\displaystyle u(n) = \frac{1}{6} \sum_{j=1}^6 u(n-j)$ is $$ u(n) = c_0 + \sum_{j=1}^5 c_j r_j^n$$ where $r_j$ are the roots of $$\dfrac{6 r^6 - (1 + r + \ldots + r^5)}{r - 1} = 6 r^5 + 5 r^4 + 4 r^3 + 3 r^2 + 2 r + 1 = 0$$ Those all have absolute value $< 1$, so $\lim_{n \to \infty} u(n) = c_0$. Now $6 u(n+5) + 5 u(n+4) + \ldots + u(n) = (6 + 5 + \ldots + 1) c_0 = 21 c_0$ because the terms in each $r_j$ vanish: they add up to $c_j r_j^n (6 r_j^5 + 5 r_j^4 + \ldots + 1) = 0$. Taking $n = 1$ with the values of $u(1)$ to $u(6)$ above gives us $c_0 = 13/3$.
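The weighted sum $6 u(n+5) + 5 u(n+4) + \ldots + u(n)$ should therefore be the same for every $n \ge 1$. The sketch below checks that numerically with exact fractions; the helper `u_values` and the range of $n$ tested are assumptions made for illustration.

```python
from fractions import Fraction


def u_values(n_max, sides=6):
    """Return {n: u(n)} for 1 <= n <= n_max, using the recurrences in this answer."""
    u = {}
    for n in range(1, n_max + 1):
        total = Fraction(0)
        for j in range(1, sides + 1):
            # Roll j >= n is the final roll; otherwise continue from target n - j.
            total += j if j >= n else u[n - j]
        u[n] = total / sides
    return u


u = u_values(60)
# Weighted window 6 u(n+5) + 5 u(n+4) + ... + 1 u(n), for n = 1..54.
windows = [sum((i + 1) * u[n + i] for i in range(6)) for n in range(1, 55)]
print(set(windows))     # one distinct value if the sum really is constant
print(windows[0] / 21)  # the constant term c_0
```

Dividing the constant by $21$ recovers $c_0$ without solving for the roots $r_j$.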

Robert Israel
  • 448,999
10

This should be a comment to Robert Israel's answer, but it is too long.

Here is a simpler way to see that, with a $d$-sided die, as $n\to\infty$, the expected last roll to meet or exceed $n$ is $\frac{2d+1}{3}$.

Since we have rolled the die an arbitrarily large number of times, each of $n{-}d,\dots,n{-}1$ is equally likely to be hit. If we hit $n{-}k$, there are $d{-}k{+}1$ ways for the next roll to total at least $n$, and the average roll that hits or exceeds $n$ is $\frac{d+k}{2}$. Thus, the probability that $n{-}k$ is the last total before we hit $n$ or above is $\frac{d-k+1}{(d+1)d/2}$, and the expected last roll would be $$ \begin{align} \sum_{k=1}^d\frac{d+k}{2}\frac{d-k+1}{(d+1)d/2} &=\sum_{k=1}^d\frac{d(d+1)-k(k-1)}{d(d+1)}\\ &=\frac{2d+1}{3} \end{align} $$ For $d=6$, this yields $\frac{13}{3}$, as Robert Israel shows.


Another way of looking at this, and this may be the simplest, is that there are $k$ ways for a $k$ to be the last roll. For a $d$-sided die, the mean of the last roll would be $$ \begin{align} \frac{\displaystyle\sum_{k=1}^dk^2}{\displaystyle\sum_{k=1}^dk} &=\dfrac{\dfrac{2d^3+3d^2+d}{6}}{\dfrac{d^2+d}{2}}\\ &=\frac{2d+1}{3} \end{align} $$
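Both sums can be evaluated exactly for a few values of $d$ to confirm that they agree with $\frac{2d+1}{3}$. This is only an illustrative check; the function names are made up here.

```python
from fractions import Fraction


def mean_by_last_total(d):
    """First method: weight the last total n-k by its d-k+1 crossing rolls."""
    norm = Fraction(d * (d + 1), 2)
    return sum(
        Fraction(d + k, 2) * Fraction(d - k + 1) / norm  # mean crossing roll * probability
        for k in range(1, d + 1)
    )


def mean_by_roll_value(d):
    """Second method: a roll of k can be the last roll in k ways."""
    return Fraction(sum(k * k for k in range(1, d + 1)), sum(range(1, d + 1)))


for d in (4, 6, 7, 20):
    print(d, mean_by_last_total(d), mean_by_roll_value(d), Fraction(2 * d + 1, 3))
```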

robjohn
  • 345,667
  • Nice intuition! – copper.hat May 06 '13 at 05:48
  • @KevinFegan Everything seemed logical until I tried to run an MC sim to confirm 13/3. Surprisingly, it returns values approaching 5 for various Sums and Number of trials. Have a go:

    # Expectation of the last roll obtained in crossing a sum with single fair die throws

    import random

    def roll_till(S):
        rolls, last, s = 0, 0, 0
        while s < S:
            last = random.randint(1, 7)
            s += last
            rolls += 1
        return last

    N = int(1e7)
    Sum = 50
    exp = 0
    for trial in range(N):
        exp += roll_till(Sum)

    print("Expected was", 13/3, "but MC obtained", exp/N)

    – Hex1729 Oct 06 '22 at 08:18
  • @Hex1729: with random.randint(1,7), you are using a $d=7$ sided die and $\frac{2\cdot7+1}3=5$. – robjohn Oct 06 '22 at 08:32
  • @Hex1729: By "MC sim" do you mean Monte Carlo simulation? – robjohn Oct 06 '22 at 15:33
  • Ahh damn. Thanks for pointing out. And yes, I meant Monte Carlo. – Hex1729 Oct 07 '22 at 16:16
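For anyone rerunning the simulation from these comments: `random.randint(a, b)` includes both endpoints, so a six-sided die is `randint(1, 6)`. A corrected sketch follows. The names `last_roll` and `estimate`, the fixed seed, and the smaller trial count are choices made here for illustration, and the loop stops once the total exceeds the target, as the question asks.

```python
import random


def last_roll(target, rng, sides=6):
    """Roll until the running total exceeds `target`; return the final roll."""
    total, last = 0, 0
    while total <= target:
        last = rng.randint(1, sides)  # randint includes both endpoints
        total += last
    return last


def estimate(target=50, trials=100_000, seed=0):
    """Monte Carlo estimate of the expected final roll."""
    rng = random.Random(seed)  # seeded so that reruns give the same estimate
    return sum(last_roll(target, rng) for _ in range(trials)) / trials


mc = estimate()
print("Expected about", 13 / 3, "and the simulation gave", mc)
```

With this many trials the estimate should land within a few hundredths of $13/3$; more trials tighten it further.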
3

I was looking at this question, and while thinking that it's a very interesting question, I thought I would not be able to provide an answer.

Then I read a comment by @coffeemath:

Things are more complicated, in that there are six ways the final roll can "go over" 50, one each from previous sums 45,46,47,48,49,50. Each of these has a different probability of being hit exactly as the value of the second last roll, and the setup is not symmetric. – 2013-05-03 06:26:36

which prompted my response:

'...there are six ways the final roll can "go over" 50, one each from...' this is not actually correct. From 45 there is one way to go over 50: rolling a 6. But from 49 there are 5 ways to go over 50: rolling 2-6. You are correct that the probability is different for each. – 2013-05-04 16:24:33

This got me to thinking about the question.

So I will not have to keep repeating it over and over here or in the comments, I'll say it once here...

We are talking about rolling one 6-sided die, so the possible numbers that will come up on any roll will be a number from 1-6 (inclusive), and the die is a "fair" (balanced) die meaning it is not weighted in such a way that would favor any number over any other number.

Next, a clarification... we are not talking specifically (exclusively) about probability. Although probability has a part in this, if we were talking exclusively about probability then the answer would be clear... on any next roll (which may be your last roll) of the die, the probability that it will be any particular number from 1-6 is the same as it is for any other number from 1-6... it is (not surprisingly) 1 in 6 (or 1/6).

So even if the last five numbers you have rolled are all 3's (33333), the probability that the next number you roll will be a 3 is the same as any of the other numbers, 1/6.

Now, you might say... whoa! the chance of rolling six 3's in a row is astronomical! Well, maybe, but consider that in this example case, you have already rolled five 3's in a row! That in itself is quite a feat. So, it's not a stretch to consider that your next roll might just happen to be a 3. The previous five numbers that you have rolled, whether it's "33333" or "16352" (or any other five digit sequence) are already done, they have already happened, you can't change that, probability has no part in it (anymore). You could say, in a "figurative" sense, that the probability is 100%, because, hey, there it is, "33333", it did happen. The next number is independent of what has already happened, and is just as likely to be a 3, as a 2 or a 5, or any of the other numbers from 1-6.

OK, so we are talking about the "expected value" of the final roll.

Let's look at the list of possible sums, just before the final roll. The candidates are:

45, 46, 47, 48, 49, 50

We'll call them, "pre-final" sums (the sum before the final roll).

Values below 45 are not candidates for being in the list of pre-final sums because there are no numbers that can be rolled (1-6) that will increase the sum to be greater than 50.

Likewise, values above 50 are not candidates for being in the list of pre-final sums because these sums are already over 50 so for these sums, the final roll has already been made.

Here is a table showing all the possible combinations of pre-final sums, plus the numbers 1-6 (value of the next roll/possible final roll), and the possible resulting sums.

 na      na      na      na      na     45+1 = 46
 na      na      na      na     45+2    46+1 = 47
 na      na      na     45+3    46+2    47+1 = 48
 na      na     45+4    46+3    47+2    48+1 = 49
 na     45+5    46+4    47+3    48+2    49+1 = 50
45+6    46+5    47+4    48+3    49+2    50+1 = 51
46+6    47+5    48+4    49+3    50+2     x   = 52
47+6    48+5    49+4    50+3     x       x   = 53
48+6    49+5    50+4     x       x       x   = 54
49+6    50+5     x       x       x       x   = 55
50+6     x       x       x       x       x   = 56

Table positions containing na are not candidates for being in the list of pre-final sums because they are below 45.

Table positions containing x are not candidates for being in the list of pre-final sums because they are already above 50.

Note that each pre-final sum and each value of the next roll appears equally often in this table.

Now, we eliminate the cases where the resulting sum is less than 51, because these are not "final rolls"... they will require at least one additional roll for the sum to exceed 50.

What remains is the table representing all possible "final rolls":

45+6    46+5    47+4    48+3    49+2    50+1 = 51
46+6    47+5    48+4    49+3    50+2     x   = 52
47+6    48+5    49+4    50+3     x       x   = 53
48+6    49+5    50+4     x       x       x   = 54
49+6    50+5     x       x       x       x   = 55
50+6     x       x       x       x       x   = 56
=================================================
   6       5       4       3       2       1   21
28.6%   23.8%   19.0%   14.3%    9.5%    4.8%

To find the "expected value" of the final roll, we need to find the mean of all the possible final rolls...

There are 21 possible final rolls. Now we compute the weighted average to find the mean:

$6 + 5 + 4 + 3 + 2 + 1=21$

$(6 \cdot 6)+(5 \cdot 5)+(4 \cdot 4)+(3 \cdot 3)+(2 \cdot 2)+(1 \cdot 1)=91$

$\frac{91}{21} = 4\frac{1}{3}$

The "expected value" of the final roll is $4\frac{1}{3}$
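The counting in the table can be reproduced by enumerating every (pre-final sum, roll) pair and keeping the ones that push the sum past 50. This sketch weights every pair equally, exactly as the table does; that equal weighting is an assumption of this argument, not something the code establishes. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

TARGET = 50
pre_final_sums = range(TARGET - 5, TARGET + 1)  # 45, 46, ..., 50

# Keep only the rolls that take a pre-final sum above the target.
final_rolls = [
    roll
    for s in pre_final_sums
    for roll in range(1, 7)
    if s + roll > TARGET
]

counts = Counter(final_rolls)  # how many table entries end with each roll
mean = Fraction(sum(final_rolls), len(final_rolls))

print(len(final_rolls), dict(sorted(counts.items())))
print(mean)
```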

  • @robjohn - sorry, I used the *mode* as the final result instead of the *mean*. I have corrected it. Thanks for catching my error. – Kevin Fegan May 05 '13 at 20:29
  • (+1) That looks much better (close to the second method in my answer). This is only valid as the number of rolls tends to $\infty$, as Robert Israel says. – robjohn May 05 '13 at 22:46
  • @robjohn - Thanks. Can you explain what you mean by *'rolls tends to $\infty$'...* it seems to me we are approaching a fixed set of numbers (51-56). These numbers are not *very large* numbers so I don't see the relation to $\infty$. – Kevin Fegan May 08 '13 at 22:56
  • this argument assumes that the arrival at $45$ through $50$ is equally likely. That is never really true, but it is closer to the case the larger the number to exceed gets. Here we want to exceed $50$. The average is more accurately $\frac{13}{3}$ if we want to exceed $100$, and more accurately if we want to exceed $1000$. – robjohn May 09 '13 at 04:12
  • @robjohn - Agreed. Since it is really a matter of the *'number of rolls'*, it would also be the case if you repeated the trial a large number of times. – Kevin Fegan May 11 '13 at 13:38
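To see how close the "equally likely" assumption is for a target of 50, the probability of landing exactly on each pre-final sum can be computed exactly, and the final roll weighted accordingly. This is a sketch under the usual assumptions (one fair die, independent rolls); `hit_probabilities` and the other names are invented for illustration.

```python
from fractions import Fraction


def hit_probabilities(limit, sides=6):
    """p[s] = probability that the running total ever equals s, for 0 <= s <= limit."""
    p = [Fraction(1)] + [Fraction(0)] * limit
    for s in range(1, limit + 1):
        # The total s is reached from s - j by rolling j.
        p[s] = sum(p[s - j] for j in range(1, sides + 1) if s - j >= 0) / sides
    return p


TARGET = 50
p = hit_probabilities(TARGET)

# P(final roll = k): land on a pre-final sum s with s + k > TARGET, then roll k.
final_roll_dist = {
    k: sum(p[s] for s in range(TARGET - 5, TARGET + 1) if s + k > TARGET) / 6
    for k in range(1, 7)
}
expected = sum(k * prob for k, prob in final_roll_dist.items())

print([float(p[s]) for s in range(45, 51)])  # nearly, but not exactly, equal
print(float(expected))
```

The six hitting probabilities differ only far down in the decimals, which is why the table's count gives a value so close to the exact one.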
1

The purpose of this answer here is to convince readers that the distribution of the roll 'to get me over $50$' is not necessarily that of the standard 6-sided die roll.

Recall that a stopping time is a positive-integer-valued random variable $\tau$ for which $\{\tau \leq n \} \in \mathcal{F}_n$, where $\mathcal{F}_n = \sigma(X_0, X_1, \cdots, X_n)$ is the canonical filtration with respect to the (time-homogeneous) Markov chain $X_n$.

The Strong Markov Property asserts (in this case) that conditioned on the event $\tau < \infty$, the random variables $X_1$ and $X_{\tau + 1} - X_{\tau}$ are equidistributed. Letting $\tau = -1 + \min\{k \mid X_k > 50\}$ ought to prove the equidistribution with a regular die roll, right?

Well, no. What happens is that $\tau + 1$ is a stopping time, but $\tau$ is not, because it looks into the future one timestep. This is just enough to throw off the SMP.
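A small simulation makes the distinction visible. Since $\tau + 1$ is a stopping time, the roll made after crossing 50 should look like an ordinary die roll, while the crossing roll itself should not. The sketch below compares the two sample means; the function name, seed, and trial count are assumptions made for illustration.

```python
import random


def crossing_and_next_roll(rng, threshold=50, sides=6):
    """Return (the roll that first pushes the total above threshold, the roll after it)."""
    total = 0
    while True:
        roll = rng.randint(1, sides)
        total += roll
        if total > threshold:
            # One extra roll, made after the stopping time tau + 1.
            return roll, rng.randint(1, sides)


rng = random.Random(1)
trials = 100_000
pairs = [crossing_and_next_roll(rng) for _ in range(trials)]

mean_crossing = sum(c for c, _ in pairs) / trials  # roll X_{tau+1} - X_tau
mean_next = sum(n for _, n in pairs) / trials      # roll X_{tau+2} - X_{tau+1}
print(mean_crossing, mean_next)
```

The first mean should sit near $13/3$ and the second near $7/2$, in line with the other answers.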

A Blumenthal
  • 5,048