0

There are 2 buckets. Each bucket has $n$ balls. You pick balls at random out of each bucket. You choose a bucket randomly to take a ball out of. What is the expected number of balls remaining in the non-empty bucket when the other bucket is empty? Assume $n \gg 1$

Let $Y$ denote the number of balls in the remaining bucket. It could have $1,2,\ldots, n$. I believe $Y$ follows a binomial distribution. But I am lost on how to represent the binomial distribution.

What is the number of independent trials here? At first, I thought it was the number of balls removed from both buckets, but for each $Y$ value, there is a different number of trials. For example, when $Y = 1$, this means we removed $2n - 1$ balls, so $2n-1$ is the number of trials. In general, for $Y = k$, $2n - k$ is the number of trials. The number of successes is $k$. But with this approach, the number of trials and successes is dependent on $Y$. Is there a better way to define the successes and number of trials for this problem?

  • This is known as the Banach Matchbox problem. See also, this question – lulu Aug 18 '20 at 19:45
  • @lulu Ah so one slight intuition mistake I had is that the problem actually follows a negative binomial rather than a plain vanilla binomial? – roulette01 Aug 18 '20 at 19:54
  • It's neither, though the standard solution does make use of a negative binomial. – lulu Aug 18 '20 at 19:55
  • @lulu In the wikipedia link, it defines $M$ as the number of successes and $N+1$ as the number of failures. $M$ here represents the number of balls taken out of the non-empty bucket. Shouldn't the number of successes be the number of times we choose the bucket that becomes empty eventually, which would mean it's $N+1$ instead? – roulette01 Aug 18 '20 at 20:13
  • @lulu On second thought. This doesn't matter because of the symmetry of the binomial coefficient? – roulette01 Aug 18 '20 at 20:16
  • That solution calls it a success if you draw from the infinite bucket and a fail if you draw from the finite one. I actually prefer the argument given in the accepted answer to the other post I linked to. – lulu Aug 18 '20 at 20:19
  • @lulu Just looked at the other link. That one seems aligned with what I was thinking. I would've probably forgot to do the last part. This post doesn't find the expectation, but it seems central limit theorem would be good approximation here? – roulette01 Aug 18 '20 at 20:42
  • Sterling's formula should work well to get the expectation. I don't think there is a short cut to it, but maybe I am wrong about that. – lulu Aug 18 '20 at 20:58
  • @lulu Yeah I saw that referenced in the wikipedia link. In the source where I got this problem from it has a hint saying "Use CLT of the binomial distribution." I'm actually not sure now that applies here. When I've converted binomial to a normal pdf, typically the normal pdf is distributed $N(np, np(1-p))$, where $n$ is the number of trials. The issue with the current case is the number of trials is varying. – roulette01 Aug 18 '20 at 21:00
  • Sterling seems the obvious way to go, at least to me. But I haven't actually written it out, so maybe it's harder than I imagine. – lulu Aug 18 '20 at 21:03
  • The upvoted answer to this question follows the Sterling approach, looks straight forward. – lulu Aug 18 '20 at 21:06
  • @lulu I think there might be a mistake somewhere in their calculation. It seems the answer they got is about 1 larger than the wikipedia and what I get with monte carlo. – roulette01 Aug 18 '20 at 22:47
  • That's possible, I have not checked it carefully. But it's just an asymptotic formula..an additive factor of $1$ wouldn't signify. – lulu Aug 18 '20 at 22:53

0 Answers0