1

There are n objects in a list. N random numbers are generated, each pointing to a height in the list. Each time an object gets "hit", it gets 1 point.

What is the probability that an object gets: 1 point, 2 points, 3 points, etc.

I wrote a script that tests it. The result for 1 million objects, so, 36%, 18%, 6%, 1.5%, 0.3%, etc. How is this calculated mathematically? I asked this question, and it was closed, but there is no reason to close it.

population: 1000000
results: 
1:  368282 ,  36.8282 %
2:  184036 ,  18.4036 %
3:  61046 ,  6.1046000000000005 %
4:  15465 ,  1.5465 %
5:  3009 ,  0.3009 %
6:  519 ,  0.0519 %
7:  56 ,  0.0056 %
8:  11 ,  0.0011 %
9:  1 ,  0.000099 %

The script,

# cook your dish here

import math import random

population = 10*6 registry = [0]population count = []

for i in range(population): randomNumber = random.randint(0, population-1) registry[randomNumber] += 1 if len(count) < registry[randomNumber]: count.append(0)

for i in range(population): if registry[i] != 0: count[registry[i]-1]+=1

print("population:", population) print("results: ") for i in range(len(count)): print(i+1, ": ", count[i], ", ", count[i]/population*100, "%")

Johan
  • 41
  • To be clear, you have $n$ objects, they are sampled with replacement $n$ times uniformly and independently at random, and you ask what the probability is that a specific object (e.g. the first) is selected $k$ times? $\dfrac{\binom{n}{k}(n-1)^{n-k}}{n^n}$. For $k=1$, letting $n\to\infty$ that simplifies nicely to $\lim\limits_{n\to\infty}\dfrac{n(n-1)^{n-1}}{n^n}=\lim\limits_{n\to\infty}(1-\frac{1}{n})^{n-1}=\frac{1}{e}\approx 0.367879\dots$, very close to your experimental value. These other values will also be closely related. – JMoravitz Jul 08 '21 at 14:28
  • For $k=2$, that would be $\frac{n^2-n}{2}$ in place of the binomial coefficient, effectively only the leading term of which survives the limiting process so might as well have been $\frac{n^2}{2}$, so it will be $\frac{e^{-1}}{2}\approx 0.18393972\dots$, and for $k=3$ similarly it will effectively be $\frac{n^3}{3!}$ in place of the binomial coefficient making the result $\frac{e^{-1}}{3!}\approx 0.06131324\dots$ and in general $\frac{n^k}{k!}$, making the limit $\frac{e^{-1}}{k!}$. – JMoravitz Jul 08 '21 at 14:33
  • Great, thank you. Exactly what I was looking for.

    What do you mean by "with replacement" btw?

    – Johan Jul 08 '21 at 14:42
  • By "with replacement" I mean as opposed to without replacement... as in, once an object is selected it is allowed to be selected again when it is sampled with replacement. Without replacement, once an object is selected it is not allowed to be selected again. Of course, selecting $n$ objects $n$ times without replacement, everything is selected exactly once each, which is rather uninteresting. I only used the language to tie it in with other counting and probability problems to justify using the $n^n$ in the denominator. – JMoravitz Jul 08 '21 at 14:51
  • As an aside, in your results you had in your output 3: 61046 , 6.1046000000000005 % but this is just supposed to be a ratio, $\frac{61046}{1000000}$ which is clearly $6.1046%$ exactly. Be careful with doing floating point arithmetic where it isn't necessary as it can cause oddities like this. See https://0.30000000000000004.com/ and this SO post. – JMoravitz Jul 08 '21 at 15:16
  • Thanks. Never heard the term. But I mostly engage in randomization topic from my own thinking, good to hear what terms are the standard terms. – Johan Jul 08 '21 at 16:10
  • re: floating point, yes good point, should have noticed that when copy+pasting the result. – Johan Jul 08 '21 at 16:20
  • Overall, super thankful. Perfect mathematical description. I am producing a solution to consensus random number generation, using one-person-one-vote. You really helped me along the way. – Johan Jul 08 '21 at 16:22

1 Answers1

1

Collecting comments,

The probability a specific object was sampled $k$ times with replacement can be seen by counting techniques or binomial distribution as being $$\Pr(X=k)=\frac{\binom{n}{k}(n-1)^{n-k}}{n^n}$$

This, seen by choosing which $k$ of the samplings resulted in our desired object, for the remaining $n-k$ of the samplings choosing which other object was chosen, taken over the $n^n$ different possible ways of sampling our $n$ objects.

We can investigate what happens as we let $n$ continue to grow large. For this, we must first recognize that $\binom{n}{k}=\dfrac{n(n-1)(n-2)\cdots(n-k+1)}{k!}$. In particular, $\binom{n}{k}\sim \frac{n^k}{k!}$ keeping only the first term after expansion as all other terms are of lower order which disappear as we take the limit anyways.

We have $\lim\limits_{n\to\infty}\dfrac{\binom{n}{k}(n-1)^{n-k}}{n^n}=\lim\limits_{n\to\infty}\dfrac{\frac{n^k}{k!}(n-1)^{n-k}}{n^n}=\frac{1}{k!}\lim\limits_{n\to\infty}(1-\frac{1}{n})^{n-k}=\frac{1}{k!}\cdot e^{-1}$

Indeed, this matches closely with your experimental results.

$\begin{array}{c|c|c}k&\text{Your experimental result}&\text{Exact theoretical result}\\\hline 0&.367575&\frac{1}{e}\approx0.36787944\dots\\1&.368282&\frac{1}{e}\approx0.36787944\dots\\ 2&.184036&\frac{1}{2e}\approx0.18393972\dots\\ 3&.061046&\frac{1}{3!e}\approx0.06131324\dots\\ 4&.015465&\frac{1}{4!e}\approx0.01532831\dots\\ 5&.003009&\frac{1}{5!e}\approx0.00306566\dots\\ 6&.000519&\frac{1}{6!e}\approx0.00051094\dots\\ 7&.000056&\frac{1}{7!e}\approx0.00007299\dots\\ \vdots\end{array}$

(Note: $0! = 1$ and so the formula above works for $k=0$ as well. We have in this case that $\Pr(X=0)=\Pr(X=1)=\frac{1}{e}$)

Note further that the expected number of these elements who were never chosen or were chosen exactly once etc... will simply be $n$ times their respective probability of a particular element being chosen that many times as per the linearity of expectation.

JMoravitz
  • 79,518
  • If the objects are sampled x times instead, what parameter in ((n/k)(n-1)^(n-k))/n^n is changed? I tried simulating it for x = n/2, and got 30% k = 1, 7.5% k =2, 1.26% k = 3, 0.15% k = 4. So, 4x less, 6x less, 8x less. – Johan Jul 10 '21 at 03:00
  • (((x^k)/(k!))(n-1)^(x-k))/n^x gives me the test results. ((x/k)(n-1)^(x-k))/n^x does not though. – Johan Jul 10 '21 at 04:09
  • OK. I found this, (e^(-1/x) x^(-k))/k!. By using ratios of n only, and x as the divisor. So, n/2, n/3, etc. The test results are for 1/(2 sqrt(e)), 1/(8 sqrt(e)), 1/(48 sqrt(e)), 1/(384 sqrt(e)), each 4x less, 6x less, 8x less. – Johan Jul 10 '21 at 04:24