0

Considering this discrete function:

$$f(n)=\frac{N!}{n!(N-n)!}p^n q^{N-n}$$

Where $n \lt N\ $, $\ p+q=1\ $, and $\ N,n,p,q\gt0$.

Is

$$|f(n+1)-f(n)| \ll f(n)$$

the condition which allows the approximation of $f(n)$ with a continuous function $h(n)$ when n is very big?

If yes (which I think is the case),

$$f(n)=\frac{N!}{n!(N-n)!}p^n q^{N-n}$$ $$f(n+1)=\frac{N!}{(n+1)!(N-n-1)!}p^{n+1} q^{N-n-1}$$ $$=\frac{N!(N-n-1)}{(n+1)n!(N-n)!}p^n q^{N-n} \frac{p}{q}$$ $$=f(n)\frac{N-n-1}{n+1}\frac{p}{q}$$ $$\implies f(n)\left|\frac{(N-n-1)p}{(n+1)q}\right|\ll f(n)$$ If $n$ is big, $$\implies \left|\frac{(N-n)p}{nq}\right|\ll 1$$ $$(N-n)p\ll nq$$ $$Np\ll n$$

But we have $n<N$, this is a contradiction.

But, the function described above, in the limit of very big $n$ gets like the Gaussian bell function. And CAN be approximated with a continuous function (which is like $e^{-x^2}$).

What am I doing wrong?

AHB
  • 1,469
  • "What am I doing wrong?" The trouble might lie in some sloppiness about the relevant result, which is not that $$f_N(n)\sim h(n)$$ for some continuous function $h$, but rather $$f_N\left(\left\lfloor Np+x\sqrt{Np(1-p)}\right\rfloor\right)\to h(x)$$ when $N\to\infty$, for every fixed $x$. – Did Jul 30 '17 at 08:49
  • @Did I couldn't understand the second math line. :-confused – AHB Jul 30 '17 at 09:21

1 Answers1

1

You have $f(n+1) = f(n) \frac{(N-n-1)p}{(n+1)q}$ correctly. So

$$|f(n) - f(n+1)| = \left|1 - \frac{(N-n-1)p}{(n+1)q}\right|f(n)$$

So you don't want to show that $\frac{(N-n-1)p}{(n+1)q} \ll 1$, you want to show that $\left|1 - \frac{(N-n-1)p}{(n+1)q}\right| \ll 1$, which is quite different. Indeed, if we replace $n+1$ with $n$:

$$1 - \frac{(N-n)p}{nq} = \frac{nq + np - Np}{nq} $$ and we want this to be approximately 0, we must have $$\frac{nq + np - Np}{nq} = \epsilon \implies n(q+p) - Np = n \epsilon q$$ $$\implies n = \frac{Np}{q+p+\epsilon q}$$ So, your desired property only holds in the vicinity of $ n \approx N\frac{p}{q+p}$! But this also happens to be the peak of the distribution.

So the bottom line is that, near the peak, it can be well approximated by a continuous function; far away from the peak, the distribution is actually dropping a factor of 2 or 10 or 100 from one point to the next, so integrating a continuous function in that region is pretty inaccurate. But since your $f(n)$ is mostly useful as a probability distribution, nearly all of the distribution is near the peak (which actually gets very very narrow for large $N$), so in all of the region that actually contributes to any expectation values or medians, it is well-approximated by a continuous function.

Alex Meiburg
  • 2,503
  • 11
  • 21
  • what a silly mistake I did! Thank you – AHB Jul 30 '17 at 09:13
  • 2
    hmmm. Why didn't you use $np + nq=n$? Was it deliberate? – AHB Jul 30 '17 at 09:19
  • Ah, forgot that fact. :) Yes, it could simplify it a bit more, silly me! – Alex Meiburg Jul 30 '17 at 09:50
  • I plotted a list of ordered pairs $(n,f(n))$. When I set $N=10000$ for example, the peak almost disappears! and the bell more looks like a continuous function in the two sides rather than the peak! Again what am I doing wrong? – AHB Jul 30 '17 at 10:27
  • That is correct behavior. As I describe towards the end, it only satisfies the criterion you give near the peak. Try putting the range n=90...100 when p=0.5, N=1000. You will see that indeed it is not 'smooth' the way you want. But when drawing the whole curve, the peak is so high that it squashes everything else down and makes it look smooth. The peak shrinks in relative width as N grows, yes, but it's width is approximately sqrt(N): so it does, in fact, get wider in absolute width, and so it gets smoother and smoother. – Alex Meiburg Jul 30 '17 at 13:57