
Let $X \sim Bin(n, p)$ and $Y \sim Bin(m, p)$.

How is

$$Z_1 = \frac{X}{n} - \frac{Y}{m}$$

and

$$Z_2 = \left|\frac{X}{n} - \frac{Y}{m}\right|$$

distributed? (That is: what is their cumulative distribution function?)

Background

I am mainly interested in this question, and I hope that answering the question here will help me with the other one.

My thoughts

$Z_1$ takes values in $[-1, 1]$ and $Z_2 = |Z_1|$ takes values in $[0, 1]$. Both are discrete random variables.

\begin{align}
\mathbb{E} (Z_1) &= \mathbb{E}\left(\frac{X}{n} - \frac{Y}{m}\right) \\
&= \frac{1}{n} \mathbb{E}(X) - \frac{1}{m} \mathbb{E}(Y) \\
&= p - p \\
&= 0\\
F_X(x) &= P(X\le x) = \sum_{k=0}^{\lfloor x \rfloor}\binom nk p^k (1-p)^{n-k}\\
F_{X/n}(x) &= P\left(\frac{X}{n} \leq x\right)\\
&= P(X \leq nx)\\
&= \sum_{k=0}^{\lfloor nx \rfloor}\binom nk p^k (1-p)^{n-k}\\
F_{Z_1}(x) &= P\left(\frac{X}{n} - \frac{Y}{m} \leq x\right)\\
&= ?
\end{align}
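For small $n$ and $m$ the open last step can at least be checked by brute force. The following is only a minimal sketch, and it assumes that $X$ and $Y$ are independent, which the question does not state. Under that assumption $P(X=i, Y=j) = P(X=i)\,P(Y=j)$, so $F_{Z_1}(x)$ is the sum of these products over all pairs $(i, j)$ with $i/n - j/m \le x$, and the distribution of $Z_2$ follows by folding the one of $Z_1$ at zero. The function names are made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pmf_z1(n, m, p):
    """Exact pmf of Z1 = X/n - Y/m, ASSUMING X and Y are independent."""
    pmf = defaultdict(float)
    for i in range(n + 1):
        for j in range(m + 1):
            # Exact rational support point, so equal values are merged
            # without floating-point noise.
            z = Fraction(i, n) - Fraction(j, m)
            pmf[z] += binom_pmf(i, n, p) * binom_pmf(j, m, p)
    return dict(pmf)


def pmf_abs(pmf):
    """pmf of |Z|, obtained by folding the pmf of Z at zero."""
    folded = defaultdict(float)
    for z, prob in pmf.items():
        folded[abs(z)] += prob
    return dict(folded)


def cdf(pmf, x):
    """P(Z <= x) for a discrete pmf given as {value: probability}."""
    return sum(prob for z, prob in pmf.items() if z <= x)


# Small illustrative parameters (arbitrary choice).
pmf1 = pmf_z1(n=3, m=2, p=0.5)
pmf2 = pmf_abs(pmf1)
print(sorted(pmf1))     # support of Z1
print(cdf(pmf1, 0))     # F_{Z1}(0)
print(cdf(pmf2, 0.5))   # F_{Z2}(0.5)
```

The double loop costs $O(nm)$ operations, so this is meant as a check for moderate sizes, not as a closed form.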

Is there perhaps some continuous approximation of those discrete distributions?

M.Mass
Martin Thoma
  • Did you consider an approximation via normal distribution? In this case $X/n\approx N(p,p(1-p)/n)$. In case of independence $Z_1\approx N(0,p(1-p)/n+p(1-p)/m)$. – Michael Hoppe Dec 10 '17 at 13:21
  • @MichaelHoppe No, not so far. But with n and m >= 50 it should be applicable, right? – Martin Thoma Dec 11 '17 at 15:14
  • @MartinThoma As long as the variances are more than $9$, see https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation – Michael Hoppe Dec 11 '17 at 18:55
  • As $\operatorname{Var}(X) = n \cdot p \cdot (1-p)$, for $n=50$ I would need $p$ in about $[0.235, 0.765]$. For $n=100$ I would need $p$ to be in $[0.10, 0.90]$. Right? – Martin Thoma Dec 12 '17 at 10:05
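The normal approximation and the variance rule of thumb from the comments can be tried out numerically. This is a rough sketch under stated assumptions: $X$ and $Y$ are taken to be independent, no continuity correction is applied, and all function names are hypothetical. It compares the exact $F_{Z_1}$ with the $N(0, p(1-p)/n + p(1-p)/m)$ approximation and solves $n\,p\,(1-p) \ge 9$ for $p$. For $Z_2 = |Z_1|$ the same approximation gives $P(Z_2 \le x) \approx 2\Phi(x/\sigma) - 1$ for $x \ge 0$.

```python
from fractions import Fraction
from math import comb, sqrt
from statistics import NormalDist


def exact_cdf_z1(x, n, m, p):
    """Exact P(X/n - Y/m <= x), ASSUMING X and Y are independent."""
    px = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    py = [comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(m + 1)]
    return sum(
        px[i] * py[j]
        for i in range(n + 1)
        for j in range(m + 1)
        if Fraction(i, n) - Fraction(j, m) <= x
    )


def normal_cdf_z1(x, n, m, p):
    """Approximation Z1 ~ N(0, p(1-p)/n + p(1-p)/m), no continuity correction."""
    sigma = sqrt(p * (1 - p) * (1 / n + 1 / m))
    return NormalDist(mu=0.0, sigma=sigma).cdf(x)


def normal_cdf_z2(x, n, m, p):
    """Approximate P(|Z1| <= x) as 2 * Phi(x / sigma) - 1 for x >= 0."""
    if x < 0:
        return 0.0
    return 2 * normal_cdf_z1(x, n, m, p) - 1


def p_interval(n, min_var=9.0):
    """Interval of p with n * p * (1 - p) >= min_var, or None if empty."""
    disc = 1 - 4 * min_var / n
    if disc < 0:
        return None
    r = sqrt(disc)
    return ((1 - r) / 2, (1 + r) / 2)


# Illustrative comparison for n = m = 50, p = 0.5 (arbitrary choice).
for x in (-0.1, 0.0, 0.1):
    print(x, exact_cdf_z1(x, 50, 50, 0.5), normal_cdf_z1(x, 50, 50, 0.5))

print(p_interval(50))
print(p_interval(100))
```

At the endpoints of the returned interval the variance equals the threshold exactly, so a strict "more than $9$" excludes the endpoints themselves.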
