
Given $(X_1,X_2,X_3)\sim\operatorname{Multinomial}(n;\theta_1,\theta_2,\theta_3)$, what is the conditional distribution of $X_2$ given that $X_1=x_1$?

My thoughts are:

$$ P(X_2=x_2\mid X_1=x_1)={n-x_1\choose x_2}\theta_2^{x_2}(1-\theta_2)^{n-x_1-x_2} $$

Can someone tell me if this is correct?

EggHead
  • "X1,X2,X3 are independent and identically distributed since they are random variables in a Multinomial Distribution" Well... no. Have a look at http://en.wikipedia.org/wiki/Multinomial_distribution. – Did Oct 21 '13 at 19:58
  • Did, wikipedia said that the sampling is with replacement, so wouldn't that mean that the random variables are independent? – EggHead Oct 21 '13 at 20:22
  • As well, if the random variables were not independent, would the distribution be hyper-geometric instead of Multinomial? – EggHead Oct 21 '13 at 20:24
  • Yes, the sampling is with replacement, but one samples a fixed number of times, hence $X_1+X_2+X_3=n$ almost surely, which prevents $X_1$, $X_2$, $X_3$ from being independent. – Did Oct 22 '13 at 08:57
  • Did, great! I've edited the question based on your last comment. Could you tell me if my answer is now correct? – EggHead Oct 22 '13 at 16:16
  • Could you explain "your thoughts", that is, how you arrive at this formula? – Did Oct 22 '13 at 17:45
  • My understanding is that in a multinomial process, we are making n selections from a box of m > n objects. There are $\theta_1, \theta_2, \theta_3$ of objects $X_1, X_2, X_3$ respectively in the box. Since $X_1=x_1$ we have $n-x_1$ selections left to get $x_2$ objects of $X_2$. Since $X_2$ is binomial, I just applied the Binomial distribution formula. – EggHead Oct 22 '13 at 19:26
  • The mystery is why you do not apply the mathematical definition instead of relying on vague analogies with boxes, objects, and so on. – Did Oct 23 '13 at 06:21

1 Answer


Disclaimer: This answer just received a downvote, nearly exactly three years after it was posted. Maybe one should consider this vote as a kind of birthday present? Anyway, the answer remains correct mathematically and it still addresses the question, naturally. Happy reading!

By definition, the parameters $(\theta_1,\theta_2,\theta_3)$ are nonnegative and sum to $1$, the support of the multinomial distribution is the set of the triples $(x_1,x_2,x_3)$ in $\mathbb N_0^3$ which sum to $n$, and, for every such triple $(x_1,x_2,x_3)$, $$ P[X_1=x_1,X_2=x_2,X_3=x_3]={n\choose x_1,x_2,x_3}\theta_1^{x_1}\theta_2^{x_2}\theta_3^{x_3}. $$ Thus, $x_3=n-x_1-x_2$ and, for every nonnegative $x_1$ and $x_2$ such that $x_1+x_2\leqslant n$, $$ P[X_1=x_1,X_2=x_2]\propto\frac{\theta_2^{x_2}\theta_3^{n-x_1-x_2}}{x_2!(n-x_1-x_2)!}, $$ up to multiplicative factors independent of $x_2$. Considering $$ k=n-x_1,\qquad p=\frac{\theta_2}{\theta_2+\theta_3}, $$ and noting that $k!$ and $(\theta_2+\theta_3)^k$ do not depend on $x_2$, this yields $$ P[X_2=x_2\mid X_1=x_1]\propto{k\choose x_2}p^{x_2}(1-p)^{k-x_2}. $$ The sum of the RHS over $x_2$ from $0$ to $k$ is $1$, hence the last $\propto$ is actually an equality sign.
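
As a cross-check, the same result can be reached directly from the definition of conditional probability. This is only a sketch: it assumes $P[X_1=x_1]>0$ (in particular $\theta_2+\theta_3>0$) and uses the standard fact that the marginal distribution of $X_1$ is binomial with parameters $(n,\theta_1)$, which is not derived here. With $k=n-x_1$ and $p=\theta_2/(\theta_2+\theta_3)$ as above,
$$
\begin{aligned}
P[X_2=x_2\mid X_1=x_1]
&=\frac{P[X_1=x_1,X_2=x_2]}{P[X_1=x_1]}
=\frac{\dfrac{n!}{x_1!\,x_2!\,(k-x_2)!}\,\theta_1^{x_1}\theta_2^{x_2}\theta_3^{k-x_2}}{\dfrac{n!}{x_1!\,k!}\,\theta_1^{x_1}(\theta_2+\theta_3)^{k}}\\
&=\frac{k!}{x_2!\,(k-x_2)!}\cdot\frac{\theta_2^{x_2}\theta_3^{k-x_2}}{(\theta_2+\theta_3)^{k}}
={k\choose x_2}p^{x_2}(1-p)^{k-x_2}.
\end{aligned}
$$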

For every integer $x_1$ between $0$ and $n$, the distribution of $X_2$ conditionally on $X_1=x_1$ is binomial with parameters $\left(n-x_1,\frac{\theta_2}{\theta_2+\theta_3}\right)$.
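
For readers who want to check this numerically, here is a minimal Python sketch. The function names and the example values ($n=10$, $x_1=3$, $\theta=(0.2,0.3,0.5)$) are illustrative assumptions, not part of the question. It computes the conditional probabilities straight from the multinomial joint pmf and compares them with the binomial $\left(n-x_1,\frac{\theta_2}{\theta_2+\theta_3}\right)$ pmf.

```python
from math import comb, factorial, isclose


def multinomial_pmf(xs, thetas):
    """Joint pmf P[X_1=x_1, ..., X_m=x_m] with n = sum(xs) trials."""
    n = sum(xs)
    coeff = factorial(n)
    for x in xs:
        coeff //= factorial(x)  # stays an integer at every step
    prob = float(coeff)
    for x, t in zip(xs, thetas):
        prob *= t ** x
    return prob


def conditional_pmf_x2(x2, x1, n, thetas):
    """P[X_2=x2 | X_1=x1], computed from the definition joint / marginal."""
    k = n - x1
    joint = multinomial_pmf((x1, x2, k - x2), thetas)
    # Marginal P[X_1=x1]: sum the joint pmf over all admissible values of X_2.
    marginal = sum(multinomial_pmf((x1, j, k - j), thetas) for j in range(k + 1))
    return joint / marginal


def binomial_pmf(x, k, p):
    """Binomial(k, p) pmf evaluated at x."""
    return comb(k, x) * p ** x * (1 - p) ** (k - x)


# Illustrative values (assumptions chosen for this check only).
n, x1 = 10, 3
thetas = (0.2, 0.3, 0.5)
k = n - x1
p = thetas[1] / (thetas[1] + thetas[2])

for x2 in range(k + 1):
    direct = conditional_pmf_x2(x2, x1, n, thetas)
    claimed = binomial_pmf(x2, k, p)
    assert isclose(direct, claimed)
```

Replacing `p` by `thetas[1]` in the comparison, which corresponds to the formula proposed in the question, makes the assertion fail for these values, since the success probability must be renormalised by $\theta_2+\theta_3$.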

Did
  • In your second equation you reference $\theta_3$ in the numerator. Could this instead be replaced with $1-\theta_1 - \theta_2$ so that it’s just in terms of $X_1$ and $X_2$? – Wil Mar 08 '21 at 03:05