
I am reading a book in which the author conditions both sides of an identity for unconditional probabilities and states, without further proof, that the identity still holds after conditioning. I am wondering whether this is always true.


Step 1: Assume we have the identity:

$$\mathbb{P}(f(A_1,A_2,\ldots, A_n))=F(\mathbb{P}(f_1(A_1,A_2,\ldots, A_n)),\mathbb{P}(f_2(A_1,A_2,\ldots, A_n)),\ldots,\mathbb{P}(f_m(A_1,A_2,\ldots, A_n)))$$

where $A_i$, $i=1,2,\ldots,n$, are events; $f,f_1,\ldots,f_m$ are functions on sets built from the set operations $\cup$, $\cap$, $\setminus$; and $F$ is a function defined on $[0,1]^m$. If I condition both sides on an event $X$ with $\mathbb{P}(X)>0$, do I have:

$$\mathbb{P}(f(A_1,A_2,\ldots, A_n)\mid X)=F(\mathbb{P}(f_1(A_1,A_2,\ldots, A_n)\mid X),\mathbb{P}(f_2(A_1,A_2,\ldots, A_n)\mid X),\ldots,\mathbb{P}(f_m(A_1,A_2,\ldots, A_n)\mid X))$$


Step 2: Sometimes the original equation is already conditional, as below:

$$\mathbb{P}(f(A_1,A_2,\ldots, A_n)\mid g(A_1,A_2,\ldots, A_n))=F(\mathbb{P}(f_1(A_1,A_2,\ldots, A_n)\mid g_1(A_1,A_2,\ldots, A_n)),\mathbb{P}(f_2(A_1,A_2,\ldots, A_n)\mid g_2(A_1,A_2,\ldots, A_n)),\ldots,\mathbb{P}(f_m(A_1,A_2,\ldots, A_n)\mid g_m(A_1,A_2,\ldots, A_n)))\tag{*}$$

then do I have the (further) conditional version:

$$\mathbb{P}(f(A_1,A_2,\ldots, A_n)\mid g(A_1,A_2,\ldots, A_n)\cap X)=F(\mathbb{P}(f_1(A_1,A_2,\ldots, A_n)\mid g_1(A_1,A_2,\ldots, A_n)\cap X),\mathbb{P}(f_2(A_1,A_2,\ldots, A_n)\mid g_2(A_1,A_2,\ldots, A_n)\cap X),\ldots,\mathbb{P}(f_m(A_1,A_2,\ldots, A_n)\mid g_m(A_1,A_2,\ldots, A_n)\cap X))$$

If any term of the original equation (*) is not conditional, simply replace $\mathbb{P}(f_i(\ldots))$ with $\mathbb{P}(f_i(\ldots)\mid X)$.


Step 3: Conversely, if the "conditional version" holds, does the "unconditional version" follow?


If the above statements hold, I can basically insert or remove conditioning at will whenever I have an equation at hand, e.g.,

  1. Since $\mathbb{P}(A\setminus B)=\mathbb{P}(A)-\mathbb{P}(A\cap B)$, we have $\mathbb{P}(A\setminus B\mid X)=\mathbb{P}(A\mid X)-\mathbb{P}(A\cap B\mid X)$;

  2. Since $\mathbb{P}(A)=\mathbb{P}(B_1)\mathbb{P}(A\mid B_1)+\mathbb{P}(B_2)\mathbb{P}(A\mid B_2)$ where $B_1$ and $B_2$ form a partition of the sample space, we have $\mathbb{P}(A\mid X)=\mathbb{P}(B_1\mid X)\mathbb{P}(A\mid B_1 \cap X)+\mathbb{P}(B_2\mid X)\mathbb{P}(A\mid B_2 \cap X)$;

  3. Since $\mathbb{P}(A\mid B\cap C) = \frac{\mathbb{P}(A\cap B\mid C)}{\mathbb{P}(B\mid C)}$, we have $\mathbb{P}(A\mid B)=\frac{\mathbb{P}(A\cap B)}{\mathbb{P}(B)}$.
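
As a sanity check (not a proof), the three examples above can be tested exhaustively on a small finite probability space. The sketch below is only an illustration under stated assumptions: the 4-point sample space, the weights, and the helper names `prob`, `cond`, `all_events`, and `check_identities` are all hypothetical choices, not taken from the book. Exact rational arithmetic is used so that equality tests are meaningful.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical 4-point sample space with non-uniform, strictly positive weights.
OMEGA = frozenset(range(4))
WEIGHT = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}


def prob(event):
    """P(event) as an exact fraction."""
    return sum((WEIGHT[w] for w in event), Fraction(0))


def cond(event, given):
    """P(event | given); the caller must ensure P(given) > 0."""
    return prob(event & given) / prob(given)


def all_events():
    """Every subset of OMEGA, as frozensets."""
    items = sorted(OMEGA)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def check_identities():
    """Check examples 1-3 for every (A, B, X) with P(X) > 0; return the number of triples."""
    events = all_events()
    checked = 0
    for X in events:
        if prob(X) == 0:
            continue
        for A in events:
            for B in events:
                # Example 1: P(A \ B | X) = P(A | X) - P(A ∩ B | X)
                assert cond(A - B, X) == cond(A, X) - cond(A & B, X)
                # Example 3 (with C = X): P(A | B ∩ X) = P(A ∩ B | X) / P(B | X)
                if prob(B & X) > 0:
                    assert cond(A, B & X) == cond(A & B, X) / cond(B, X)
                # Example 2 with the partition B1 = B, B2 = complement of B
                B2 = OMEGA - B
                if prob(B & X) > 0 and prob(B2 & X) > 0:
                    total = cond(B, X) * cond(A, B & X) + cond(B2, X) * cond(A, B2 & X)
                    assert cond(A, X) == total
                checked += 1
    return checked


# Taking X = OMEGA recovers the unconditional probabilities (relevant to Step 3).
for A in all_events():
    assert cond(A, OMEGA) == prob(A)

checked = check_identities()
```

Passing these assertions only shows the identities hold on this one toy space; it says nothing about the general claim. The last loop illustrates why the reverse direction is the easy one when the conditional identity is assumed for every admissible $X$: choosing $X=\Omega$ gives back the unconditional statement.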

There are many posts (like this and this) proving that the above statement is correct for a specific equation, but I haven't found a post handling the issue in a general way.

Nicholas
  • A conditional distribution is still a probability distribution. So any property of unconditional probabilities $\mathbb{P}(\cdot)$ holds for conditional probabilities $\mathbb{P}(\cdot \mid X)$. – angryavian Mar 29 '19 at 17:14
  • @angryavian So is this enough as a rigorous proof? Does it use anything like the three axioms of probability? I don't find it enough myself, lol. Also, what about the other direction (as I mentioned in Step 3)? – Nicholas Mar 29 '19 at 17:18
  • @angryavian I added a bounty to the question. Please consider writing an answer, if you like. – S.H.W May 05 '23 at 16:58
  • It is not very clear what you are asking, but suppose the authors prove the following: whatever the probability measure $\mu$ might be, the identity [X] holds. Then you can apply the identity [X] when $\mu = \mathbf{P}$ and when $\mu(\cdot) = \mathbf{P}(\cdot \mid E)$, where $E$ is any event. If, however, the authors prove that identity [X] holds for some specific probability measure $\mu_0$, then you will need to reprove [X] for any other probability measure. (This comment is essentially the same as @angryavian's.) – William M. May 05 '23 at 17:20
  • @WilliamM. How can we prove the claim? By claim I mean: "If the identity [X] holds for $\mu = \mathbf{P}$ then it holds for $\mu(\cdot) = \mathbf{P}(\cdot \mid E)$ where $E$ is an event." – S.H.W May 05 '23 at 17:41
  • Also it's well known that independence of two events does not imply that they are conditionally independent given a third event. It seems to me that the claim doesn't hold in this case. – S.H.W May 05 '23 at 17:42
  • I think it will depend on what [X] is because, as @S.H.W pointed out, some identities hold for $\mathbf{P}$ but don't hold for (some of) the conditional probability measures. – William M. May 05 '23 at 17:51
  • @WilliamM. Thanks. I would appreciate if you could post an answer so I can award the bounty. Specifically, I'm looking for identities that hold for $\mathbf{P}$ and $\mu(\cdot) = \mathbf{P}(\cdot \mid E)$. – S.H.W May 05 '23 at 19:49
  • Do you assume that the equality holds for some given events $A_1,\ldots,A_n$ or for all events? – Christophe Leuridan May 10 '23 at 16:35
  • @ChristopheLeuridan The equality holds for all events. Another example is from information theory. We have the chain rule $H(X,Y) = H(X) + H(Y|X)$. Then, conditioned on $Z$, we have $H(X,Y|Z) = H(X|Z) + H(Y|X,Z)$. There are many uses of this kind of conditioning on both sides of an identity. Surprisingly, conditioning doesn't work for independent RVs. – S.H.W May 10 '23 at 19:42
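
The independence remark in the comments can be made concrete with a small example. The sketch below is an illustration only: the two-coin sample space, the events `A`, `B`, `X`, and the helpers `prob` and `cond` are hypothetical choices. It shows an equation, $\mathbb{P}(A\cap B)=\mathbb{P}(A)\mathbb{P}(B)$, that holds for these particular events under $\mathbf{P}$ but fails under $\mathbf{P}(\cdot\mid X)$.

```python
from fractions import Fraction
from itertools import product

# Two fair coin flips; every outcome has probability 1/4.
OMEGA = frozenset(product("HT", repeat=2))


def prob(event):
    """P(event) under the uniform measure on OMEGA."""
    return Fraction(len(event), len(OMEGA))


def cond(event, given):
    """P(event | given); the caller must ensure P(given) > 0."""
    return prob(event & given) / prob(given)


A = frozenset(w for w in OMEGA if w[0] == "H")  # first flip is heads
B = frozenset(w for w in OMEGA if w[1] == "H")  # second flip is heads
X = A | B                                       # at least one head

# Independence under P: 1/4 == 1/2 * 1/2
independent = prob(A & B) == prob(A) * prob(B)

# Conditional independence given X fails: 1/3 != 2/3 * 2/3
cond_independent = cond(A & B, X) == cond(A, X) * cond(B, X)
```

This is consistent with the distinction drawn in the comments: independence is an assumption about specific events under one specific measure, not an identity valid for every probability measure and all events, so there is no reason to expect it to survive conditioning.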
