
I just learnt about conditional expectation and, as is known, the definition is as follows: [image of the definition, not reproduced]

My question is about the second property (the partial averaging property): what intuition does it express? How can I understand it in a more intuitive way? Thanks so much!

4 Answers


Well, for $A$ with positive probability, we can define $\frac{1}{P(A)}\int \limits_{A} X\,dP$ as the partial average of $X$ given that the event $A$ occurs. In other words, this quantity is our best guess for the value of $X$ if all we know is that the event $A$ has occurred.

The partial averaging property you listed is equivalent to:

$$\begin{cases} \frac{1}{P(A)}\int \limits_{A} E[X|\mathcal{G}]\,dP = \frac{1}{P(A)}\int \limits_{A} X\,dP & P(A) \neq 0 \\ \int \limits_{A} E[X|\mathcal{G}]\,dP = \int \limits_{A} X\,dP = 0 & P(A) = 0 \end{cases}.$$

Note that the second line (over probability $0$ events) is trivial. For any random variables $X$ and $Y$, if $P(A) =0$, $\int_{A} X \,dP = \int_{A} Y \,dP$, since both the left hand side and right hand side equal $0$. So, we are really interested in the first equation which is over non-zero probability events. All I'm trying to say is that the following statements are equivalent:

$$(S1) \,\,\,\,\,\, \frac{1}{P(A)}\int \limits_{A} E[X|\mathcal{G}]\,dP = \frac{1}{P(A)}\int \limits_{A} X\,dP \text{ for all events } A \in \mathcal{G} \text{ with } P(A) \neq 0$$ and $$(S2) \,\,\,\,\,\,\int \limits_{A} E[X|\mathcal{G}]\,dP = \int \limits_{A} X\,dP \text{ for all events } A \in \mathcal{G}.$$

The latter ($S2$) is typically taken as the axiom defining conditional expectation, but the former ($S1$) is the one where the intuition/interpretation comes into play. With the idea that $\frac{1}{P(A)} \int_{A} Y \,dP$ is our best guess for the random variable $Y$ over the event $A$ (i.e., given the event $A$ occurs), we can interpret the "intuitively nice" form ($S1$) from above as:

The conditional expectation of $X$ over the collection of events $\mathcal{G}$ is our best guess of the value of $X$ given the information in $\mathcal{G}$, and so should be the random variable for which our best guess of it, given any piece of information in $\mathcal{G}$ (i.e., given any event $A \in \mathcal{G}$), equals our best guess of $X$ given that same piece of $\mathcal{G}$-information. Read that sentence again, because I did in fact talk about the best guess of a best guess, which can take some getting used to. The important point is that the best guess of a random variable given the information in a $\sigma$-algebra is itself a random variable, so it makes sense to talk about the best guess of the best guess.

It can be proved from this that the $\mathCal{G}$-measurable function satisfying the partial averaging property is unique up to almost-sure equality (and intuitively it's clear that there should only be one best guess of a random variable: if there were more than one, they couldn't both be the "best"). So, that tells us there is essentially one random variable whose value is known from the information in $\mathcal{G}$ and which has the same best guess over each piece of information in $\mathcal{G}$ as the random variable $X$.
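To make the "best guess of a best guess" reading concrete, here is a minimal sketch on a finite sample space. Everything in it is an assumption chosen for illustration: the six outcomes, their probabilities, the values of `X`, and the partition `ATOMS` that generates $\mathcal{G}$. It builds $E[X|\mathcal{G}]$ as the average of $X$ over each atom, checks ($S2$) on every event of $\mathcal{G}$, and shows that the equality can fail for an event outside $\mathcal{G}$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite sample space: outcomes 0..5 with exact probabilities.
P = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10),
     3: Fraction(1, 10), 4: Fraction(2, 10), 5: Fraction(1, 10)}
X = {0: 4, 1: 1, 2: 6, 3: 2, 4: 5, 5: 3}

# G is generated by this partition, so its events are the unions of atoms.
ATOMS = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]


def integral(f, event):
    """Integral of f over `event` with respect to P (a finite sum here)."""
    return sum(f[w] * P[w] for w in event)


def prob(event):
    """Probability of `event`."""
    return sum(P[w] for w in event)


def conditional_expectation(f, atoms):
    """E[f | G] as a dict: constant on each atom, equal to that atom's average of f."""
    cond = {}
    for atom in atoms:
        average = integral(f, atom) / prob(atom)
        for w in atom:
            cond[w] = average
    return cond


def events_of(atoms):
    """Yield every event in G: all unions of atoms, including the empty set."""
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            yield frozenset().union(*chosen)


COND = conditional_expectation(X, ATOMS)

# (S2): the two integrals agree on every event of G.
partial_averaging_holds = all(
    integral(COND, A) == integral(X, A) for A in events_of(ATOMS)
)

# {0} splits the atom {0, 1}, so it is not in G and the property says nothing about it.
outside = frozenset({0})
holds_outside = integral(COND, outside) == integral(X, outside)

print(partial_averaging_holds, holds_outside)
```

Exact fractions are used so that the equalities are checked without floating-point tolerance. In this toy setting the check over $\mathcal{G}$ succeeds while the one over `{0}` does not, which is why the property is stated only for events in $\mathcal{G}$.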

layman

The "full average" is:

$$\int_{\Omega} X \,dP.$$

Since $A\subset \Omega$, if we define an increasing sequence of events $A_n$ with $A_0=A$ and $A_n \uparrow \Omega$, then (for integrable $X$):

$$\lim_{n\to\infty}\int_{A_n} X \,dP= \int_{\Omega} X \,dP.$$

So the integral over $A$ is only "partially" the average. The more intuitive discrete analog is a weighted sum $\sum w_ix_i$ with $0\leq\sum w_i<1$. It's not really an average, because you have not corrected for the weights, but it approaches the true average as you include the rest of the possible values of $X$.
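The discrete analog can be sketched numerically. The values and weights below are made up for illustration; the loop adds one outcome at a time, mimicking a growing sequence of events, and records both the raw partial sum $\sum w_i x_i$ and the version corrected by the total weight included so far.

```python
from fractions import Fraction

# Hypothetical discrete X: value x_i is taken with probability w_i (weights sum to 1).
values = [2, 5, 1, 4]
weights = [Fraction(1, 4), Fraction(1, 4), Fraction(3, 8), Fraction(1, 8)]

full_average = sum(w * x for w, x in zip(weights, values))

partial_sums = []   # sum of w_i * x_i over the outcomes included so far
corrected = []      # the same sum divided by the weight included so far
running_sum = Fraction(0)
running_weight = Fraction(0)
for w, x in zip(weights, values):
    running_sum += w * x
    running_weight += w
    partial_sums.append(running_sum)
    corrected.append(running_sum / running_weight)

print(partial_sums)
print(corrected)
print(full_average)
```

Once every outcome is included, the raw partial sum and the corrected one coincide with the full average; before that, only the corrected one is an average in the usual sense.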

  • I feel like this answer is a stretch. On the one hand, we can talk about the partial average as the actual average but restricted over an event $A$, i.e., the quantity $\frac{1}{P(A)}\int_{A} X \,dP$. On the other hand, you are explaining the "partial" average as being "partial" in a Calculus/Analysis sense of a sequence approximating a limit point. I don't know which interpretation is right, but I especially don't see the conceptual point in establishing an axiom for conditional expectation that is based on this calculus approximation. – layman Dec 25 '16 at 02:56

I realise that this answer comes quite late but maybe this helps:

This property

$$\int_A E[X |\mathcal{G}] dP = \int_A X dP$$

can be rewritten as

$$E[1_AE[X |\mathcal{G}]]= E[1_A X]$$

where $A \in \mathcal{G}$. Taking $A=\Omega$ gives the standard rule for iterated expectations: $$E[E[X |\mathcal{G}]]= E[X].$$
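As a quick check of the indicator form, here is a small sketch with a fair die. The setup is an assumption for illustration: $X$ is the face value and $\mathcal{G}$ is generated by the parity of the roll, so $E[X|\mathcal{G}]$ is the average face value among rolls of the same parity.

```python
from fractions import Fraction

OMEGA = range(1, 7)    # faces of a fair die (illustrative choice)
P = Fraction(1, 6)     # probability of each face


def expectation(f):
    """E[f] for a function f of the outcome."""
    return sum(f(w) * P for w in OMEGA)


def X(w):
    """The random variable: the face value itself."""
    return w


def cond_exp(w):
    """E[X | G](w) with G generated by parity: average of X over faces with w's parity."""
    same_parity = [v for v in OMEGA if v % 2 == w % 2]
    return Fraction(sum(same_parity), len(same_parity))


def indicator(event):
    """Return the indicator function 1_A of `event`."""
    return lambda w: 1 if w in event else 0


one_odd = indicator({1, 3, 5})    # A = "the roll is odd", an event in G

lhs = expectation(lambda w: one_odd(w) * cond_exp(w))   # E[1_A E[X|G]]
rhs = expectation(lambda w: one_odd(w) * X(w))          # E[1_A X]

tower_lhs = expectation(cond_exp)   # A = Omega: E[E[X|G]]
tower_rhs = expectation(X)          # E[X]

print(lhs, rhs, tower_lhs, tower_rhs)
```

Both pairs agree here: the first pair is the partial averaging property on the event "odd", and the second is the iterated-expectations special case.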

Rainymood

Let $A = \{w_1,w_2,w_3\}$ be a set in $G$ that cannot be split into smaller nonempty sets of $G$.

$E[X|G]$ is a random variable that is measurable with respect to $G$, so it takes a single value on $A$.

In simplest terms, on such a set $A$ in $G$, you know the value of $$E[X|G] = \frac{ X(w_1) * P(w_1) + X(w_2) * P(w_2) + X(w_3) * P(w_3)}{ P(w_1) + P(w_2) + P(w_3)}$$

i.e., the expected value of $X$, given that the event $A$ has occurred, is the sum of the products $X(w)*P(w)$ over $w \in A$, divided by $P(A)$ (the probability of the event $A$ happening).

Now the integral over $A$ of this $E[X|G]$, which has one value on the set $A$, is that value multiplied by $P(A)$:

$$\frac{ X(w_1) * P(w_1) + X(w_2) * P(w_2) + X(w_3) * P(w_3) } { P(w_1) + P(w_2) + P(w_3)} * (P(w_1) + P(w_2) + P(w_3))$$

which is nothing but

$$X(w_1) * P(w_1) + X(w_2) * P(w_2) + X(w_3) * P(w_3)$$

nmasanta
  • 9,222
Sanjay
  • 21