The proof of linearity of expectation is intuitive when the random variables are independent. What is the proof when they are dependent?

Formally, $$ E(X+Y)=E(X)+E(Y)$$ where $X$ and $Y$ are dependent random variables.

The proof below assumes that $X$ and $Y$ belong to the sample space. That is, they map from the sample space to a real number line. Is that also a condition for linearity of expectation?

Proof: $$E\left(X+Y\right) =\sum\limits_{s\in S}\left(X+Y\right)\left(s\right) P\left({s}\right) $$ $$E\left(X+Y\right) =\sum\limits_{s\in S}\left(X\left(s\right)+Y\left(s\right)\right) P\left({s}\right) $$ $$E\left(X+Y\right) =\sum\limits_{s\in S} X\left(s\right)P\left({s}\right) + \sum\limits_{s\in S} Y\left(s\right)P\left({s}\right) $$ $$E\left(X+Y\right) =E\left(X\right)+E\left(Y\right)$$ Here $S$ is the sample space and $s$ is an outcome (an element of the sample space).

Reference Lecture for proof.

Also, more reasoning for step 2 would be helpful. I don't understand it completely.

1 Answer

The proof below assumes that $X$ and $Y$ belong to the sample space. That is, they map from the sample space to a real number line. Is that also a condition for linearity of expectation?

No.   It's the definition of a random variable.

Basically, any random variable $X$ is a function that maps the sample space to the reals (or a subset thereof, called the support).   $$X: \Omega \to \Bbb R$$

If $X$ and $Y$ are both random variables on the same sample space, then so is their sum, $X+Y$.   (The sum is not defined if they are not on the same sample space.)

$$ X:\Omega\to\Bbb R~\wedge~ Y:\Omega\to \Bbb R ~~\implies~~ X+Y:\Omega\to\Bbb R\\\forall s\in\Omega,\quad(X+Y)(s) := X(s)+Y(s)$$
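
As a small worked illustration (the fair-die setup is an assumption chosen for this example, not something from the question): take $\Omega=\{1,\dots,6\}$ with $X(\omega)=\omega$ and $Y(\omega)=7-\omega$, so that $Y$ is completely determined by $X$. The sum is still just a function on $\Omega$, evaluated outcome by outcome:

$$(X+Y)(\omega) = X(\omega)+Y(\omega) = \omega+(7-\omega) = 7 \quad\text{for every }\omega\in\Omega$$

Nothing in this definition refers to how $X$ and $Y$ are related; it only needs both to be defined on the same $\Omega$.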

Linearity of Expectation then follows from its definition.

$\begin{align} \mathsf E(X+Y) =&~ \sum_{\omega\in\Omega} (X+Y)(\omega)~\mathsf P(\omega) \\[1ex] =&~ \sum_{\omega\in \Omega} X(\omega)~\mathsf P(\omega)+\sum_{\omega\in \Omega} Y(\omega)~\mathsf P(\omega) \\[1ex] =&~ \mathsf E(X)+\mathsf E(Y) \end{align}$
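
The middle step uses only the pointwise definition $(X+Y)(\omega)=X(\omega)+Y(\omega)$ and then distributes $\mathsf P(\omega)$ over the sum; independence is never invoked. A minimal sketch that checks this exactly on a finite sample space, assuming a fair six-sided die with $X(\omega)=\omega$ and $Y(\omega)=\omega^2$ (the die, the helper names `expectation` and `plus`, and the choice of $Y$ are all hypothetical, picked only for illustration):

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    return w        # face value

def Y(w):
    return w * w    # square of the face value, fully determined by X

def expectation(rv):
    """E(rv) = sum over all outcomes w of rv(w) * P(w)."""
    return sum(rv(w) * P[w] for w in omega)

def plus(f, g):
    """Pointwise sum of two random variables: (f + g)(w) := f(w) + g(w)."""
    return lambda w: f(w) + g(w)

lhs = expectation(plus(X, Y))           # E(X + Y)
rhs = expectation(X) + expectation(Y)   # E(X) + E(Y)
print(lhs, rhs)

# X and Y are dependent: E(XY) differs from E(X) * E(Y).
e_xy = expectation(lambda w: X(w) * Y(w))
print(e_xy, expectation(X) * expectation(Y))
```

Exact rational arithmetic is used so that the equality of the two sides is not blurred by floating-point rounding. The last two printed values differ, which confirms that the variables are not independent even though the expectations still add.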

Of course, this is for discrete random variables.   For continuous random variables we use integration, but the argument is analogous, and not by coincidence.

$\begin{align} \mathsf E(X+Y) =&~ \int_{\Omega} (X+Y)(\omega)~\mathsf P(\mathrm d \omega) \\[1ex] =&~ \int_{\Omega} X(\omega)~\mathsf P(\mathrm d \omega)+\int_{\Omega} Y(\omega)~\mathsf P(\mathrm d \omega) \\[1ex] =&~ \mathsf E(X)+\mathsf E(Y) \end{align}$
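
A rough numerical sketch of the continuous case, assuming $X\sim N(0,1)$ and $Y=X^2$ (a dependent pair chosen here for illustration, not taken from the question). Sample averages only estimate the expectations, so this is a sanity check rather than a proof:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]  # draws of X ~ N(0, 1)
ys = [x * x for x in xs]                         # Y = X^2, dependent on X

def mean(values):
    """Sample average, used here as an estimate of the expectation."""
    return sum(values) / len(values)

mean_sum = mean([x + y for x, y in zip(xs, ys)])  # estimate of E(X + Y)
sum_of_means = mean(xs) + mean(ys)                # estimate of E(X) + E(Y)
print(mean_sum, sum_of_means)
```

The two printed numbers agree up to floating-point rounding, because the sample average is itself a finite sum and therefore linear. Both should be close to the theoretical value $\mathsf E(X)+\mathsf E(X^2)=0+1=1$.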

BCLC
  • 13,459
Graham Kemp
  • 129,094
  • 1
    $Y$ might map from a different sample space $\Omega'$, but it only makes sense to talk about $X+Y$ if they are defined on the same sample space. So yes, you do have to assume $X$ and $Y$ map from the same sample space, the only reason being $X+Y$ is not well-defined otherwise. – kccu Jun 03 '16 at 00:01
  • @kccu ahha, of course! I didn't think hard enough. Thanks! – Abhishek Bhatia Jun 03 '16 at 00:07
  • what about the $\mathbb{P}$ bit? Do they have to be the same? So, let's say we have two Gaussian distributions, where should the formula for the Gaussians be put? Are they part of $\mathbb{P}$, or are they part of $X$ and $Y$? As far as I know, they should be part of $\mathbb{P}$, and then $X$ will be something like simply the Lebesgue measure, eg effectively simply something like $\omega$? – Hugh Perkins Mar 06 '17 at 10:03
  • Oh, I think I figured it out. If we have two independent gaussians, that's modeling two independent "things" happening, and therefore we need to add all possible outcomes, of each "thing" to the sample space. And we'll need to add all possible pairs of values. So basically the outcome space, instead of being eg $\mathbb{R}$ will become eg $\mathbb{R}^2$, with one axis for each of the things we want to measure, each of the two Gaussians. Then $X$ will be a projection of $\omega$ onto the first real axis, and $Y$ will be a projection onto the second axis. $\mathbb{P}$ will be the joint prob. – Hugh Perkins Mar 06 '17 at 11:56
  • 4
    Yes, for any continuous random variables with a definite joint probability density function, $f_{X,Y}(x,y)$, the above can be written: $$\def\P{\mathop{\mathsf P}}\def\E{\mathop{\mathsf E}}\def\d{\mathop{\mathrm d}}\begin{align}\E(X+Y) ~&=~ \iint_{\Bbb R^2} (x+y)\,f_{X,Y}(x,y)\d x\d y\\[1ex] &=~ \iint_{\Bbb R^2} x\,f_{X,Y}(x,y)\d x\d y+\iint_{\Bbb R^2} y\,f_{X,Y}(x,y)\d x\d y\\[1ex] &=~ \int_\Bbb R x\,f_X(x)\d x+\int_\Bbb R y\,f_Y(y)\d y\\[1ex] &=~ \E(X)+\E(Y)\end{align}$$ – Graham Kemp Mar 06 '17 at 14:39
  • @GrahamKemp, I am a little confused about what we are integrating over. Based on the definition, isn't the first integral really $\int_{x+y}$? And does $\int_{x+y} = \int_x \int_y$? – jds Mar 18 '19 at 13:38
  • 1
    @gwg If you prefer: $$\begin{align}\mathsf E(X+Y)&=\int_\Bbb R z~f_{X+Y}(z)\mathsf d z\\&=\iint_{\Bbb R^2} z~f_{X+Y,Y}(z,y)\mathsf d z\mathsf d y\\&= \iint_{\Bbb R^2} z~f_{X,Y}(z-y,y)\mathsf d z\mathsf d y\\&=\iint_{\Bbb R^2} (x+y)~f_{X,Y}(x,y)\mathsf d x\mathsf d y\\&\vdots\\&=\mathsf E(X)+\mathsf E(Y)\end{align}$$ – Graham Kemp Mar 20 '19 at 03:44
  • @GrahamKemp Can you please explain how we go from the 2nd line to the 3rd (in the original comment)? Why does $y$ vanish from the 1st integral (and $x$ from the 2nd)? I've been trying to understand it for some time, but to no avail. –  Apr 19 '19 at 22:18
  • 5
    Law of Total Probability: $$\int_\Bbb R f_{X,Y}(x,y)~\mathrm d x = f_Y(y)$$ Also known as marginalization. – Graham Kemp Apr 20 '19 at 05:06
  • $\sum_{\omega \in \Omega}$ does not make sense if $\Omega$ is uncountably infinite, which can happen even if $X(\omega)$ is a discrete random variable (such as a Bernoulli random variable). – Michael Jun 03 '21 at 11:12
  • For anyone wondering about the continuous form, it uses the definition of the Expected Value of a continuous r.v. in terms of the Lebesgue integral, which is explained a bit more here: https://math.stackexchange.com/q/1530493/244555 – Abhishek Divekar Aug 19 '23 at 06:20