33

I wish to use the computational formula for the variance to calculate the variance of a normally distributed random variable. For this, I need the expected value of $X$ as well as that of $X^2$. Intuitively, I would have assumed that $E(X^2)$ is always equal to $E(X)^2$. In fact, I cannot imagine how they could be different.

Could you explain how this is possible, e.g. with an example?

Nova
  • 447
  • 17
    If two random variables $X, Y$ are independent, then it is indeed true that $\mathbb{E}(XY) = \mathbb{E}(X) \mathbb{E}(Y)$. But $X$ is as far as possible from being independent of itself! – Qiaochu Yuan May 25 '12 at 17:44
  • 5
    Just take a simple example: if we have $1$ and $2$ as being equally probable, then $\left(\frac{1+2}{2}\right)^2 = \frac{9}{4}$ but $\frac{1^2 + 2^2}{2} = \frac{5}{2}$. – BlueRaja - Danny Pflughoeft May 25 '12 at 20:50
  • 15
    No offense, but if you tried anything at all (even just blindly guessing a distribution) and computed $E(X^2)$ and $E(X)^2$, you would almost surely have found an example on your own. Experimentation is a very useful tool in mathematics. –  May 26 '12 at 00:28
  • Jensen's inequality. See http://mathoverflow.net/questions/47258/when-is-the-function-of-a-median-closer-to-the-median-of-the-function-than-the-me – David LeBauer May 26 '12 at 22:55
  • 1
    @David: No, Cauchy-Schwarz. (Besides, indicating this MO page as a reference for Cauchy-Schwarz or for Jensen is, at best, a joke.) – Did Jul 25 '12 at 23:51
  • $\int f^2\ne(\int f)^2$ – Martín-Blas Pérez Pinilla Feb 04 '16 at 12:29

10 Answers

33

Assume $X$ is a random variable that is 0 half the time and 1 half the time. Then $$EX = 0.5 \times 0 + 0.5 \times 1 = 0.5$$ so that $$(EX)^2 = 0.25,$$ whereas on the other hand $$E(X^2) = 0.5 \times 0^2 + 0.5 \times 1^2 = 0.5.$$ By the way, since $Var(X) = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x)$, the only way the variance could ever be 0 in the discrete case is when $X$ is constant.
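
The same arithmetic can be checked mechanically. Below is a minimal Python sketch; the helper name `expectation` and the dictionary representation of the distribution are illustrative choices, not part of the original argument.

```python
from fractions import Fraction


def expectation(pmf, g=lambda x: x):
    """Return E[g(X)] for a finite distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())


# X is 0 half the time and 1 half the time.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 2)}

mean = expectation(pmf)                             # E(X)
square_of_mean = mean ** 2                          # (EX)^2
mean_of_square = expectation(pmf, lambda x: x * x)  # E(X^2)

print(mean, square_of_mean, mean_of_square)  # prints: 1/2 1/4 1/2
```

Using exact fractions rather than floats keeps the comparison free of rounding noise.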

GeoffDS
  • 11,270
27

Let $EX=\mu$ and $E(X-\mu)^2=\sigma^2$, then

$$\begin{aligned}
EX^2 &= E[(X-\mu)+\mu]^2 \\
&= E(X-\mu)^2+2E[(X-\mu)\mu]+E(\mu^2) \\
&= \sigma^2+2\mu E(X-\mu)+\mu^2 \\
&= \sigma^2+\mu^2
\end{aligned}$$

So $EX^2 =\sigma^2+\mu^2$, no matter the distribution, and $EX^2\ne(EX)^2$ unless the variance equals zero.
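
As a sanity check, the identity can be verified exactly on a small discrete distribution. This is only a sketch: the values and probabilities below are made up for illustration, and any finite distribution would do.

```python
from fractions import Fraction

# Hypothetical finite distribution: values and their probabilities.
values = [-2, 0, 1, 5]
probs = [Fraction(1, 10), Fraction(2, 5), Fraction(3, 10), Fraction(1, 5)]
assert sum(probs) == 1

pairs = list(zip(values, probs))
mu = sum(x * p for x, p in pairs)                  # EX
sigma2 = sum((x - mu) ** 2 * p for x, p in pairs)  # E(X - mu)^2
second_moment = sum(x * x * p for x, p in pairs)   # EX^2

# The gap between EX^2 and (EX)^2 is exactly the variance.
print(second_moment == sigma2 + mu ** 2)  # prints: True
print(second_moment - mu ** 2 == sigma2)  # prints: True
```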

Guile
  • 371
10

One is an average of squares, the other a square of an average. In general, when you reverse two procedures (mix cookies, bake cookies), you have no right to expect the same outcome.

ncmathsadist
  • 49,383
9

Note that your logic applied to a uniform distribution would give that $$(x_1+x_2+\cdots+x_n)^2=n({x_1}^2+{x_2}^2+\cdots+{x_n}^2)$$ which is clearly not true in general.
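
A quick numerical check of the two sides, as a minimal sketch (the function name `both_sides` is just an illustrative choice):

```python
def both_sides(xs):
    """Return ((x_1+...+x_n)^2, n*(x_1^2+...+x_n^2)) for a list of numbers."""
    n = len(xs)
    return sum(xs) ** 2, n * sum(x * x for x in xs)


print(both_sides([1, 2, 3]))  # prints: (36, 42)
print(both_sides([4, 4, 4]))  # prints: (144, 144)
```

The two sides agree when all the $x_i$ are equal, which matches the fact that $E(X^2)=E(X)^2$ holds for a constant $X$.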

  • 1
    What are the dots for? – SBF May 25 '12 at 20:54
  • whoops. Meant to include them in both as an ellipsis. Fixed. – Robert Mastragostino May 25 '12 at 23:57
  • The usual style is $x+y+z+\cdots$, FYI. –  May 26 '12 at 01:07
  • Thanks for the tip, Rahul! – Robert Mastragostino May 26 '12 at 01:57
  • 3
    This (highly upvoted) explanation is seductive but it seems to miss the point: what $E(X^2)=E(X)^2$ would imply is something more like the square of the sum being the sum of the squares times the number of terms (not simply the sum of the squares). Another way to see it is that $E(X^2)=E(X)^2$ is true when $X$ is constant while the square of the sum is not the sum of the squares, even when the arguments are all equal. – Did May 26 '12 at 07:43
  • Robert: Coming from somebody who wants to learn math properly, to leave this answer in disarray is... surprising. – Did Jul 25 '12 at 23:56
  • @did I had completely forgotten about this question. Thanks for the reminder. – Robert Mastragostino Jul 26 '12 at 02:28
7

Let us take for example $X$ the standard normal, or any normal with mean $0$. Then $E(X)=0$.

But $X^2$ is never negative and is positive with probability $1$, so its mean must be positive.

This shows that (in this case) $E(X^2)\ne (E(X))^2$.

In fact, when the expectations exist, $E(X^2)>(E(X))^2$ except when $X$ is constant with probability $1$.
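
For the standard normal this is easy to see in a simulation. The following is a rough Monte Carlo sketch using only the Python standard library; the sample size and the seed are arbitrary choices, and the printed values are estimates, not exact results.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
n = 100_000
samples = [random.gauss(0.0, 1.0) for _ in range(n)]  # draws from N(0, 1)

sample_mean = sum(samples) / n
sample_mean_of_squares = sum(x * x for x in samples) / n

print(sample_mean ** 2)        # estimate of (E(X))^2, close to 0
print(sample_mean_of_squares)  # estimate of E(X^2), close to 1
```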

André Nicolas
  • 507,029
  • 3
    Even easier to see, maybe, if you make the values $-1$ and $1$. – Robert Israel May 25 '12 at 17:18
  • @RobertIsrael: I had deleted my post, since someone had posted the same example. Then decided to use normal, since that's what OP was working with, and ended up with a variant of your example. – André Nicolas May 25 '12 at 17:22
1

Say you have a fair coin that says $X=1$ on one side and $X=3$ on the other side. You flip the coin. Clearly, $E(X)=\frac12(1+3) = 2$.

If you are counting $X^2$ instead of $X$, then one side of the coin is worth $1^2=1$ and the other side is worth $3^2=9$, so $E(X^2) = \frac12(1+9)=5$.

$5\ne 2^2$.

MJD
  • 65,394
  • 39
  • 298
  • 580
1

My turn:

Let $X$ be uniformly distributed on $[0,1]$. Then $E X =\int_{0}^1 t \, dt = \frac{1}{2}$, so $(EX)^2 = \frac{1}{4}$, but $E X^2 =\int_{0}^1 t^2 \, dt = \frac{1}{3}$.
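
The two integrals can also be approximated numerically. This is a minimal sketch using a midpoint rule; the helper `midpoint_integral` and the number of subintervals are illustrative choices, not a recommended integration routine.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# The density of the uniform distribution on [0, 1] is 1, so E g(X) is the integral of g.
mean = midpoint_integral(lambda t: t, 0.0, 1.0)               # approximates EX = 1/2
second_moment = midpoint_integral(lambda t: t * t, 0.0, 1.0)  # approximates EX^2 = 1/3

print(mean ** 2, second_moment)  # roughly 0.25 and 0.3333
```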

copper.hat
  • 172,524
1

It's a late answer, but it might be useful for anyone reading it.

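For a random variable $X$ with finite second moment, $E(X^{2})= [E(X)]^{2}$ iff the random variable $X$ is independent of itself. One direction follows from the property of the expectation value operator that $E(XY)= E(X)E(Y)$ whenever $X$ and $Y$ are independent random variables (the converse fails for general pairs, since uncorrelated variables need not be independent). Taking $Y=X$, independence of $X$ from itself gives $E(X^{2})= [E(X)]^{2}$. For the other direction, $E(X^{2})= [E(X)]^{2}$ means $Var(X)=0$, so $X$ is constant with probability $1$, and a constant is independent of itself.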

Statement: Two random variables are independent iff:

$\mathrm{F}_{XY}(x,y)= \mathrm{F}_{X}(x)\mathrm{F}_{Y}(y)$ for all $x$, $y$

where $\mathrm{F}_{X}(x)$, $X= x$ denotes the cumulative distribution function of an arbitrary random variable $X$. Now as this answer also suggests, a random variable can be independent of itself iff it realizes a fixed (constant) value. This follows from:

$\mathrm{F}_{XX}(x,x)=\mathrm{F}_{X}(x)=\mathrm{F}_{X}(x)\mathrm{F}_{X}(x)$

and hence $\mathrm{F}_{X}(x)\in\{0,1\}$ for every $x$. A cumulative distribution function that takes only the values $0$ and $1$ must jump from $0$ to $1$ at a single point $x_{0}$, which means $P(X=x_{0})=1$: the random variable $X$ is constant, i.e., $X= x_{0}$ with probability $1$ (a fully rigorous proof of this last step may require a measure-theoretic approach).

Notice that $E(X^{2})= [E(X)]^{2}$ also implies $E(X^{2})-[E(X)]^{2}=0 \Rightarrow Var(X)=0$, which is always satisfied for a constant random variable $X$. This can also give an intuition for the reasoning above.

Therefore, $E(X^{2})\neq [E(X)]^{2}$ in general.

0

May as well chime in :)

Expectations are linear pretty much by definition, so $E(aX + b) = aE(X) + b$. Also linear is the function $f(x) = ax$. If we take a look at $f(x^2)$, we get

$f(x^2) = a(x^2) \not= (ax)^2 = f(x)^2$.

If $E(X^2) = E(X)^2$ held for every $X$, then $E$ would have to commute with squaring, and linearity does not provide that: squaring is not a linear operation. So, it's not true in general :)

0

Assuming $X$ is a discrete random variable, $E(X)=\sum x_ip_i$. Therefore $E(X^2)=\sum x_i^2p_i$ while $[E(X)]^2=\left(\sum x_ip_i\right)^2$. Now, as Robert Mastragostino says, for equal weights $p_i=\frac{1}{n}$ equality would imply that $(x_1+x_2+\cdots+x_n)^2=n(x_1^2+x_2^2+\cdots+x_n^2)$, which is not true unless all the $x_i$ are equal, i.e. unless $X$ is constant.

E.O.
  • 6,942