Minimum Mean Square Error Estimate Example

Question

We have data from a 2D normal (Gaussian) distribution, $$\begin{bmatrix}y\\x\end{bmatrix}\sim\mathcal{N}\left(\begin{bmatrix}2\\4\end{bmatrix},\begin{bmatrix}10&2\\2&20\end{bmatrix}\right)$$ where $\mathcal{N}(\mu,P)$ denotes a normal distribution with mean $\mu$ and covariance matrix $P$ of the form $$P=\begin{bmatrix}P_{yy}&P_{yx}\\P_{xy}&P_{xx}\end{bmatrix}$$ Here $x$ is the unknown and $y$ is the observation.

We have an observation $\textbf{y=1}$.

Compute the mean square estimate $\hat{x}_{MS}$.

Compute the covariance of the error $P_{\hat{x}_{MS}}$.

This is what my notes say. I can't find any hint in the notes or in the lectures on how to compute this, so I'm not sure whether it makes sense. If it does, can you show me how to compute it?

score 1 · Answer 1 · answered Nov 01 '17 at 15:30

We have a prior $P(\vec{v}) \propto e^{-1/2(\vec{v} - \vec{\mu})^T P^{-1} (\vec{v} - \vec{\mu})}$. We observe that $v_1 = 1$, and we are asked for (moments of) the posterior distribution of $v_2$. So plug $v_1 = 1$ into the above equation. $P^{-1} = \begin{bmatrix} 20/14 & -2/14 \\ -2/14 & 10/14 \end{bmatrix}, \begin{bmatrix} 1 & x \end{bmatrix} P^{-1} \begin{bmatrix} 1 \\ x \end{bmatrix} = 20/14 - 4x/14 + 10x^2/14 = 10/14*(x-1/5)^2 + C$. $P(x | y=1) \propto e^{-5/14 (x-1/5)^2}$

So $E[x] = 1/5, Var[x] = 14/10 = 7/5$.

The approach you take is correct. However, you make several mistakes. First of all, the inverse of the covariance matrix is incorrect. Next, although you mention the mean $\mu$, you do not apply it in your formula, so you end up with the wrong results. — EdG, Nov 03 '17 at 04:00
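For reference, the completing-the-square computation can be redone with both points from the comment applied: the inverse uses the determinant $10\cdot 20 - 2^2 = 196$, and the mean is subtracted before forming the quadratic form. This is only a sketch; $d = x - 4$ is shorthand introduced here, and $y - \mu_y = 1 - 2 = -1$. \begin{align} -\frac{1}{2}\begin{bmatrix} -1 & d \end{bmatrix} \frac{1}{196}\begin{bmatrix} 20 & -2 \\ -2 & 10 \end{bmatrix} \begin{bmatrix} -1 \\ d \end{bmatrix} &= -\frac{1}{2\cdot 196}\left(20 + 4d + 10d^2\right) \\ &= -\frac{10}{2\cdot 196}\left(d + \frac{1}{5}\right)^2 + C \end{align} so $P(x | y=1) \propto e^{-\frac{5}{196}\left(x - \frac{19}{5}\right)^2}$, which is a normal density with $E[x] = 19/5$ and $Var[x] = 196/10 = 98/5$.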

EdG · Accepted Answer · 2017-11-03T07:08:47.713


This problem is described extensively in the literature. One option is the book "Pattern Recognition and Machine Learning" by Bishop, 2006. Equations 2.94 through 2.98 will do the trick for you.

So let the mean be denoted by $$ \begin{bmatrix} \mu_y \\ \mu_x \end{bmatrix}= \begin{bmatrix} 2 \\ 4 \end{bmatrix}. $$

The inverse of the covariance is: $$ \Lambda=P^{-1}=\begin{bmatrix} \Lambda_{yy} & \Lambda_{yx} \\ \Lambda_{xy} & \Lambda_{xx} \end{bmatrix} = \frac{1}{196}\begin{bmatrix} 20 & -2 \\ -2 & 10 \end{bmatrix} $$

Then, according to Eq. (2.96) of the aforementioned book, we have $$ p(x|y)=\mathcal{N}(x | \mu_{x|y}, \Lambda_{xx}^{-1}) $$ with \begin{align} \mu_{x|y} &= \mu_x - \Lambda_{xx}^{-1} \Lambda_{xy} (y - \mu_y) \\ &= 4 - \frac{196}{10}\cdot \frac{-2}{196}\cdot(1-2) \\ &=\frac{19}{5} \end{align} and $$ \Lambda_{xx}^{-1}=\frac{196}{10}=\frac{98}{5}. $$ So we have \begin{align} \hat{x}_{MS} &= \frac{19}{5}, \\ P_{\hat{x}_{MS}} &= \frac{98}{5}. \end{align}
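The arithmetic above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative only. It evaluates the precision-matrix formula used above and, as a cross-check, the equivalent covariance form $\mu_x + P_{xy}P_{yy}^{-1}(y-\mu_y)$ with variance $P_{xx} - P_{xy}P_{yy}^{-1}P_{yx}$.

```python
import numpy as np

# Ordering follows the question: first component is y, second is x.
mu = np.array([2.0, 4.0])             # [mu_y, mu_x]
P = np.array([[10.0, 2.0],
              [2.0, 20.0]])           # [[P_yy, P_yx], [P_xy, P_xx]]
y_obs = 1.0

# Precision-matrix form: Lambda = P^{-1}
Lam = np.linalg.inv(P)
var_prec = 1.0 / Lam[1, 1]                                   # Lambda_xx^{-1}
mean_prec = mu[1] - var_prec * Lam[1, 0] * (y_obs - mu[0])   # mu_{x|y}

# Covariance form (Schur complement), used here only as a cross-check
mean_cov = mu[1] + P[1, 0] / P[0, 0] * (y_obs - mu[0])
var_cov = P[1, 1] - P[1, 0] * P[0, 1] / P[0, 0]

print(mean_prec, var_prec)
print(mean_cov, var_cov)
```

Both forms should agree with $19/5 = 3.8$ and $98/5 = 19.6$ up to floating-point rounding.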

edited Nov 03 '17 at 07:08

answered Nov 03 '17 at 03:43

EdG


The formula you used is labeled in my notes as the LMS (linear) estimate. I understand that it is the formula developed precisely for data from a normal distribution, but is there absolutely no difference between the linear mean square estimate and the mean square estimate for normally distributed data? – user50222 Nov 04 '17 at 11:58
There is a difference! With the linear mean square estimate (LMSE), you assume a linear relationship between the estimate ($\hat{x}_{MS}$ in this case) and the measurement ($y$ in this case). With the general mean square estimate (MSE), one does not make that assumption. This, however, can result in minimization problems for which no analytical solution exists (so numerical methods are used). Fortunately, for your problem, the MSE results in a linear expression, so the result is the same as for the LMSE. – EdG Nov 04 '17 at 13:50
If you want to check this, plot a pdf (e.g. in Matlab) with the value of $x$ on the x-axis and, on the y-axis, the probability density of the 2D Gaussian distribution evaluated at the observed value ($y=1$). After normalization, you will see that it corresponds to a normal distribution with the mean and covariance derived above. – EdG Nov 04 '17 at 13:52
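The check suggested in this comment can also be done numerically instead of by eye. Below is a rough sketch in Python with NumPy (not the Matlab mentioned above), using the observed value $y=1$ from the question. It evaluates the joint density along that slice on a grid, normalizes it, and computes the mean and variance by a simple Riemann sum. The grid limits and resolution are arbitrary choices made for this illustration.

```python
import numpy as np

mu = np.array([2.0, 4.0])             # [mu_y, mu_x]
P = np.array([[10.0, 2.0],
              [2.0, 20.0]])
Lam = np.linalg.inv(P)
y_obs = 1.0

# Grid of x values, wide enough to cover essentially all of the density
x = np.linspace(-60.0, 70.0, 200001)
dx = x[1] - x[0]

# Unnormalized joint density along the slice y = y_obs
dy = y_obs - mu[0]
d = x - mu[1]
quad = Lam[0, 0] * dy**2 + 2.0 * Lam[0, 1] * dy * d + Lam[1, 1] * d**2
slice_pdf = np.exp(-0.5 * quad)

# Normalize the slice so that it integrates to one
cond_pdf = slice_pdf / (slice_pdf.sum() * dx)

# Mean and variance of the normalized slice
num_mean = (x * cond_pdf).sum() * dx
num_var = ((x - num_mean) ** 2 * cond_pdf).sum() * dx
print(num_mean, num_var)
```

The printed values should be close to $3.8$ and $19.6$. Plotting `cond_pdf` against `x` gives the curve described in the comment.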

EdG · Answer 3 · 2017-11-03T13:24:19.967

Another way - perhaps more intuitive - to look at this problem, is as follows. Suppose we can describe the stochastic variables $x$ and $y$ as follows: \begin{equation} \begin{bmatrix} y \\ x \end{bmatrix} = \begin{bmatrix} \mu_y \\ \mu_x \end{bmatrix} + \begin{bmatrix} p_{yy} & 0 \\ p_{xy} & p_{xx} \end{bmatrix} \begin{bmatrix} \epsilon_y \\ \epsilon_x \end{bmatrix}, \end{equation} where $\epsilon_y$ and $\epsilon_x$ are stochastic variables that come from a normal distribution with mean $0$ and variance $1$, i.e. $\epsilon_y, \epsilon_x \sim \mathcal{N}(0,1)$. When computing the expectation of the above formula, it is easy to see that $\mu_y=2$ and $\mu_x=4$. Furthermore, when computing the covariance, we see that \begin{equation} \begin{bmatrix} p_{yy} & 0 \\ p_{xy} & p_{xx} \end{bmatrix} \begin{bmatrix} p_{yy} & p_{xy} \\ 0 & p_{xx} \end{bmatrix} = \begin{bmatrix} p_{yy}^2 & p_{yy}p_{xy} \\ p_{yy}p_{xy} & p_{xy}^2 + p_{xx}^2 \end{bmatrix} = \begin{bmatrix} 10 & 2 \\ 2 & 20 \end{bmatrix} \tag{1} \end{equation}

Now we can rewrite $x$ by using the fact that $\epsilon_y = \frac{y - \mu_y}{p_{yy}}$: \begin{align} x &= \mu_x + p_{xy}\epsilon_y + p_{xx}\epsilon_x \\ &=\mu_x + \frac{p_{xy}}{p_{yy}} (y - \mu_y) + p_{xx} \epsilon_x \end{align} Now, given that $y$ is known and that $\epsilon_x$ has zero mean and is independent of $y$, the expected value of $x$ (you call it $\hat{x}_{MS}$) is \begin{equation} E[x \mid y] = \mu_x + \frac{p_{xy}}{p_{yy}}(y-\mu_y) \tag{2} \end{equation} and the variance of $x$ given $y$, i.e. the covariance of the error (you call it $P_{\hat{x}_{MS}}$), is simply $p_{xx}^2$. From Eq. $(1)$, it follows that \begin{equation} p_{yy}p_{xy}=2, \end{equation} so \begin{equation} \frac{p_{xy}}{p_{yy}}=\frac{2}{p_{yy}^2}=\frac{2}{10}=\frac{1}{5}. \end{equation} Substituting this into Eq. $(2)$ gives $$ \hat{x}_{MS} = 4 + \frac{1}{5}(1-2)=4-\frac{1}{5}=\frac{19}{5}. $$ Furthermore, we have $$ P_{\hat{x}_{MS}} = p_{xx}^2 = 20 - p_{xy}^2 = 20 - p_{yy}^2 \left(\frac{p_{xy}}{p_{yy}}\right)^2=20-\frac{10}{25}=\frac{98}{5}, $$ which corresponds to the other answer I gave :).
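The lower-triangular factor in Eq. $(1)$ is the Cholesky factor of $P$, so this derivation can be reproduced in a few lines. This is a minimal sketch assuming NumPy; `np.linalg.cholesky` returns the lower-triangular factor $L$ with $LL^T = P$, and the names `p_yy`, `p_xy`, `p_xx` simply mirror the symbols above.

```python
import numpy as np

mu = np.array([2.0, 4.0])             # [mu_y, mu_x]
P = np.array([[10.0, 2.0],
              [2.0, 20.0]])
y_obs = 1.0

# Lower-triangular factor: [[p_yy, 0], [p_xy, p_xx]]
L = np.linalg.cholesky(P)
p_yy, p_xy, p_xx = L[0, 0], L[1, 0], L[1, 1]

# Eq. (2): conditional mean of x given the observed y
x_hat = mu[1] + p_xy / p_yy * (y_obs - mu[0])

# Error variance: the part of x driven only by epsilon_x
err_var = p_xx**2
print(x_hat, err_var)
```

Up to rounding, this should give $3.8$ and $19.6$, matching $19/5$ and $98/5$.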
