Having some trouble understanding this proof in certain steps, even after trying to consult the matrix cookbook.
For two multivariate Gaussians $P_1, P_2 \in R^n$:
$KLD(P_1 || P_2) = E_{P_1}[\log P_1 - \log P_2]$
$= \frac{1}{2} E_{P_1}[-\log \det\Sigma_1 - (x - \mu _1)^T\Sigma_{1}^{-1}(x - \mu_1) + \log\det\Sigma_2 + (x - \mu _2)^T\Sigma_{2}^{-1}(x - \mu_2)]$
$= \frac{1}{2}\log \frac{\det\Sigma_2}{\det\Sigma_1} + \frac{1}{2}E_{P_1}[- (x - \mu _1)^T\Sigma_{1}^{-1}(x - \mu_1) + (x - \mu _2)^T\Sigma_{2}^{-1}(x - \mu_2)]$
$= \frac{1}{2}\log \frac{\det\Sigma_2}{\det\Sigma_1} + \frac{1}{2}E_{P_1}[tr (\Sigma_{1}^{-1}(x - \mu_1)(x - \mu _1)^T) + tr(\Sigma_{2}^{-1}(x - \mu_2)(x - \mu _2)^T)]$
$= \frac{1}{2}\log \frac{\det\Sigma_2}{\det\Sigma_1} + \frac{1}{2}E_{P_1}[tr (\Sigma_{1}^{-1}\Sigma_{1}) + tr(\Sigma_{2}^{-1}(xx^T - 2x\mu^{T}_{2} + \mu_2\mu_{2}^T)]$
Why does $(x-\mu)(x-\mu) = \Sigma_1$?
$= \frac{1}{2}\log \frac{\det\Sigma_2}{\det\Sigma_1} + \frac{1}{2}n + \frac{1}{2} tr(\Sigma_{2}^{-1}(\Sigma_1 + \mu_1\mu_{1}^T - 2\mu_2\mu^{T}_{1} + \mu_2\mu_{2}^T)]$
What rule gets rid of the EV?
$= \frac{1}{2}(\log \frac{\det\Sigma_2}{\det\Sigma_1} - n + tr(\Sigma_{2}^{-1}(\Sigma_1) + tr(\mu_{1}^T\Sigma_{2}^{-1}\mu_1 - 2\mu_{1}^T\Sigma_{2}^{-1}\mu_2 + \mu_{2}^T\Sigma_{2}^{-1}\mu_2)$
$= \frac{1}{2}(\log \frac{\det\Sigma_2}{\det\Sigma_1} - n + tr(\Sigma_{2}^{-1}(\Sigma_1) + (\mu_{2}-\mu_1)^T\Sigma_{2}^{-1}(\mu_{2}-\mu_1))$
How do you reduce (what is the rule) that last term?
Thanks