
Given $F,f:\mathbb R\to\mathbb R$ such that $F'=f$ and $\pmb a,\pmb b\in\mathbb R^n$, compute $$\frac{d}{d\pmb X}\left(\pmb a^T F\left(\pmb X\right)\pmb b\right)$$ where $\pmb X\in\mathbb R^{n\times n}$.


My guess is $\operatorname{diag}(\pmb b) f(\pmb X)^T\operatorname{diag}(\pmb a)$ but I would like to
(i) confirm it and
(ii) see if there's a better way than using indices.

  • $\def\D{\operatorname{Diag}}$You're close, just drop the transpose: $\D(a)f(X)\D(b)$. This assumes that the functions are applied elementwise. – greg Feb 03 '21 at 22:49
  • Sorry I had $\operatorname{diag}(\pmb a)$ and $\operatorname{diag}(\pmb b)$ swapped. What do you think now? I'm following the definition found here: https://tminka.github.io/papers/matrix/minka-matrix.pdf – foreignvol Feb 03 '21 at 23:56
  • The swapped version is consistent; however, the transposition of the entire formula is a matter of layout convention, which varies from textbook to textbook. I happen to prefer the opposite convention. – greg Feb 04 '21 at 02:13

1 Answer


First, note that a scalar equals its own trace and the trace is invariant under cyclic permutations, so $a^TF(X)b=\operatorname{tr}(a^TF(X)b)=\operatorname{tr}(ba^TF(X))$. Using differentials, we have

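$$d\operatorname{tr}(ba^TF(X))=\operatorname{tr}(ba^T\,dF(X))=\operatorname{tr}\left(ba^T\left(f(X)\odot dX\right)\right)=ab^T\cdot\left(f(X)\odot dX\right)=\left(ab^T\odot f(X)\right)\cdot dX,$$ where $\odot$ is the Hadamard (elementwise) product and $A\cdot B=\sum_{i,j}A_{i,j}B_{i,j}=\operatorname{tr}(A^TB)$ is the Frobenius inner product of matrices of the same dimensions. The second equality uses $dF(X)=f(X)\odot dX$, which holds because $F$ is applied elementwise, and the last one uses $A\cdot(B\odot C)=(A\odot B)\cdot C$.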

From this we see that $$\frac{d}{dX}\left(a^T F\left(X\right)b\right)=ab^T\odot f(X)$$ Also note $ab^T\odot f(X)=\operatorname{diag}(a)f(X)\operatorname{diag}(b)$.
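A quick numerical sanity check of this result may be useful. The sketch below is not part of the derivation: it assumes NumPy, picks $F=\sin$ and $f=\cos$ purely for illustration, and the helper names `grad_formula` and `grad_finite_diff` are hypothetical. It compares $ab^T\odot f(X)$ with a central finite-difference gradient of $a^TF(X)b$ and with $\operatorname{diag}(a)f(X)\operatorname{diag}(b)$.

```python
import numpy as np

def grad_formula(a, b, X, f):
    # Claimed gradient: (a b^T) ⊙ f(X), with f applied elementwise.
    return np.outer(a, b) * f(X)

def grad_finite_diff(a, b, X, F, eps=1e-6):
    # Central differences of phi(X) = a^T F(X) b, one entry of X at a time.
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (a @ F(X + E) @ b - a @ F(X - E) @ b) / (2 * eps)
    return G

rng = np.random.default_rng(0)
n = 4
a, b = rng.standard_normal(n), rng.standard_normal(n)
X = rng.standard_normal((n, n))

# Illustrative choice (an assumption, not from the question): F = sin, f = cos.
G = grad_formula(a, b, X, np.cos)
G_fd = grad_finite_diff(a, b, X, np.sin)
G_diag = np.diag(a) @ np.cos(X) @ np.diag(b)

print(np.allclose(G, G_fd, atol=1e-6))  # formula vs. finite differences
print(np.allclose(G, G_diag))           # Hadamard form vs. diag form
```

Under the opposite layout convention mentioned in the comments, the corresponding matrix would be the transpose `G.T`, that is, $\operatorname{diag}(b)f(X)^T\operatorname{diag}(a)$.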

I have used notations and properties of operations from this paper.

Koncopd
  • 1,000