
I'm hoping someone can help me compute

$$\nabla_{X} \mathsf{tr}\left( f \left( X \right) Y \right)$$

where $f : \mathbb{R}^{u \times v} \to \mathbb{R}^{m \times n}$ and $Y \in \mathbb{R}^{n \times m}$. I'd like to get an expression in terms of $\nabla_X f \left( X \right)$.

What I've worked out so far: for any $1 \leq i \leq u$ and $1 \leq j \leq v$ we have

\begin{align}
\frac{\partial}{\partial X_{ij}} \mathsf{tr}\left( f(X) Y \right)
&= \frac{\partial}{\partial X_{ij}} \sum_{k=1}^m \left( f(X) Y \right)_{kk} \\
&= \sum_{k=1}^m \frac{\partial}{\partial X_{ij}} \left( f(X) Y \right)_{kk} \\
&= \sum_{k=1}^m \frac{\partial}{\partial X_{ij}} \sum_{p=1}^n f(X)_{kp} Y_{pk} \\
&= \sum_{k=1}^m \sum_{p=1}^n \frac{\partial f(X)_{kp}}{\partial X_{ij}} \, Y_{pk}
\end{align}

but I would like to relate this expression to $\nabla_X f \left( X \right)$ if possible.

Thanks!

1 Answer


Trace is linear, so it commutes with differentiation. For the directional derivative in any direction $V \in \mathbb{R}^{u \times v}$, we get $\nabla_V \text{tr}(f(X)Y) = \text{tr}\big((\nabla_V f(X))Y\big)$.
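As a sanity check on this identity, here is a minimal NumPy sketch. It assumes a concrete test function $f(X) = \tanh(AXB)$ with random $A$ and $B$; that choice, the helper `jacobian_fd`, and all variable names are illustrative and not part of the question. It stores the derivative of $f$ as a four-index array $J_{kpij} = \partial f(X)_{kp} / \partial X_{ij}$, contracts it with $Y$ exactly as in the index sum from the question, and compares the resulting $u \times v$ matrix with a closed form worked out by hand for this particular $f$.

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, m, n = 3, 4, 2, 5

# Hypothetical test data: f(X) = tanh(A X B) maps R^{u x v} -> R^{m x n}.
A = rng.standard_normal((m, u))
B = rng.standard_normal((v, n))
Y = rng.standard_normal((n, m))
X = rng.standard_normal((u, v))


def f(X):
    return np.tanh(A @ X @ B)


def jacobian_fd(f, X, h=1e-5):
    """Central-difference estimate of J[k, p, i, j] = d f(X)[k, p] / d X[i, j]."""
    J = np.zeros((m, n, u, v))
    for i in range(u):
        for j in range(v):
            E = np.zeros_like(X)
            E[i, j] = h
            J[:, :, i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return J


# Gradient entry (i, j) is sum_{k, p} J[k, p, i, j] * Y[p, k],
# i.e. tr((d f / d X_ij) Y), the index sum from the question.
J = jacobian_fd(f, X)
grad_from_jacobian = np.einsum("kpij,pk->ij", J, Y)

# Closed form for this particular f: A^T (tanh'(A X B) * Y^T) B^T,
# where * is the elementwise product and tanh'(z) = 1 - tanh(z)^2.
S = 1.0 - np.tanh(A @ X @ B) ** 2
grad_closed_form = A.T @ (S * Y.T) @ B.T

max_abs_diff = np.max(np.abs(grad_from_jacobian - grad_closed_form))
print(grad_from_jacobian.shape, max_abs_diff)
```

The contraction over $k$ and $p$ is what turns the four-index derivative of $f$ into a $u \times v$ gradient, so the trace identity holds entry by entry in $X_{ij}$ rather than for a single matrix-valued $\nabla_X f(X)$.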

Ted Shifrin
  • I would expect $\nabla_X \mathsf{tr}\left( f \left( X \right) Y \right)$ to be a matrix of dimension $u \times v$ but $\mathsf{tr}\left( \left( \nabla_X f \left( X \right) \right) Y\right)$ is a scalar. I think my question actually doesn't make much sense... I couldn't tell you what dimensions $\nabla_X f \left( X \right)$ should have. – msantama Mar 19 '24 at 00:41
  • Well, the derivative will be a linear map from $\Bbb R^{u\times v}$ to $\Bbb R^{m\times n}$. You don't really want to understand this linear map as a giant matrix. – Ted Shifrin Mar 19 '24 at 00:55