
I am using the definition of the gradient for functions of a matrix variable (the one tied to the Frobenius inner product on matrices) from the post "Gradient and Hessian of a function with Matrix Variables".

Suppose I have $$g(X) = A(X-X_0)$$ where

  1. $A, \, X, \, X_0 \in \mathbb{R}^{n\times n}$
  2. $X$ is the variable

Suppose that $g(X)$ admits a scalar antiderivative, i.e., there is a function $f$ with $f(X)\in \mathbb{R}$ such that $\nabla_X f(X)=g(X)$.

My question is: what should $f(X)$ look like? (Note: $A$ is a matrix, not a scalar.)

If my question does not make sense, why does it not make sense? Thanks!

Denny
  • I suggest guessing, based on the answer when $X$ is a $1$-by-$1$ matrix, i.e., a scalar. More generally, use the fact that the gradient is the matrix-valued function such that the directional derivative of $f$ is the "dot product" of the gradient of $f$ with the direction. – Deane Jan 18 '21 at 17:08
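
One way to carry out this hint is sketched here. It is a worked illustration, not part of the original comment: the candidate $f$, the direction $H \in \mathbb{R}^{n\times n}$, and the constant $c$ are assumptions introduced for it, and $\langle M, H\rangle = \operatorname{tr}(M^T H)$ is the Frobenius inner product.

$$
\begin{aligned}
\text{scalar case:}\quad & g(x) = a(x-x_0) \;\Rightarrow\; f(x) = \tfrac{a}{2}(x-x_0)^2 + c,\\
\text{candidate:}\quad & f(X) = \tfrac{1}{2}\operatorname{tr}\big((X-X_0)^T A\,(X-X_0)\big),\\
\text{directional derivative:}\quad & \frac{d}{dt} f(X+tH)\Big|_{t=0}
 = \tfrac{1}{2}\operatorname{tr}\big(H^T A\,(X-X_0)\big) + \tfrac{1}{2}\operatorname{tr}\big((X-X_0)^T A\,H\big)\\
& = \Big\langle \tfrac{1}{2}\,(A + A^T)(X-X_0),\; H \Big\rangle .
\end{aligned}
$$

Reading off the gradient gives $\nabla_X f(X) = \tfrac{1}{2}(A + A^T)(X-X_0)$. This equals $A(X-X_0)$ when $A$ is symmetric; for a non-symmetric $A$ this particular candidate only reproduces the symmetric part of $A$.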

1 Answer


A sufficient condition is that $A$ can be written as $A=B^T B$; in that case the answer is yes. Recall that $\nabla \frac{1}{2}\|B(x-x_0)\|^2=B^TB(x-x_0)$, so $f(x) = \frac{1}{2}\|B(x-x_0)\|^2$ is an antiderivative.

  • In my question, $X$ is a matrix, so the norm would be the Frobenius norm $\|X\|_F = \langle X,X \rangle^{1/2} = (\operatorname{tr}(X^TX))^{1/2}$. In that case, the gradient may not be given by that formula. If you are not using the Frobenius norm, which norm do you pick, and how do you define the gradient for matrices from that norm? (I use the definition from the link in my question.) – Denny Jan 18 '21 at 16:19
  • It's the same norm. What changes is the chain rule with the linear operator; I have to think more about it to fix this part. But the norm is certainly the same: it's still just the square root of the sum of squares of the elements. – Jürgen Sukumaran Jan 18 '21 at 16:36
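
The formula discussed in these comments can be checked numerically for a matrix argument. The sketch below compares the closed form $B^TB(X-X_0)$ with entrywise central differences of $\frac{1}{2}\|B(X-X_0)\|_F^2$. Everything in it is an assumption made for illustration: the helper names, the size $n=3$, the random test matrices, and the step `eps`. Plain Python lists are used only to avoid dependencies; a NumPy version would be shorter.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

def sub(P, Q):
    """Entrywise difference P - Q."""
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def f(X, B, X0):
    """0.5 * ||B (X - X0)||_F^2, i.e. half the sum of squared entries."""
    R = matmul(B, sub(X, X0))
    return 0.5 * sum(r * r for row in R for r in row)

def grad_closed_form(X, B, X0):
    """Candidate gradient B^T B (X - X0)."""
    return matmul(transpose(B), matmul(B, sub(X, X0)))

def grad_finite_diff(X, B, X0, eps=1e-5):
    """Entrywise central differences of f (gradient w.r.t. the Frobenius inner product)."""
    rows, cols = len(X), len(X[0])
    G = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += eps
            Xm[i][j] -= eps
            G[i][j] = (f(Xp, B, X0) - f(Xm, B, X0)) / (2 * eps)
    return G

def rand_matrix(n):
    """n-by-n matrix with standard normal entries (test data only)."""
    return [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]

random.seed(0)
n = 3
B, X, X0 = rand_matrix(n), rand_matrix(n), rand_matrix(n)

G_fd = grad_finite_diff(X, B, X0)
G_cf = grad_closed_form(X, B, X0)
max_err = max(abs(a - b) for ra, rb in zip(G_fd, G_cf) for a, b in zip(ra, rb))
print(max_err)  # largest entrywise gap between the two gradients
```

Because $f$ is quadratic in $X$, central differences are exact up to rounding, so a small `max_err` is consistent with the closed form holding for the Frobenius norm as well. A single random test is a sanity check, not a proof.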