Let us assume that we know $\frac{\partial}{\partial A} f$, where $f$ is a scalar function and $A$ is any matrix.

Now suppose we are interested in the special case where $A$ is diagonal, and we want to know $$\frac{\partial}{\partial d_A} f$$ where $d_A$ is the diagonal of the matrix $A$.

Also, what would $\frac{\partial}{\partial d_A} A$ be?

  • It's totally unclear what you have in mind. Give a more detailed description of the envisaged mathematical situation. – Christian Blatter Nov 17 '18 at 14:55
  • You cannot define the differential for just diagonal matrices unless they are the variable of $f$. In any case, the space of $m\times n$ matrices is isomorphic to the vector space $\mathbb{R}^{mn}$, so you can apply the multivariable derivative. – Masacroso Nov 17 '18 at 15:00

1 Answer

You have calculated the gradient of the function $f(A)$ $$G=\frac{\partial f}{\partial A}$$ with no constraints on the matrix variable, and now you wish to constrain $A$ to be diagonal, i.e. $$A={\rm Diag}(a)$$

Start with the differential in terms of $dA$, then change the variable to $da$ $$\eqalign{ df &= G:dA \cr &= G:{\rm Diag}(da) \cr &= {\rm diag}(G):da \cr \frac{\partial f}{\partial a} &= {\rm diag}(G) \cr &= {\rm diag}\Big(\frac{\partial f}{\partial A}\Big) \cr }$$ where
$\,\,\,:\,\,$ is a product notation for the trace $\,\,A:B={\rm Tr}(A^TB)$
$\,\,\,{\rm Diag}()$ generates a diagonal matrix from the input vector
${\,\,\,\rm diag}()$ extracts the diagonal of a matrix into an output vector
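
The result can be checked numerically. The sketch below is only an illustration under stated assumptions: it picks the example function $f(A)=\|Au\|^2$ for a fixed vector $u$ (so that $G=2(Au)u^T$), and the helper names `Diag`, `diag`, `frob`, `matvec` and `numerical_grad` are hypothetical, chosen to mirror the notation above. It compares ${\rm diag}(G)$ against a central finite-difference gradient of $a\mapsto f({\rm Diag}(a))$, and also checks the identity $G:{\rm Diag}(v)={\rm diag}(G):v$ for an arbitrary vector $v$.

```python
# Illustrative check in plain Python (no external libraries).
# Assumption: f(A) = ||A u||^2 for a fixed vector u, so G = df/dA = 2 (A u) u^T.

def Diag(v):
    """Build a diagonal matrix (list of rows) from a vector."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def diag(M):
    """Extract the main diagonal of a square matrix as a vector."""
    return [M[i][i] for i in range(len(M))]

def frob(A, B):
    """Trace product A:B = Tr(A^T B), i.e. the sum of elementwise products."""
    return sum(x * y for row_a, row_b in zip(A, B) for x, y in zip(row_a, row_b))

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

u = [1.0, -2.0, 0.5]  # fixed vector defining the example f (an assumption)

def f(A):
    """Example scalar function f(A) = ||A u||^2."""
    return sum(y * y for y in matvec(A, u))

def grad_f(A):
    """Unconstrained gradient G = 2 (A u) u^T of the example f."""
    Au = matvec(A, u)
    return [[2.0 * Au_i * u_j for u_j in u] for Au_i in Au]

def numerical_grad(g, a, h=1e-6):
    """Central finite differences of a scalar function g at the vector a."""
    grad = []
    for k in range(len(a)):
        up, down = list(a), list(a)
        up[k] += h
        down[k] -= h
        grad.append((g(up) - g(down)) / (2.0 * h))
    return grad

a = [0.7, -1.3, 2.0]
G = grad_f(Diag(a))       # full gradient; it has nonzero off-diagonal entries
analytic = diag(G)        # claimed df/da = diag(G)
numeric = numerical_grad(lambda v: f(Diag(v)), a)

v = [0.3, 0.1, -0.8]      # arbitrary direction standing in for da
lhs = frob(G, Diag(v))                                   # G : Diag(v)
rhs = sum(g_i * v_i for g_i, v_i in zip(analytic, v))    # diag(G) : v

print("diag(G):           ", analytic)
print("finite differences:", numeric)
print("G:Diag(v) =", lhs, " diag(G):v =", rhs)
```

The two printed gradients should agree up to finite-difference error, even though $G$ itself is not diagonal: its off-diagonal entries are multiplied by the zeros of ${\rm Diag}(v)$ in the trace product.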

lynn
  • Lynn, thanks for the answer. The result is what I expected it to be, but I'm not sure about the equality $G:{\rm Diag}(da)={\rm diag}(G):da$. Is Diag the diagonal matrix generated from a vector, and diag the main diagonal of a matrix? – An old man in the sea. Nov 17 '18 at 17:08
  • Decompose the gradient into a sum of its diagonal and off-diagonal parts. Only the diagonal part will contribute to the above trace product; the product with the off-diagonal part is zero. – lynn Nov 18 '18 at 18:07
  • Lynn, many thanks ;) – An old man in the sea. Nov 18 '18 at 18:24
  • lynn, if we had $A=Diag(e^a_i)$, then what would $\frac{\partial}{\partial a} f$ be ? where $a=(a_1,...,a_n)$. I get $\frac{\partial}{\partial a} f = diag(\frac{\partial}{\partial A} f ^\intercal A)$ – An old man in the sea. Jan 24 '19 at 09:33
  • Assuming you mean $$\eqalign{ x & = \exp(a) &\implies dx = x\odot da\cr A &= {\rm Diag}(x) \cr }$$ Then the calculation runs as follows $$\eqalign{ df &= G:dA \cr &= {\rm diag}(G):dx \cr &= {\rm diag}(G):x\odot da \cr &= x\odot {\rm diag}(G):da \cr \frac{\partial f}{\partial a} &= x\odot {\rm diag}(G) \cr }$$ where the exp function is applied element-wise and $\odot$ denotes the Hadamard product. – lynn Jan 28 '19 at 22:00
  • Lynn, in my comment above I meant $A=Diag(e^{a_i})$. I'm not exactly sure I understand what $\exp(a)$ is in your comment... Could you, in your answer, elaborate a bit on this? I'll give you a +1;) – An old man in the sea. Jan 28 '19 at 22:15
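
For the exponential case discussed in the last comments, $\exp(a)$ is applied element-wise, so $x_i=e^{a_i}$ and $A={\rm Diag}(x)={\rm Diag}(e^{a_i})$. A minimal numerical sketch of $\frac{\partial f}{\partial a}=x\odot{\rm diag}(G)$ follows. It again assumes the example function $f(A)=\|Au\|^2$ with a fixed vector $u$, which is not part of the original question, and all helper names are hypothetical.

```python
import math

# Assumption: f(A) = ||A u||^2 for a fixed vector u, so G = df/dA = 2 (A u) u^T.
u = [1.0, -2.0, 0.5]

def Diag(v):
    """Build a diagonal matrix (list of rows) from a vector."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def diag(M):
    """Extract the main diagonal of a square matrix as a vector."""
    return [M[i][i] for i in range(len(M))]

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def f(A):
    """Example scalar function f(A) = ||A u||^2."""
    return sum(y * y for y in matvec(A, u))

def grad_f(A):
    """Unconstrained gradient G = 2 (A u) u^T of the example f."""
    Au = matvec(A, u)
    return [[2.0 * Au_i * u_j for u_j in u] for Au_i in Au]

def f_of_a(a):
    """f as a function of a, with A = Diag(exp(a)) and exp taken element-wise."""
    x = [math.exp(a_i) for a_i in a]
    return f(Diag(x))

def grad_a(a):
    """Claimed gradient: the element-wise (Hadamard) product of x and diag(G)."""
    x = [math.exp(a_i) for a_i in a]
    G = grad_f(Diag(x))
    return [x_i * g_i for x_i, g_i in zip(x, diag(G))]

def numerical_grad(g, a, h=1e-6):
    """Central finite differences of a scalar function g at the vector a."""
    grad = []
    for k in range(len(a)):
        up, down = list(a), list(a)
        up[k] += h
        down[k] -= h
        grad.append((g(up) - g(down)) / (2.0 * h))
    return grad

a = [0.1, -0.4, 0.3]
analytic = grad_a(a)
numeric = numerical_grad(f_of_a, a)

print("x * diag(G):       ", analytic)
print("finite differences:", numeric)
```

For this particular $f$ the gradient can also be written in closed form as $2e^{2a_i}u_i^2$, which gives a second way to confirm the printed values.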