1

I am wondering whether it's possible to obtain an analytic expression for the gradient of $$f(B) = (A - B)\left[(A - B)'(A - B)\right]^{-\frac{1}{2}}$$ with respect to $B$, where $A \in \mathbb{R}^{s \times t}$, $B \in \mathbb{R}^{s \times t}$, and $(A - B)$ has rank $t$ and $t$ distinct singular values.

I tried vectorizing the expression (since the gradient of a vector with respect to a vector has a standard form), which gives $$ {\rm vec}\left(A\left[(A - B)'(A - B)\right]^{-\frac{1}{2}}\right) - \left(\left[(A - B)'(A - B)\right]^{-\frac{1}{2}} \otimes I\right) {\rm vec}(B),$$ but since $B$ still appears inside the inverse square root and cannot be vectorized there, I'm not sure how (or whether it's possible) to proceed.

Any insight would be appreciated!

user23658
  • 453
  • What exactly do you mean by $C^{-\frac12}$? Do you mean "The matrix $D$ such that $D^2C=CD^2=I$"? Because $D$ is not at all uniquely defined in that case. – Michael Hartley Jan 29 '19 at 00:57
  • Good point! Yes, your definition is what I mean -- but I should have clarified that the singular values of $(A - B)$ are unique. This, along with my earlier statement that $(A - B)$ is rank $t$, should address any issues of uniqueness. – user23658 Jan 29 '19 at 01:05
  • Whether the singular values are unique or not doesn't really help. For example, if $C$ is a diagonal matrix with diagonal $1,4,9,16,25,...$, then $D$ could be any of the diagonal matrices with diagonals $\pm1,\pm\frac{1}{2},\pm\frac{1}{3},\pm\frac{1}{4},\pm\frac{1}{5},...$. Each eigenvalue (repeated or not) of $C$ gives you a choice of sign for the eigenvalue of $D$. Unless - do you know that $A-B$ is positive definite, and you're only choosing positive square roots of the positive eigenvalues of $A-B$ to form $(A-B)^{-\frac{1}{2}}$? – Michael Hartley Jan 29 '19 at 01:16
  • 1
    $A - B$ is not even square, so it couldn't be positive definite. The matrix I am taking the square root of is $(A-B)'(A-B)$, which is $t \times t$ and positive definite since $A - B$ has rank $t$. – user23658 Jan 29 '19 at 03:39
  • I may be missing something though, so I appreciate the questions! – user23658 Jan 29 '19 at 03:46

1 Answer

3

Define a new variable $$\eqalign{ X &= B-A \quad\implies dX = dB \cr }$$ Write the function in terms of this new matrix. $$\eqalign{ F &= -X(X^TX)^{-1/2} \cr }$$ Multiply each side by its transpose and calculate their differentials. $$\eqalign{ FF^T &= X(X^TX)^{-1}X^T = XX^+ \cr dF\,F^T+F\,dF^T &= dX\,X^+ + X\,dX^+ \cr &= dX\,X^+ + X^{+T}dX^T(I-XX^+) - XX^+dX\,X^+ \cr &= (I-XX^+)\,dX\,X^+ + X^{+T}dX^T(I-XX^+) \cr &= P\,\,dX\,X^+ + X^{+T}dX^TP \cr }$$ where $X^+ = (X^TX)^{-1}X^T$ denotes the pseudoinverse (using that $X$ has full column rank), and $P = I-XX^+$ is the orthogonal projector onto the nullspace of $X^T$, i.e. onto the orthogonal complement of the column space of $X$.

Vectorize and solve for the gradient. $$\eqalign{ \Big((F\otimes I) + (I\otimes F)K\Big)\,{\rm vec}(dF) &= \Big((X^{+T}\otimes P) + (P\otimes X^{+T})K\Big)\,{\rm vec}(dB) \cr M\,df &= N\,db \cr \frac{\partial f}{\partial b} &= M^+N \cr }$$ where $f={\rm vec}(F)$, $b={\rm vec}(B)$, and $K$ is the commutation matrix associated with the Kronecker product, i.e. $K\,{\rm vec}(Z) = {\rm vec}(Z^T)$ for $Z\in\mathbb{R}^{s\times t}$.
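The linear relation $M\,df = N\,db$ can be sanity-checked numerically. Below is a minimal NumPy sketch, not part of the derivation itself: it builds $M$ and $N$ with column-major ${\rm vec}$ and checks that a central finite-difference Jacobian $J$ of $f$ satisfies $MJ \approx N$. The helper names (`polar_factor`, `commutation_matrix`, `linear_system`, `fd_jacobian`) and the test matrices are illustrative assumptions. The sketch deliberately tests the relation rather than $M^+N$: for $t \ge 2$, $M$ annihilates directions of the form $dF = F\Omega$ with $\Omega$ skew-symmetric, so $M^+N$ is the minimum-norm solution of the system and need not coincide with $J$.

```python
import numpy as np

def vec(Z):
    """Column-major vectorization, matching the Kronecker identities used above."""
    return Z.reshape(-1, order="F")

def polar_factor(A, B):
    """f(B) = (A - B) [(A - B)'(A - B)]^{-1/2}, using the symmetric positive definite root."""
    D = A - B
    w, V = np.linalg.eigh(D.T @ D)      # D'D is SPD when D has full column rank
    return D @ (V * w**-0.5) @ V.T      # V diag(w^{-1/2}) V'

def commutation_matrix(m, n):
    """K with K @ vec(Z) = vec(Z.T) for Z of shape (m, n)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[i * n + j, j * m + i] = 1.0
    return K

def linear_system(A, B):
    """Assemble M and N from the vectorized identity M df = N db."""
    s, t = A.shape
    X = B - A
    F = polar_factor(A, B)              # equals -X (X'X)^{-1/2}
    Xp = np.linalg.pinv(X)
    P = np.eye(s) - X @ Xp              # projector onto the nullspace of X'
    K = commutation_matrix(s, t)
    I = np.eye(s)
    M = np.kron(F, I) + np.kron(I, F) @ K
    N = np.kron(Xp.T, P) + np.kron(P, Xp.T) @ K
    return M, N

def fd_jacobian(A, B, h=1e-6):
    """Central finite-difference Jacobian d vec(f) / d vec(B)."""
    s, t = B.shape
    J = np.zeros((s * t, s * t))
    for k in range(s * t):
        e = np.zeros(s * t)
        e[k] = h
        E = e.reshape((s, t), order="F")
        J[:, k] = (vec(polar_factor(A, B + E)) - vec(polar_factor(A, B - E))) / (2 * h)
    return J

# Illustrative 4 x 2 example; A - B has rank 2 and distinct singular values.
A = np.array([[3.0, 1.0], [0.0, 2.0], [1.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
M, N = linear_system(A, B)
J = fd_jacobian(A, B)
print(np.allclose(M @ J, N, atol=1e-6))
```

Comparing `np.linalg.pinv(M) @ N` against `J` for the same inputs is a quick way to see how much of the Jacobian the minimum-norm solution captures in a given case.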

greg
  • 35,825