5

I'm having a difficult time determining whether the following function is convex:

$$f(X) = \log {\rm det}(X^T A X)$$

where $A \in \mathbb{R}^{r \times r}$ is a symmetric positive definite matrix and $X \in \mathbb{R}^{r \times u}$ with $u < r$ and $X^T X = I$. I've worked on finding the second derivative with respect to $X$, but am not very confident in my solution. Any suggestions on how to proceed would be appreciated.

EDIT: Because the domain of $X$ is nonconvex, the function cannot be convex, as pointed out by @RobertIsrael.
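The nonconvexity of the domain can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `random_orthonormal` and `is_feasible` are hypothetical and chosen only for illustration. It takes a feasible $X$ and its negative, both of which satisfy $X^T X = I$, and shows that their midpoint (the zero matrix) does not.

```python
import numpy as np

def random_orthonormal(r, u, rng):
    """Return an r x u matrix with orthonormal columns (reduced QR factor)."""
    q, _ = np.linalg.qr(rng.standard_normal((r, u)))
    return q

def is_feasible(X, tol=1e-10):
    """Check the constraint X^T X = I."""
    u = X.shape[1]
    return np.allclose(X.T @ X, np.eye(u), atol=tol)

rng = np.random.default_rng(0)
X1 = random_orthonormal(5, 2, rng)   # feasible point
X2 = -X1                             # also feasible, since (-X)^T(-X) = X^T X
midpoint = 0.5 * (X1 + X2)           # the zero matrix

print(is_feasible(X1), is_feasible(X2), is_feasible(midpoint))
```

Since a convex set must contain the midpoint of any two of its points, one such pair is enough to show the constraint set is not convex.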

user23658

2 Answers

2

(EDIT: If $A,X$ are both square, then ...) As the comment suggests, since the determinant is multiplicative and $\det X^T=\det X$, we have $$\det(X^TAX)=\alpha\det(X)^2$$ where $\alpha=\det(A)>0$.

However, the function $g(x)=x^2$ is not logarithmically convex, since $\log g(x) = 2\log|x|$ is concave for $x>0$; therefore $f(X)$ is not convex either.
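Both steps of this argument can be sanity-checked numerically for the square case. This is a rough sketch, assuming NumPy; the test matrices and the points $a=1$, $b=3$ are arbitrary choices made for illustration. It compares the two sides of the determinant identity and then tests midpoint convexity of $\log x^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
r = 4
M = rng.standard_normal((r, r))
A = M @ M.T + r * np.eye(r)        # symmetric positive definite by construction
X = rng.standard_normal((r, r))    # square X (no orthogonality constraint here)

# det(X^T A X) should equal det(A) * det(X)^2
lhs = np.linalg.det(X.T @ A @ X)
rhs = np.linalg.det(A) * np.linalg.det(X) ** 2

def h(x):
    """log of g(x) = x^2."""
    return np.log(x ** 2)

# Midpoint convexity would require h((a+b)/2) <= (h(a) + h(b)) / 2
a, b = 1.0, 3.0
mid_value = h(0.5 * (a + b))         # log 4
chord_value = 0.5 * (h(a) + h(b))    # log 3

print(np.isclose(lhs, rhs), mid_value > chord_value)
```

The midpoint value exceeds the chord value, so $\log x^2$ violates the convexity inequality on this pair of points.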

0

The Hessian can be used to find a sufficient (but not necessary) condition for convexity. According to the author, the first derivative of the objective function is $$ D^1 := \frac{\partial}{\partial X} (\log\det(X^T A X)) = 2(X^T A X)^{-1} AX = 2 B^{-1} A X $$

The second derivative can be computed by going into index notation (summation over repeated indices). Then $$ (D^1)_{in} = 2B^{-1}_{il} A_{l m} X_{mn} $$ Then $$ (D^2)_{inpq} := \frac{\partial (D^1)_{in}}{\partial X_{pq}} = 2\left[\frac{\partial B^{-1}_{il}}{\partial X_{pq}} X_{mn} + B^{-1}_{il} \frac{\partial X_{mn}}{\partial X_{pq}}\right] A_{lm} $$

Now, (without symmetrizing) $$ \frac{\partial X_{mn}}{\partial X_{pq}} = \delta_{mp}\delta_{nq} $$ Therefore, $$ B_{il}^{-1} \frac{\partial X_{mn}}{\partial X_{pq}} A_{lm} = B_{il}^{-1} \delta_{mp}\delta_{nq} A_{lm} = \delta_{nq} B_{il}^{-1} A_{lp} $$

Consider the case where $A = I$. Then $B = X^T A X = X^T X = I$ and $D^1 = 2X$. So $$ (D^2)_{inpq} = 2\delta_{ip}\delta_{nq}\,. $$ That is the simplest result you can get because $X$ is not square. Not sure if this helps in determining convexity.
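A finite-difference check is one way to gain confidence in a derivative like this before going on to the Hessian. The sketch below assumes NumPy and treats $X$ as unconstrained. It writes the gradient in the ordering $2AXB^{-1}$ with $B = X^T A X$, which is an assumption of this sketch chosen so that the matrix shapes conform ($r \times r$ times $r \times u$ times $u \times u$), not the ordering written in the answer. The function names are hypothetical.

```python
import numpy as np

def f(X, A):
    """log det(X^T A X), computed stably via slogdet."""
    _, logdet = np.linalg.slogdet(X.T @ A @ X)
    return logdet

def grad_f(X, A):
    """Candidate gradient 2 A X B^{-1}, with B = X^T A X (assumed ordering)."""
    B = X.T @ A @ X
    return 2.0 * A @ X @ np.linalg.inv(B)

def numerical_grad(X, A, eps=1e-6):
    """Entrywise central-difference approximation of df/dX."""
    G = np.zeros_like(X)
    for p in range(X.shape[0]):
        for q in range(X.shape[1]):
            E = np.zeros_like(X)
            E[p, q] = eps
            G[p, q] = (f(X + E, A) - f(X - E, A)) / (2 * eps)
    return G

rng = np.random.default_rng(2)
r, u = 5, 2
M = rng.standard_normal((r, r))
A = M @ M.T + r * np.eye(r)                          # symmetric positive definite
X, _ = np.linalg.qr(rng.standard_normal((r, u)))     # orthonormal columns

max_err = np.max(np.abs(grad_f(X, A) - numerical_grad(X, A)))
print(max_err)

# Special case A = I at an orthonormal X: the gradient reduces to 2X
print(np.allclose(grad_f(X, np.eye(r)), 2 * X))
```

The same pattern (perturb one entry of $X$, difference the gradient) could be used to check individual entries of $(D^2)_{inpq}$.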

  • Not sure if you saw the edit above but the derivative you use should have an inverse and be scaled by 2. – user23658 Dec 04 '15 at 04:45