derivative of a symmetric bilinear form (quadratic form version)

Question

Let $A=A^T\in \mathbb R^{k\times k}$ be a nonzero symmetric matrix and define $f:\mathbb R^k\to\mathbb R$ by $$f(x):=x^TAx.$$ Why is $df(x)\xi=2x^TA\xi$ for $x,\xi\in\mathbb R^k$?

To begin to see why, try the cases $k=1$, $k=2$. – janmarqz Dec 16 '14 at 17:07
Possible duplicate, see here (very similar). – alexjo Dec 16 '14 at 17:11

score 5 · Accepted Answer · answered Dec 16 '14 at 17:30

It's because

\begin{align}df(x)\xi &= \frac{d}{dt}|_{t = 0} f(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x + t\xi)^T A(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^T + t\xi^T) A(x + t\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^T + t\xi^T)(Ax + tA\xi)\\ &= \frac{d}{dt}|_{t = 0} (x^TAx + t(\xi^T Ax + x^TA\xi) + t^2\xi^TA\xi)\\ &= \xi^T Ax + x^TA\xi\\ &= 2x^TA\xi \end{align}

The last step uses that $\xi^TAx$ is a scalar and $A$ is symmetric, so $\xi^TAx = (\xi^TAx)^T = x^TA^T\xi = x^TA\xi$.
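If you want to sanity-check the formula numerically, here is a minimal sketch in plain Python. The helper names (`quad_form`, `bilinear`, `directional_derivative`) and the sample values of $A$, $x$, $\xi$ are my own choices for illustration, not part of the question. It compares a central-difference estimate of $\frac{d}{dt}|_{t=0} f(x+t\xi)$ with $2x^TA\xi$.

```python
def bilinear(A, x, y):
    """Return x^T A y for a square matrix A given as nested lists."""
    n = len(x)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))


def quad_form(A, x):
    """Return f(x) = x^T A x."""
    return bilinear(A, x, x)


def directional_derivative(A, x, xi, t=1e-6):
    """Central-difference estimate of d/dt f(x + t*xi) at t = 0."""
    x_plus = [a + t * b for a, b in zip(x, xi)]
    x_minus = [a - t * b for a, b in zip(x, xi)]
    return (quad_form(A, x_plus) - quad_form(A, x_minus)) / (2 * t)


# Illustrative (assumed) data: a symmetric 2x2 matrix, a point, a direction.
A = [[2.0, 1.0], [1.0, -3.0]]
x = [1.0, 2.0]
xi = [0.5, -1.0]

numeric = directional_derivative(A, x, xi)
exact = 2 * bilinear(A, x, xi)  # the claimed value 2 x^T A xi

print("finite difference:", numeric)
print("2 x^T A xi:       ", exact)
```

The two printed numbers should agree up to floating-point rounding. For a quadratic $f$ the $t^2$ terms cancel in the central difference, so the step size $t$ has no truncation error here.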

answered Dec 16 '14 at 17:30 by kobe

Because both terms are real scalars; and $a^T=a$ when $a$ is a scalar. – Michael Grant Dec 16 '14 at 17:52
@pxc3110 -- What inequality? – Robin Goodfellow Dec 16 '14 at 19:07
that's supposed to be equality, sorry :) – pxc3110 Dec 16 '14 at 19:13

score 3 · Answer 2 · edited Apr 13 '17 at 12:20

There's a nice answer linked by alexjo using coordinates. Here's an answer without coordinates, using the fact that we know the derivative of a linear map:

Consider $h(x,y) = x^T A y$, with $h: \mathbb{R}^k \times \mathbb{R}^k \rightarrow \mathbb{R}$.

We can restrict $h$ to each factor with $h(x,-) : \{x\} \times \mathbb{R}^k \rightarrow \mathbb{R}$ and $h(-,y): \mathbb{R}^k \times \{y\} \rightarrow \mathbb{R}$.

Then $dh_{x,y} (\xi_x\oplus \xi_y) = dh_{x,y}(\xi_x \oplus 0) + dh_{x,y}(0\oplus \xi_y) = d(h(-,y))_x (\xi_x) + d(h(x,-))_y (\xi_y)$

Because $h(-,y)(x) = x^T A y = y^T A x$ is linear in $x$, we get $d(h(-,y))_x(\xi_x) = y^T A \xi_x$.

Similarly, because $h(x,-)(y) = x^T A y$ is linear in $y$, we get $d(h(x,-))_y(\xi_y) = x^T A \xi_y$.

Thus $dh_{x,y}(\xi_x \oplus \xi_y) = y^T A \xi_x + x^T A \xi_y$.

Finally, we have $f(x) = h(x,x) = (h \circ \Delta)(x)$ for $\Delta: \mathbb{R}^k \rightarrow \mathbb{R}^k \times \mathbb{R}^k$ given by $\Delta(x) = (x,x)$.

Since $\Delta$ is linear, we have $d\Delta_x(\xi) = \xi \oplus \xi$.

Thus, by the chain rule, $df_x(\xi) = (dh_{x,x} \circ d\Delta_x)(\xi) = dh_{x,x}(\xi \oplus \xi) = x^T A \xi + x^T A \xi = 2 x^T A \xi$.
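The two ingredients of this argument can also be checked numerically. Below is a rough sketch in plain Python; the function names (`h`, `dh_numeric`, `dh_formula`) and the sample vectors are assumptions made for illustration. It compares a central-difference estimate of $dh_{x,y}(\xi_x \oplus \xi_y)$ with $y^T A \xi_x + x^T A \xi_y$, then restricts to the diagonal $y = x$, $\xi_x = \xi_y = \xi$.

```python
def h(A, x, y):
    """Bilinear form h(x, y) = x^T A y, with A given as nested lists."""
    n = len(x)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))


def dh_numeric(A, x, y, xi_x, xi_y, t=1e-6):
    """Central-difference estimate of dh_{x,y}(xi_x + xi_y direction)."""
    x_plus = [a + t * b for a, b in zip(x, xi_x)]
    y_plus = [a + t * b for a, b in zip(y, xi_y)]
    x_minus = [a - t * b for a, b in zip(x, xi_x)]
    y_minus = [a - t * b for a, b in zip(y, xi_y)]
    return (h(A, x_plus, y_plus) - h(A, x_minus, y_minus)) / (2 * t)


def dh_formula(A, x, y, xi_x, xi_y):
    """The formula from the answer: y^T A xi_x + x^T A xi_y."""
    return h(A, y, xi_x) + h(A, x, xi_y)


# Illustrative (assumed) data; A must be symmetric for the formula to apply.
A = [[2.0, 1.0], [1.0, -3.0]]
x, y = [1.0, 2.0], [-1.0, 3.0]
xi_x, xi_y = [0.5, -1.0], [2.0, 1.0]

print(dh_numeric(A, x, y, xi_x, xi_y), dh_formula(A, x, y, xi_x, xi_y))

# On the diagonal (y = x, xi_x = xi_y = xi) the formula becomes 2 x^T A xi.
xi = [0.5, -1.0]
print(dh_formula(A, x, x, xi, xi), 2 * h(A, x, xi))
```

Each printed pair should match (the first pair only up to rounding), which is the numerical counterpart of composing $dh_{x,x}$ with $d\Delta_x$.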
