I'm thinking that the second directional derivative, if both dd's are evaluated in the same direction, will just give you the concavity (the second scalar derivative) in that direction. Is that right?

But what if the second directional derivative is evaluated in a different direction? As in $D_{\vec v}D_{\vec u} f(\vec x)$ where $\vec v\ne \vec u$. Then what would this thing mean? Does it still have to do with concavity?

  • If you want to reduce to the scalar setting, I think you are basically stuck with some linear algebra. Specifically, for ease of visualization, consider a point where the gradient is zero. You have a matrix of the 4 second partial derivatives, called the Hessian. Along the lines given by the eigenvectors of this matrix, provided the corresponding eigenvalue $\lambda$ isn't zero, the function looks like the parabola $\lambda x^2$. – Ian May 21 '16 at 02:00
  • Geometrically, when the mixed partials are zero, these lines are exactly along the coordinate axes; when the unmixed partials are zero, these lines are exactly the lines $y=x$ and $y=-x$. In intermediate cases the lines lie between these two extremes (but one can prove with linear algebra that they are always perpendicular). – Ian May 21 '16 at 02:00
  • Note that in $\mathbb{R}^2$ the operator $\frac{\partial^2 }{\partial x \partial y}$ is of that type, and it is useful for finding the Taylor expansion of $f$. Without it, you cannot say what $f(\vec{x}+h \vec{v}) - f(\vec{x}) - h \nabla f(\vec{x}) \cdot \vec{v}$ looks like; it equals $\frac{h^2}{2} \vec{v}^T H_{f}(\vec{x}) \vec{v} + o(h^2)$, where $H_{f}(\vec{x})$ is the Hessian matrix, $H_{f}(\vec{x})_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(\vec{x})$, i.e. the matrix of the second-order directional derivatives. – reuns May 21 '16 at 02:00
  • @Ian So would it represent the concavity in the direction $\vec u+\vec v$? – user341126 May 21 '16 at 02:03
  • @user341126 The concavity in a given direction $d$ ($d$ is some unit vector) is $d^T H d$ where $H$ is the Hessian. – Ian May 21 '16 at 02:06
  • yes that's what I meant with my comment – reuns May 21 '16 at 02:07
  • Well, I know that $v^T(Hf)v = D_vD_vf$ (when $Hf$ exists), so then I guess the second directional derivative only represents concavity when both are taken in the same direction? Then I still wouldn't know how to interpret (geometrically or even analytically) what $D_vD_u f(x)$ is. – user341126 May 21 '16 at 02:09
  • Is $D_u D_v f$ perhaps $u^T H v$? In that case there is still an interpretation but it is a bit subtle (something to do with angles). – Ian May 21 '16 at 02:10
  • Yes. – user341126 May 21 '16 at 02:10
  • @Ian : the interpretation is by writing $D_u D_v f$ as the difference of values of $f$ on an infinitesimal parallelogram ? I don't see anything else – reuns May 21 '16 at 02:14
  • And this parallelogram is important: it is the picture behind torsion and curvature, see https://en.wikipedia.org/wiki/Torsion_tensor and https://en.wikipedia.org/wiki/Riemann_curvature_tensor – reuns May 21 '16 at 02:16
  • I tend to think of this as "How fast is $D_v f$ changing if I am moving in the direction of $u$?" – Alex Pavellas May 24 '16 at 16:47
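
The two readings suggested in the comments, $D_uD_vf = u^T H v$ and the difference of values of $f$ over a small parallelogram, can be compared numerically. The following is only a sketch: the test function `f`, its hand-computed Hessian `hessian_f`, the point, the directions and the step size `h` are all assumptions chosen for illustration, not something taken from the thread.

```python
def mixed_directional_derivative(f, x, u, v, h=1e-4):
    """Estimate D_u D_v f(x) from the values of f at the four corners
    of the small parallelogram spanned by h*u and h*v at x."""
    def corner(a, b):
        return [xi + a * h * ui + b * h * vi for xi, ui, vi in zip(x, u, v)]
    return (f(corner(1, 1)) - f(corner(1, 0))
            - f(corner(0, 1)) + f(corner(0, 0))) / h ** 2


def bilinear_form(H, u, v):
    """Compute u^T H v for a square matrix H given as nested lists."""
    return sum(u[i] * H[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


# Hypothetical test function f(x, y) = x^2 y + 3 x y (an assumption).
def f(p):
    x, y = p
    return x * x * y + 3 * x * y


# Its Hessian, computed by hand: f_xx = 2y, f_xy = 2x + 3, f_yy = 0.
def hessian_f(p):
    x, y = p
    return [[2 * y, 2 * x + 3], [2 * x + 3, 0.0]]


point = [1.0, 2.0]
u = [1.0, 0.0]
v = [0.6, 0.8]

estimate = mixed_directional_derivative(f, point, u, v)
exact = bilinear_form(hessian_f(point), u, v)
print(estimate, exact)  # the two values should agree to a few decimal places
```

Swapping `u` and `v` in the call leaves the estimate essentially unchanged, which is the numerical face of the symmetry of the Hessian for a twice continuously differentiable $f$.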

1 Answer

The answer to this question rests on linear algebra. If $f$ is twice continuously differentiable (just to avoid pathological cases), then the partial derivatives of $f$ commute. I'll skip the proof, but the same goes for the directional derivatives of $f$: $D_u(D_v f) = D_v(D_u f)$. Furthermore, if $u,v$ are linearly independent, then they form a basis for $\mathbb{R}^2$, and we can study the quadratic term in the multivariate Taylor expansion using the theory of quadratic forms. In particular,

$$ Q(h,k) = [h,k]\left[ \begin{array}{cc} D_uD_uf & D_uD_v f \\ D_uD_v f & D_vD_v f \end{array} \right]\left[ \begin{array}{c}h \\ k \end{array} \right] = \lambda_1\bar{x}^2+\lambda_2\bar{y}^2$$

where $\bar{x}, \bar{y}$ are eigencoordinates and $\lambda_1, \lambda_2$ are the eigenvalues of $A = \left[ \begin{array}{cc} D_uD_uf & D_uD_v f \\ D_uD_v f & D_vD_v f \end{array} \right]$. Using linear algebra and the specific form of $A$,

$$ \text{trace}(A) = \lambda_1+\lambda_2 = D_uD_uf+D_vD_vf $$

and

$$ \text{det}(A) = \lambda_1\lambda_2 = (D_uD_uf)(D_vD_vf)-(D_uD_vf)^2$$

If $\nabla f =0$ at a point $p$, then $f(p+hu+kv) = f(p) + \tfrac{1}{2}Q(h,k)+ \cdots$, which means that the nature of $z = f(x,y)$ near $p$ is governed by $Q$. It's easy to see:

  1. If $\lambda_1\lambda_2 <0$, then $f$ increases in some directions and decreases in others near $p$, hence we face a saddle point. Specifically, in terms of directional derivatives, $(D_uD_uf)(D_vD_vf)-(D_uD_vf)^2 < 0$ gives $\text{det}(A)<0$, and hence the eigenvalues differ in sign.
  2. If $\lambda_1, \lambda_2>0$, then $f$ increases as we travel away from $p$; this is manifest from the formula $\lambda_1\bar{x}^2+\lambda_2\bar{y}^2$ for $Q$. This requires $(D_uD_uf)(D_vD_vf)-(D_uD_vf)^2 > 0$ and $\text{trace}(A) >0$, but, if you think about it, given $\text{det}(A)>0$, $D_uD_u f>0$ implies $D_vD_v f>0$. In short, if $(D_uD_uf)(D_vD_vf)-(D_uD_vf)^2 > 0$ and $D_uD_u f>0$, then $f$ has a local minimum at $p$.
  3. If $\lambda_1, \lambda_2<0$, then $f$ decreases as we travel away from $p$; this is manifest from the formula $\lambda_1\bar{x}^2+\lambda_2\bar{y}^2$ for $Q$. This requires $(D_uD_uf)(D_vD_vf)-(D_uD_vf)^2 > 0$ and $\text{trace}(A) < 0$, but, if you think about it, given $\text{det}(A)>0$, $D_uD_u f<0$ implies $D_vD_v f<0$. In short, if $(D_uD_uf)(D_vD_vf)-(D_uD_vf)^2 > 0$ and $D_uD_u f<0$, then $f$ has a local maximum at $p$.
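
The three cases can be sketched as a small routine that takes the second directional derivatives along two linearly independent directions $u, v$ and uses $\text{det}(A) = (D_uD_uf)(D_vD_vf)-(D_uD_vf)^2$. The function names, the two example Hessians and the choice $u=(1,0)$, $v=(1,1)$ are assumptions made for illustration; the "inconclusive" branch for $\text{det}(A)=0$ is also an addition, since the list above does not cover that case.

```python
def second_directional_derivatives(H, u, v):
    """Return (D_u D_u f, D_u D_v f, D_v D_v f) from a Hessian H,
    using D_a D_b f = a^T H b for a twice continuously differentiable f."""
    def bil(a, b):
        return sum(a[i] * H[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    return bil(u, u), bil(u, v), bil(v, v)


def classify_critical_point(duu, duv, dvv):
    """Classify a critical point from the entries of A = [[duu, duv], [duv, dvv]]."""
    det = duu * dvv - duv ** 2
    if det < 0:
        return "saddle"
    if det > 0:
        # With det > 0, duu and dvv share a sign, so checking duu suffices.
        return "local minimum" if duu > 0 else "local maximum"
    return "inconclusive"  # det == 0: the quadratic term does not decide


u = [1.0, 0.0]
v = [1.0, 1.0]  # linearly independent of u, but not orthogonal to it

# Hypothetical examples, both with a critical point at the origin:
H_bowl = [[2.0, 1.0], [1.0, 2.0]]     # f = x^2 + x y + y^2
H_saddle = [[2.0, 0.0], [0.0, -2.0]]  # f = x^2 - y^2

bowl_result = classify_critical_point(*second_directional_derivatives(H_bowl, u, v))
saddle_result = classify_critical_point(*second_directional_derivatives(H_saddle, u, v))
print(bowl_result, saddle_result)
```

In the second example $D_uD_uf = 2$ and $D_vD_vf = 0$, so the two pure second derivatives alone show no negative curvature; it is the mixed term $D_uD_vf = 2$ that makes $\text{det}(A)$ negative and exposes the saddle.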

At this point you may object that you meant to study $f: \mathbb{R}^n \rightarrow \mathbb{R}$, and ask what the significance of mixed directional derivatives is in that context for $n \geq 3$. It's more complicated: to give the complete picture, I need $n$ linearly independent directions and all $n(n+1)/2$ independent second directional derivatives along them (pure and mixed). With all that data I can construct the $n \times n$ analog of $A$ and again characterize the nature of the graph $x_{n+1} = f(x_1, \dots , x_n)$ at a critical point in terms of the eigenvalues of $A$ (assuming no eigenvalue of $A$ is zero; in a degenerate case such as $A=0$ we'd need higher-derivative data...)
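
A rough sketch of the $n$-dimensional construction follows. It builds the $n \times n$ matrix with entries $D_{d_i}D_{d_j}f = d_i^T H d_j$ from $n$ linearly independent directions. Instead of computing eigenvalues, it swaps in a Cholesky-based definiteness test so that only the standard library is needed; this separates minima and maxima but lumps saddles together with degenerate cases. The example Hessians, the directions and the tolerance `tol` are assumptions for illustration.

```python
import math


def second_derivative_matrix(H, directions):
    """Build A with A[i][j] = d_i^T H d_j, i.e. D_{d_i} D_{d_j} f."""
    def bil(a, b):
        return sum(a[i] * H[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    n = len(directions)
    return [[bil(directions[i], directions[j]) for j in range(n)]
            for i in range(n)]


def is_positive_definite(A, tol=1e-12):
    """Attempt a Cholesky factorization A = L L^T; it succeeds exactly
    when the symmetric matrix A is positive definite (up to tol)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def classify(A):
    """Classify a critical point from the symmetric matrix A."""
    if is_positive_definite(A):
        return "local minimum"
    if is_positive_definite([[-a for a in row] for row in A]):
        return "local maximum"
    return "saddle or degenerate"


# Three linearly independent, non-orthogonal directions in R^3 (an assumption).
directions = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

# Hypothetical Hessians at a critical point:
H_min = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 2.0]]      # x^2 + y^2 + z^2 + x y
H_mixed = [[2.0, 0.0, 0.0], [0.0, -2.0, 0.0], [0.0, 0.0, 2.0]]   # x^2 - y^2 + z^2

A_min = second_derivative_matrix(H_min, directions)
A_mixed = second_derivative_matrix(H_mixed, directions)
print(classify(A_min), classify(A_mixed))
```

Because the matrix is symmetric, only its $n(n+1)/2$ entries on and above the diagonal are independent, which matches the count of second directional derivatives mentioned above.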

Another way to look at my answer is this: the mixed directional derivatives also have to do with concavity. The eigenvalues tell us the concavity of the sections of the graph in the direction of eigenvectors. If the point is critical, then that concavity reveals the extremal nature of the point.
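
One way to make the link to concavity explicit is the polarization identity. As a sketch, assume $f$ is twice continuously differentiable with Hessian $H$, and that $D_w f = w \cdot \nabla f$ is taken without normalizing $w$ (a convention assumed here, so that $D_aD_bf = a^T H b$). Expanding the second derivative along $u+v$ gives

$$ D_{u+v}D_{u+v}f = (u+v)^T H (u+v) = D_uD_uf + 2\,D_uD_vf + D_vD_vf $$

and therefore

$$ D_uD_vf = \tfrac{1}{2}\left[ D_{u+v}D_{u+v}f - D_uD_uf - D_vD_vf \right]. $$

So the mixed derivative is not itself the concavity along $u+v$; it measures how much the concavity along $u+v$ exceeds the sum of the concavities along $u$ and $v$ separately.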

James S. Cook