0

This questions is from the Kuhn-Tucker paper "Nonlinear Programming" in Section 2 Lemma 1. I don't understand how those conditions are necessary for a saddle point. I always thought that a saddle point was defined where the gradient is zero and the second derivative characterizes the shape of the saddle point.

I don't understand how the gradient could ever be non-zero. And assuming it could be non-zero why the gradient must be positive in one variable and negative in the other.

After this, everything else makes sense. Thanks in advance. A reference to something else that explains this would suffice, but I can't find one.

  • What your looking for is an understanding of he Morse Lemma. It states, in two dimensions, that if we have a point with zero gradient and nonzero Hessian, that the function is asymmtopically of the form $\pm x^2\pm y^2$. We have also, that the Hessian being zero implies that we can write it in the form $x^2-y^2$, which is what you want. A good refrence is Milnor's Book on Morse theory. – Pax Apr 13 '15 at 21:12
  • I don't see how Morse Lemma connects here. Could you explain a bit more please? I understood the conditions in the paper to state that the gradient of the Lagrangian can be non-zero at certain optima granted that the inner product of the gradient and the point vector is zero. Also, as far as I understood in the paper, the Hessian wasn't mentioned. –  Apr 19 '15 at 05:11

1 Answers1

0

(This should be a comment, but got a bit long.)

The key is that the "saddle point" is not what you think it is. Note that right before Lemma 1 they defined their notion of a "Saddle Point problem" which in particular searches for a "saddle point" on a manifold with boundary given by the set $\{x \geq 0, u\geq0\}$. The same way that in a optimization problem that a maximizer or a minimizer may occur not in the interior, and instead on the boundary, a saddle point should also be allowed to occur on the boundary of this set. So they have a very specific generalized definition for a saddle point, which happens to coincide with the classical intuition in the interior of the set.

So in particular, the gradient in Lemma 1 can be nonzero if the solution of the saddle point problem happens on the boundary.

To get some intuition you can imagine the classical saddle $x^2 - y^2$, and looking at where the solution to the saddle point problem (as defined by the paper) lies over the sets of the form $\{x \geq x', y \geq y'\}$ where $x'$ and $y'$ are fixed (but arbitrary) numbers.

Willie Wong
  • 73,139
  • Okay, I believe I get that. Does this explain why the proof claims that the gradient must be zero except possibly where the corresponding dimension of $ x $ goes to zero? My intuition tells me that the proof is trying to say that where $ x_i = 0 $, we are allowed to have a gradient term $ \delta x_i \ne 0 $. Then by the constraint, we know that $ x_i = 0 $ and the inner product thus remains zero. And the sign of the gradient term is given by the fact that we must not be able to optimize by going away from the boundary? Am I correct? –  Apr 16 '15 at 00:34
  • Roughly speaking, it sounds like you got it. Some of your sentences in the middle of your comment are too brief for me to make a full assessment, but it does sounds like you are at the very least on the right track. – Willie Wong Apr 16 '15 at 07:51
  • I finally got it. Wow. That is pretty easy. I completely forgot about the set of possible x, u. –  Aug 13 '15 at 19:33