This question is from the Kuhn-Tucker paper "Nonlinear Programming", Section 2, Lemma 1. I don't understand why those conditions are necessary for a saddle point. I always thought that a saddle point was defined as a point where the gradient is zero, with the second derivative characterizing the shape of the saddle.
I don't understand how the gradient could ever be non-zero at a saddle point. And assuming it could be non-zero, why must the gradient be positive in one variable and negative in the other?
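For reference, here is the statement I am asking about, as I read it. The notation is my own transcription and may not match the paper exactly: $\varphi(x,u)$ is a differentiable function, and $\varphi_x^0$, $\varphi_u^0$ denote its partial gradients evaluated at the candidate point $(x^0, u^0)$.

```latex
% Saddle value problem (as I understand it): find x^0 >= 0, u^0 >= 0 with
\varphi(x, u^0) \le \varphi(x^0, u^0) \le \varphi(x^0, u)
\quad \text{for all } x \ge 0,\ u \ge 0.

% Lemma 1: necessary conditions for (x^0, u^0) to solve the problem
\varphi_x^0 \le 0, \qquad (\varphi_x^0)^{\top} x^0 = 0, \qquad x^0 \ge 0, \tag{1}
\varphi_u^0 \ge 0, \qquad (\varphi_u^0)^{\top} u^0 = 0, \qquad u^0 \ge 0. \tag{2}
```

My confusion is with the inequalities $\varphi_x^0 \le 0$ and $\varphi_u^0 \ge 0$: under the definition I am used to, both gradients would simply be zero.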
Everything after this lemma makes sense to me. A reference to something that explains this would suffice, but I haven't been able to find one. Thanks in advance.