
I am trying to make sense of this paper qwone.com/~jason/writing/convexLR.pdf

"Regularized Logistic Regression is Strictly Convex" by Jason D. M. Rennie.

I am following the proof and formula (1) is a given:

$$ -\ln P(\vec{y}\mid X,\vec{w}) = \sum_{i=1}^N \ln\left(1+e^{-y_{i}\vec{w}^T\vec{x}_i}\right) $$

Assuming:

$$ g(z) = \frac{1}{1+e^{-z}} $$

I also see how

$$ 1-g(z) = \frac{e^{-z}}{1+e^{-z}} $$

However, I don't follow how

$$ \frac{\partial g(z)}{\partial z} = -g(z)(1-g(z)) $$

If I differentiate g(z) w.r.t. z I get:

$$ \frac{\partial g(z)}{\partial z} = \frac{e^{-z}}{(1+e^{-z})^2} $$

which is $g(z)(1-g(z))$, not $-g(z)(1-g(z))$.
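Spelling out the chain-rule steps I used, with $u = 1+e^{-z}$ as a helper symbol of my own (it does not appear in the paper):

$$ \frac{\partial g(z)}{\partial z} = \frac{d}{dz}\,u^{-1} = -u^{-2}\,\frac{du}{dz} = -\frac{1}{(1+e^{-z})^2}\cdot\left(-e^{-z}\right) = \frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}} = g(z)\,(1-g(z)) $$

The two minus signs (one from the power rule, one from differentiating $e^{-z}$) cancel, which is why I end up with a positive sign.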

Also, when I work out (2) I get the negative of what is stated there (taking into account that it is the partial derivative of the negated L.H.S. of (1)):

$$ \frac{\partial (-\text{L.H.S. (1)} )}{\partial w_j} $$

Thanks in advance!

  • Your calculations are correct; the paper you were reading is not. – Math-fun Feb 21 '15 at 17:22
  • Can someone else check? I have seen other successful proofs that logistic regression optimization is a convex problem; I'm just wondering if there is anything I'm not seeing... – user1064285 Feb 22 '15 at 14:29

1 Answer


Here is the graph of $\displaystyle g(z)=\frac{1}{1+e^{-z}}$:

[Figure: graph of $g(z)$, an S-shaped curve rising from 0 to 1]

The function is clearly increasing, so its derivative must be positive, and as your calculation shows, $g'(z)=g(z)(1-g(z))>0$. The minus sign is probably a typo in the paper you mentioned. Furthermore, equation (2) in the paper is correct, since it is the derivative of $\displaystyle\log P(\cdot)$, not of $\displaystyle-\log P(\cdot)$. The second derivative in the paper is also correct.
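If you want a quick sanity check on the sign, here is a minimal Python sketch that compares a central finite difference of $g$ with the closed form $g(z)(1-g(z))$. The function names, the step size `h`, and the sample points are my own choices for illustration; none of them come from the paper.

```python
import math


def g(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def numeric_derivative(f, z, h=1e-6):
    """Central finite-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


# Arbitrary sample points chosen for illustration.
for z in (-3.0, -0.5, 0.0, 1.0, 4.0):
    approx = numeric_derivative(g, z)
    closed_form = g(z) * (1.0 - g(z))
    # The two columns should agree closely, and both should be positive.
    print(f"z={z:+.1f}  finite difference={approx:.8f}  g(1-g)={closed_form:.8f}")
```

Both columns should agree to several decimal places and stay positive at every point, which is consistent with $g'(z)=g(z)(1-g(z))$ and not with the negated version.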

Math-fun