Combining weight decay with conjugate gradient

Asked Sep 26 '13 at 20:22

Active Sep 26 '13 at 20:22

Viewed 74 times

We can use weight decay to avoid overfitting when training a neural network. The method is commonly applied with gradient descent learning and Bayesian learning, but I want to combine it with scaled conjugate gradient. Is that possible? And how does it affect the weight update?

asked Sep 26 '13 at 20:22

Beginner

If the function to be optimized is (locally) convex enough, then you can use the gradient only as a hint of the direction in which to apply the update. For example, you can keep only the sign of each coordinate of the gradient, or use conjugate gradient in the hope that it will converge faster, or find the optimal $\eta$ at each step, etc. Does that mean something to you? – reuns May 04 '16 at 01:26
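As a rough illustration of how the two ideas fit together, here is a minimal sketch. It assumes weight decay is written as an L2 penalty $\frac{\lambda}{2}\lVert w \rVert^2$ added to the error, so the gradient seen by the optimizer becomes $g + \lambda w$ and the conjugate directions are built from that penalized gradient. It uses plain conjugate gradient with a Fletcher-Reeves update and an exact line search on a small linear least-squares problem, not the scaled conjugate gradient algorithm itself. The names `cg_fit`, `grad`, `curvature` and `lam` are hypothetical and chosen only for this example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def grad(w, X, y, lam):
    # Gradient of 0.5*sum((x.w - y)^2) + 0.5*lam*||w||^2.
    # The weight decay term contributes lam * w.
    residuals = [dot(x, w) - t for x, t in zip(X, y)]
    g = [sum(r * x[j] for r, x in zip(residuals, X)) for j in range(len(w))]
    return [gj + lam * wj for gj, wj in zip(g, w)]

def curvature(d, X, lam):
    # d^T H d with H = X^T X + lam * I (Hessian of the penalized error).
    return sum(dot(x, d) ** 2 for x in X) + lam * dot(d, d)

def cg_fit(X, y, lam, iters=50, tol=1e-12):
    w = [0.0] * len(X[0])
    g = grad(w, X, y, lam)
    d = [-gj for gj in g]              # first direction: steepest descent
    for _ in range(iters):
        gg = dot(g, g)
        if gg < tol:                   # gradient is (numerically) zero
            break
        # Exact step length along d for a quadratic error.
        alpha = -dot(g, d) / curvature(d, X, lam)
        w = [wj + alpha * dj for wj, dj in zip(w, d)]
        g_new = grad(w, X, y, lam)
        beta = dot(g_new, g_new) / gg  # Fletcher-Reeves coefficient
        d = [-gn + beta * dj for gn, dj in zip(g_new, d)]
        g = g_new
    return w

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w_plain = cg_fit(X, y, lam=0.0)  # no weight decay
w_decay = cg_fit(X, y, lam=1.0)  # with weight decay
```

In this toy problem the only change weight decay makes to the optimizer is the extra `lam * w` in the gradient and `lam * dot(d, d)` in the curvature; the solution with `lam=1.0` has a smaller norm than the one with `lam=0.0`. For a real network the error is not quadratic, so the step length would have to come from a line search or from the scaling used in scaled conjugate gradient rather than from the closed form used here.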


0 Answers