In "Global Convergence of Block Coordinate Descent in Deep Learning", the authors claim that BCD is a gradient-free method, but in "Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks", the authors compute gradients inside the BCD algorithm. Is BCD gradient-free or not?
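To make the distinction concrete, here is a minimal sketch of the two flavours as I understand them, on a toy least-squares problem. It is not taken from either paper: the objective, the block split, and the function names `bcd_exact` and `bcd_gradient` are assumptions for illustration only. The first variant solves each block subproblem in closed form (no gradient step), while the second takes a gradient step on each block.

```python
import numpy as np

def bcd_exact(A, b, blocks, sweeps=200):
    """Cyclic BCD on 0.5*||Ax - b||^2 where each block is minimised exactly."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for idx in blocks:
            # Target for this block: b minus the contribution of all other blocks.
            r = b - A @ x + A[:, idx] @ x[idx]
            # Closed-form block update via a small least-squares solve.
            x[idx] = np.linalg.lstsq(A[:, idx], r, rcond=None)[0]
    return x

def bcd_gradient(A, b, blocks, sweeps=2000):
    """Cyclic BCD on the same objective, using a gradient step per block."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for idx in blocks:
            A_i = A[:, idx]
            g = A_i.T @ (A @ x - b)            # partial gradient w.r.t. this block
            L = np.linalg.norm(A_i, 2) ** 2    # block Lipschitz constant
            x[idx] -= g / L                    # step size 1/L
    return x

# Toy problem (hypothetical data): 30 samples, 6 unknowns, 3 blocks of 2.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 6))
b = A @ rng.standard_normal(6) + 0.1 * rng.standard_normal(30)
blocks = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]

x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # reference solution
x_exact = bcd_exact(A, b, blocks)
x_grad = bcd_gradient(A, b, blocks)
```

On this toy problem both variants should approach the same least-squares solution `x_star`; they differ only in how each block is updated, which is the part I am unsure about when both are called "BCD".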
