A proximal gradient algorithm for weight decay training in neural networks