If an activation function has a jump discontinuity, can we still use backpropagation during training to compute derivatives and update the parameters?
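One way to see the issue concretely: at a jump discontinuity the derivative does not exist, and for a piecewise-constant activation such as the Heaviside step the derivative is zero everywhere else, so exact backpropagation either fails at the jump or propagates no gradient at all. A common practical workaround is a surrogate gradient such as the straight-through estimator (STE). Below is a minimal PyTorch sketch of this idea; the class name `StepSTE` and the clipping window `|x| <= 1` are illustrative choices, not a standard API.

```python
import torch

class StepSTE(torch.autograd.Function):
    """Heaviside step activation trained with a straight-through estimator.

    The true derivative of the step function is zero almost everywhere
    and undefined at the jump, so exact backpropagation yields no useful
    gradient. The backward pass below substitutes a surrogate gradient.
    """

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float()  # jump discontinuity at x = 0

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Surrogate: pass the gradient through where |x| <= 1, zero it
        # elsewhere (a "clipped" straight-through estimator).
        return grad_output * (x.abs() <= 1).float()

# Minimal usage: parameters still receive gradients despite the
# discontinuous forward pass.
x = torch.randn(4, requires_grad=True)
y = StepSTE.apply(x).sum()
y.backward()
print(x.grad)  # nonzero wherever |x| <= 1
```

In other words, backpropagation in the strict calculus sense cannot differentiate through the jump, but training can still proceed if one replaces the undefined/zero derivative with a hand-chosen surrogate, as is done for binarized and spiking neural networks.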
