Recently, in one of my projects, I used the cyclical learning rate (CLR) policy (Smith, 2017) while training a shallow neural network. It is a regular three-layer network.
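For context, the triangular form of the cyclical learning rate schedule from that paper can be sketched roughly as below. This is only a minimal illustration, not my exact training code: the function name `triangular_clr` and the values used for `base_lr`, `max_lr`, and `step_size` are placeholders I picked for the example.

```python
import math


def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Learning rate at a given training iteration under the triangular policy.

    The rate rises linearly from base_lr to max_lr over step_size iterations,
    then falls back to base_lr over the next step_size iterations, and repeats.
    """
    # Index of the current cycle (1-based); one cycle spans 2 * step_size iterations.
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Distance from the peak of the current cycle, scaled to the range [0, 1].
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    # Placeholder hyperparameters, chosen only for illustration.
    for it in (0, 1000, 2000, 3000, 4000):
        print(it, triangular_clr(it, base_lr=0.001, max_lr=0.006, step_size=2000))
```

The learning rate would be recomputed this way at every batch and passed to the optimizer before the update step.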
The dataset is this one: http://archive.ics.uci.edu/ml/datasets/Phishing+Websites
I have attached a screenshot of the model architecture.
With this policy, I am able to reduce the training time while getting similar accuracy to what I was getting earlier without CLR. However, the accuracy itself is low: only 53%.
Any suggestions on how to improve it?