I am training a backpropagation NN with Bayesian regularisation (BR). The input vector has 10 features and there is a single continuous output in (-1, +1) (regression). It is a shallow network with a single hidden layer of 30-50 neurons. The dataset size is 2000. The training:test ratio is 80:20, since BR does not require a validation set. The dataset is shuffled and a randomly picked 20% is held out for testing. Now I am curious: should I go with 10-fold cross-validation, or simply run the existing implementation 10 times and report the average performance? What is the norm in ANN application publications?
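The "run it 10 times and average" alternative amounts to repeated random 80:20 hold-out. A minimal sketch of that resampling loop is below; note the data are synthetic and `fit_and_score` uses plain least squares purely as a placeholder for the BR-trained network, so only the splitting/averaging logic reflects the setup described above.

```python
import numpy as np

rng = np.random.default_rng(42)
n, d = 2000, 10                           # dataset size and feature count from the question
X = rng.standard_normal((n, d))           # synthetic stand-in inputs
y = np.tanh(X @ rng.standard_normal(d))   # continuous target in (-1, +1)

def fit_and_score(train, test):
    """Placeholder for the BR-trained network: ordinary least squares."""
    w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    resid = X[test] @ w - y[test]
    return float(np.sqrt(np.mean(resid ** 2)))  # test RMSE

scores = []
for run in range(10):                     # 10 repeated random 80:20 hold-outs
    idx = rng.permutation(n)
    test, train = idx[:n // 5], idx[n // 5:]
    scores.append(fit_and_score(train, test))

# Report mean +/- sample standard deviation over the 10 runs
print(f"RMSE: {np.mean(scores):.3f} +/- {np.std(scores, ddof=1):.3f}")
```

Reporting the standard deviation alongside the mean is what makes the 10 runs informative to reviewers, regardless of which resampling scheme is chosen.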
A related question: if k-fold is recommended, should I first hold out a test set, then use the remaining 80% of the data for training and validation, and finally evaluate all the models on the out-of-sample hold-out set? Since BR does not require validation, under k-fold I can only use K-1 folds for training and the remaining fold for testing (not for validation, as one would with other training algorithms such as Levenberg-Marquardt (LM)). Please suggest an approach that would minimise rework during the revision phase of a journal submission.
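The nested protocol described above (hold out a test set first, then k-fold on the remainder) can be sketched as pure index bookkeeping with NumPy; the sizes (2000 samples, 20% test, K=10) come from the question, and the model-fitting step is omitted since it depends on the BR implementation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the split is reproducible
n = 2000                        # dataset size from the question
idx = rng.permutation(n)

# 1) Hold out 20% as an untouched out-of-sample test set.
n_test = n // 5
test_idx, dev_idx = idx[:n_test], idx[n_test:]

# 2) 10-fold CV on the remaining 80% (1600 samples). With Bayesian
#    regularisation there is no validation set, so each fold uses
#    K-1 parts for training and the held-out part for evaluation.
k = 10
folds = np.array_split(dev_idx, k)
splits = []
for i in range(k):
    eval_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    splits.append((train_idx, eval_idx))

# Sanity checks: folds partition the dev set and never touch test_idx.
assert all(len(tr) + len(ev) == len(dev_idx) for tr, ev in splits)
assert np.intersect1d(test_idx, dev_idx).size == 0
```

Each `(train_idx, eval_idx)` pair would train one model, and the final comparison on `test_idx` happens only once, which keeps the hold-out estimate unbiased.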