I have 62 samples, each with 7 input variables and 1 output. I split the data into 40 samples for training/validation and 22 samples for testing the neural network. The R² on the test set is very small (negative): the network fits the training data well, but its predictions on the test data are poor, so I think the network is overfitted. I computed cross-validation (CV) values for different numbers of hidden nodes. How can I decide which number of hidden nodes is best? Does a higher CV value mean a better model?
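To make the comparison concrete, here is a minimal sketch of what I mean. It assumes that "CV value" means the mean validation error (MSE) over the folds, in which case a lower value is better; if the CV value were a score such as R², higher would be better. The per-fold numbers and the names `cv_mse` and `best_hidden` are made up for illustration and are not my actual results. The `r_squared` helper shows why R² can be negative: that happens whenever the predictions are worse than always predicting the mean of the targets.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-fold validation MSEs for each hidden-node count.
# These numbers are invented for illustration only.
cv_mse = {
    2: [0.30, 0.28, 0.35, 0.31],
    4: [0.22, 0.25, 0.21, 0.24],
    8: [0.40, 0.55, 0.38, 0.47],
}

# Average the folds, then pick the hidden-node count with the LOWEST mean error.
mean_cv = {hidden: mean(folds) for hidden, folds in cv_mse.items()}
best_hidden = min(mean_cv, key=mean_cv.get)

# Predictions worse than the mean predictor give a negative R^2.
r2_bad = r_squared([1, 2, 3, 4], [4, 3, 2, 1])
# Predicting the mean of the targets everywhere gives R^2 = 0.
r2_mean = r_squared([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])

print("mean CV error per hidden-node count:", mean_cv)
print("selected hidden nodes:", best_hidden)
print("R^2 of bad predictions:", r2_bad)
```

With only 40 training samples the per-fold errors will be noisy, so the spread across folds matters as well as the mean when two hidden-node counts give similar values.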