I am working on a model which is NOT a neural network.
I have a function train_step that takes the first 10% of the dataset array as the batch to train on. But different ways of selecting that batch lead to different results.
If I randomly shuffle the whole training set before each step, so that new items end up at the front of the array, it won't converge. However, if I instead rotate the last 10% of the data to the front each step, it converges perfectly.
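To make the two batching schemes concrete, here is a minimal sketch of what I mean (function names and the NumPy array are placeholders; train_step itself is omitted):

```python
import numpy as np

def next_batch_shuffle(data, rng, frac=0.1):
    """Strategy 1: reshuffle the whole array, then take the first 10%.
    This is the version that does NOT converge for me."""
    data = rng.permutation(data)
    n = int(len(data) * frac)
    return data[:n], data

def next_batch_rotate(data, frac=0.1):
    """Strategy 2: move the last 10% to the front, then take the first 10%.
    This is the version that converges."""
    n = int(len(data) * frac)
    data = np.roll(data, n, axis=0)
    return data[:n], data

data = np.arange(100)
rng = np.random.default_rng(0)

batch_a, data_a = next_batch_shuffle(data, rng)   # 10 random items
batch_b, data_b = next_batch_rotate(data)         # items 90..99
```

Both helpers return the batch plus the reordered array, so the same call can be repeated every step.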