In practice, the weights that give the highest accuracy on the held-out part of the split dataset are the ones kept, because results vary from run to run: the weights are initialized randomly, and the training data is shuffled randomly into small batches.
In the literature:
1) Some articles report only the maximum accuracy achieved on the test set.
2) Other papers report the average of the maximum test-set accuracy over a number of training runs.
We believe that neither approach is fair, because neither gives insight into the overall behavior on the dataset.
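
To make the two reporting conventions listed above concrete, here is a minimal Python sketch. The accuracy values are made up purely for illustration and the function names are hypothetical. The standard deviation is not part of either convention; it is included only as one possible way to show run-to-run spread.

```python
import statistics

# Hypothetical per-epoch test accuracies for 3 independent training runs.
# These numbers are illustrative only, not results from any experiment.
runs = [
    [0.70, 0.80, 0.78],
    [0.65, 0.75, 0.76],
    [0.72, 0.74, 0.73],
]


def best_per_run(runs):
    """Return the maximum test accuracy reached within each run."""
    return [max(run) for run in runs]


def summarize(runs):
    """Compute both reporting conventions plus a simple spread measure."""
    best = best_per_run(runs)
    return {
        # Convention 1: the single best accuracy seen in any run.
        "max_over_runs": max(best),
        # Convention 2: the average of each run's best accuracy.
        "mean_of_max": statistics.mean(best),
        # Assumption: sample standard deviation as a spread indicator.
        "std_of_max": statistics.stdev(best),
    }


summary = summarize(runs)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With only the first number reported, a reader cannot tell a consistently good model from one lucky run. The second and third numbers together at least show how much the best-case result moves between runs.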
Question: What is the best way to express accuracy?