While reviewing technical articles, I have come across some papers in which the authors balanced the full dataset before splitting it into training and test sets, and others in which the authors balanced only the training data and left the test data untouched. I feel that the test data must keep its original class distribution. What is your opinion?
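
To make the second approach concrete, here is a minimal sketch of "split first, then balance only the training part". It assumes random oversampling as the balancing method and uses made-up toy labels; the helper names `stratified_split` and `oversample_to_balance` are hypothetical, not taken from any of the papers or from a library. One thing it illustrates: if oversampling were done before the split instead, duplicated copies of the same sample could end up in both the training and the test set.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx), keeping class ratios similar in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def oversample_to_balance(indices, labels, seed=0):
    """Duplicate minority-class indices at random until every class
    has as many entries as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(idx)
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    return balanced


# Toy imbalanced labels (assumed data): 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10

# 1. Split the original data first.
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# 2. Balance only the training indices; the test indices are never touched.
balanced_train_idx = oversample_to_balance(train_idx, labels)

train_counts = Counter(labels[i] for i in balanced_train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

With this ordering the training set ends up with equal class counts, while the test set keeps the original 9:1 ratio and shares no samples with the training set.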