We know that we need to split our data into training and testing subsets so that we can estimate how our model performs on new data. We do this when training a model that has to "learn", such as a neural network. But do we need to split our data when performing simple statistical comparisons? For example, many clinical papers plot a ROC curve based on a single feature and report the classification accuracy derived from that ROC curve, which was built using the complete dataset. Is this correct?
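To make the question concrete, here is a minimal sketch of the comparison I have in mind. Everything in it is an assumption for illustration: the data are synthetic (one Gaussian feature whose mean differs by `shift` between the two classes), the cutoff is chosen to maximize accuracy, and the names `simulate`, `best_threshold`, `accuracy`, and `compare` are hypothetical helpers, not taken from any paper or library.

```python
import random


def accuracy(values, labels, threshold):
    """Fraction of cases classified correctly by the rule: value >= threshold -> positive."""
    correct = sum((v >= threshold) == bool(y) for v, y in zip(values, labels))
    return correct / len(labels)


def best_threshold(values, labels):
    """Scan every observed value as a candidate cutoff and keep the most accurate one."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(values)):
        acc = accuracy(values, labels, t)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


def simulate(n=40, shift=0.5, seed=0):
    """Synthetic single-feature data (assumed): class 1 is shifted by `shift` standard deviations."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]  # alternating labels keep the classes balanced
    values = [rng.gauss(shift * y, 1.0) for y in labels]
    return values, labels


def compare(n=40, shift=0.5, seed=0):
    """Return (accuracy on the full data, accuracy on the selection half, accuracy on the held-out half)."""
    values, labels = simulate(n, shift, seed)

    # What the papers in question appear to do: pick the cutoff and report accuracy on the same data.
    _, full_acc = best_threshold(values, labels)

    # Alternative: pick the cutoff on the first half, then score it on the untouched second half.
    half = n // 2
    t, select_acc = best_threshold(values[:half], labels[:half])
    heldout_acc = accuracy(values[half:], labels[half:], t)
    return full_acc, select_acc, heldout_acc


if __name__ == "__main__":
    trials = [compare(seed=s) for s in range(200)]
    for name, column in zip(("full data", "selection half", "held-out half"), zip(*trials)):
        print(f"{name}: mean accuracy = {sum(column) / len(column):.3f}")
```

If the accuracy measured on the same data used to pick the cutoff tends to come out higher than the held-out accuracy, that would suggest that even a single-feature ROC cutoff is a fitted parameter. This is only a toy simulation under the assumptions above, not evidence about any particular clinical study.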