I am working on a binary classification problem with 15,000 RGB images using a CNN model built from scratch. When it comes to evaluating the model, I can do it in two ways:
1. Split the data into train and test sets and run 10-fold cross-validation on the training data only. Then evaluate the best model on the unseen test data. This way I got approximately 91.5% average accuracy on both the validation folds and the test set.
2. Just run 10-fold cross-validation on the whole dataset, which gave 92.5% average accuracy (a slightly better result than the previous one). A sketch of both setups follows below.
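To make the two setups concrete, here is a minimal sketch of the splitting logic using scikit-learn. A LogisticRegression on synthetic data stands in for my scratch-built CNN and the real image set, since only the evaluation protocol matters here, not the model itself.

```python
# Sketch of the two evaluation protocols; the classifier and data are stand-ins.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = make_classification(n_samples=1500, n_features=20, random_state=0)

# Option 1: hold out a test set, cross-validate on the training portion only,
# then report performance of the refit model on the untouched test set.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
val_acc = cross_val_score(LogisticRegression(max_iter=1000), X_tr, y_tr,
                          cv=cv, scoring="accuracy")
final = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # refit on all training data
test_acc = final.score(X_te, y_te)
print(f"Option 1: validation {val_acc.mean():.3f}, held-out test {test_acc:.3f}")

# Option 2: 10-fold cross-validation over the whole dataset; the mean fold
# accuracy is the only reported number, with no separate test set.
cv_acc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="accuracy")
print(f"Option 2: 10-fold CV {cv_acc.mean():.3f}")
```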
Which option would be best for reporting the performance of my model in a research article?
TIA