I am currently preparing a paper for a conference, describing a deep learning model for Human Activity Recognition.
The model is evaluated on several well-known benchmark datasets.
Is the reported confusion matrix supposed to be formed from a single random training/test split? This information is lacking in most of the papers I have read.
What I am currently doing is stratified 10-fold cross-validation: for each training/test split I reinitialize the model, train it on the training set, compute a confusion matrix on the test set, and then add all 10 confusion matrices together. The idea is that this yields a general accuracy over all test sets.
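For concreteness, here is roughly what my loop looks like (a minimal sketch: `build_model()` is a hypothetical stand-in for my network, and `X`/`y` are assumed to be loaded already):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import confusion_matrix

# X: (n_samples, ...) sensor windows, y: (n_samples,) integer activity labels,
# both assumed to be loaded already; build_model() is a hypothetical factory
# that returns a freshly initialized, compiled Keras model.
n_classes = len(np.unique(y))
total_cm = np.zeros((n_classes, n_classes), dtype=int)
histories = []  # per-fold training curves, used for the plot discussed below

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for train_idx, test_idx in skf.split(X, y):
    model = build_model()  # reinitialize the weights for every fold
    histories.append(model.fit(X[train_idx], y[train_idx],
                               epochs=50, batch_size=64, verbose=0))
    y_pred = np.argmax(model.predict(X[test_idx]), axis=1)
    total_cm += confusion_matrix(y[test_idx], y_pred,
                                 labels=np.arange(n_classes))

# Pooled accuracy over all 10 test folds: correct predictions / all predictions.
pooled_accuracy = np.trace(total_cm) / total_cm.sum()
```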
I am aware that cross-validation is usually used for finding optimal hyperparameters, but let's say I have already found them in this scenario and I am not changing them.
I would also like to include a plot of the training accuracy/loss per batch/epoch to show the training process, but with 10 folds I would have 10 plots, which would take too much space in the paper.
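One idea I had is to condense them into a single figure by averaging the curves over the folds, roughly like this (assuming the Keras `History` objects collected in the loop above; the `"accuracy"` key depends on the compiled metric name):

```python
import numpy as np
import matplotlib.pyplot as plt

# Stack the per-fold training-accuracy curves into a (10, n_epochs) array;
# this assumes every fold was trained for the same number of epochs.
acc = np.array([h.history["accuracy"] for h in histories])
mean, std = acc.mean(axis=0), acc.std(axis=0)
epochs = np.arange(1, acc.shape[1] + 1)

plt.plot(epochs, mean, label="mean training accuracy")
plt.fill_between(epochs, mean - std, mean + std, alpha=0.3,
                 label="±1 std over 10 folds")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("training_accuracy_cv.pdf")
```

Would a single mean ± std figure like this be an acceptable way to summarize the training process, or is it expected to show individual folds?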