Many machine learning studies do not use separate test/validation sets and report only cross-validation results, which, as far as I know, are averages over multiple folds. As a result, true-positive, false-negative, and related counts cannot be recovered exactly from the reported sensitivities and specificities. (For example, in a study of 100 positive cases with 2-fold cross-validation, if one fold yields 49% sensitivity and the other 50%, the averaged sensitivity is 49.5%, implying TP = 49.5, which is not a valid count.) Should such studies be excluded from meta-analysis? If not, how should the data be extracted or calculated? A small sketch of the arithmetic follows below.
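To make the problem concrete, here is a minimal Python sketch (not taken from any particular study) of back-calculating 2x2 confusion-matrix counts from pooled cross-validation metrics; the specificity value and negative-case count are hypothetical, included only to complete the example:

```python
# A minimal sketch of the arithmetic problem: back-calculating 2x2
# confusion-matrix counts from pooled (fold-averaged) metrics. The
# specificity and negative-case count below are hypothetical.

def back_calculate_counts(sensitivity, specificity, n_positive, n_negative):
    """Reconstruct approximate TP/FN/TN/FP from pooled sensitivity/specificity.

    Pooled metrics rarely correspond to whole-number counts, so the raw
    products are rounded -- a common but lossy workaround when extracting
    data for meta-analysis.
    """
    tp_raw = sensitivity * n_positive   # may be fractional, e.g. 49.5
    tn_raw = specificity * n_negative
    tp, tn = round(tp_raw), round(tn_raw)
    return {"TP": tp, "FN": n_positive - tp,
            "TN": tn, "FP": n_negative - tn,
            "TP_raw": tp_raw, "TN_raw": tn_raw}

# The example from the question: 100 positive cases, 2 folds with
# sensitivities of 49% and 50%.
pooled_sens = (0.49 + 0.50) / 2          # 0.495
counts = back_calculate_counts(pooled_sens, specificity=0.80,
                               n_positive=100, n_negative=100)
print(counts)  # TP_raw is ~49.5: not a valid count, hence the rounding
```

Rounding like this is only a pragmatic workaround: it loses information and assumes the reported figure really is a simple average of per-fold metrics, which the question notes is often unstated.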
