I have built a hybrid model for a recognition task that involves both images and videos. However, I am running into an inconsistency in the metrics: precision, recall, and F1-score are all reported as 100%, while accuracy is reported as 99.35% to 99.9%. I have tested the model on various videos and images (related to the experiment data, including separate data), and it seems to perform well. Nevertheless, I am confused about whether this level of accuracy is acceptable. In my understanding, if precision, recall, and F1-score are all exactly 100%, there are no false positives or false negatives, so the accuracy should also be 100%.
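To make that expectation concrete, here is a minimal sketch that computes all four metrics from the same confusion-matrix counts. The function name `metrics_from_confusion` and all counts are hypothetical and chosen only for illustration; they are not my actual results. It shows that exactly perfect precision and recall force 100% accuracy, and that a precision of 0.9996 printed with two decimals looks like 1.00 while accuracy printed as a percentage shows 99.96%. I do not know whether rounding explains my numbers; it is only one thing that might be worth ruling out.

```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Binary-classification metrics from raw confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Hypothetical counts with no errors at all: every metric is exactly 1.0.
perfect = metrics_from_confusion(tp=500, fp=0, fn=0, tn=500)

# Hypothetical counts with a handful of errors out of 10,000 samples.
near_perfect = metrics_from_confusion(tp=4998, fp=2, fn=2, tn=4998)

for name, (p, r, f1, acc) in [("perfect", perfect), ("near-perfect", near_perfect)]:
    # Precision/recall/F1 rounded to two decimals, accuracy shown as a percentage.
    print(f"{name}: P={p:.2f} R={r:.2f} F1={f1:.2f} accuracy={acc * 100:.2f}%")
```

If the unrounded values still disagree, it may be worth checking whether all four metrics are computed on the same data split and with the same averaging.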
I am curious if anyone has encountered similar situations in their deep learning practices and if there are logical explanations or solutions. Your insights, explanations, or experiences on this matter would be valuable for me to better understand and address this issue.
Note: I conducted an ablation study over different combinations. For the model I am confused about, accuracy, precision, recall, and F1-score are all very low without these additional combinations. Also, the loss and validation accuracy are very high for the other combinations.
Thank you.