I noticed that when calculating precision and recall on validation results, if I pool the pixels of all images together (ignoring the batch dimension), the scores can be very high. But if I calculate precision and recall for each sample in the batch and then average them, the result is only 88%. I know this is related to how positive and negative pixels are distributed across images. In the field of ice lake detection, which method is more appropriate?
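To make the two calculations concrete, here is a minimal sketch of what I mean, assuming binary segmentation masks stored as NumPy arrays of shape `(batch, height, width)`. The function names and the toy masks are made up for illustration and are not from my actual pipeline. The first image has a large lake that is predicted perfectly; the second has a tiny lake that is only half detected.

```python
import numpy as np

def confusion_counts(pred, target):
    """Return (TP, FP, FN) pixel counts for boolean masks of any shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.sum(pred & target)
    fp = np.sum(pred & ~target)
    fn = np.sum(~pred & target)
    return tp, fp, fn

def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den > 0 else float("nan")

def pooled_precision_recall(preds, targets):
    """Pool every pixel in the batch, then compute one precision/recall."""
    tp, fp, fn = confusion_counts(preds, targets)
    return safe_div(tp, tp + fp), safe_div(tp, tp + fn)

def per_sample_precision_recall(preds, targets):
    """Compute precision/recall per image, then average over the batch."""
    precisions, recalls = [], []
    for pred, target in zip(preds, targets):
        tp, fp, fn = confusion_counts(pred, target)
        precisions.append(safe_div(tp, tp + fp))
        recalls.append(safe_div(tp, tp + fn))
    # nanmean skips images where a metric is undefined (e.g. no lake pixels)
    return np.nanmean(precisions), np.nanmean(recalls)

# Toy batch of two 10x10 images (hypothetical data)
targets = np.zeros((2, 10, 10), dtype=bool)
preds = np.zeros((2, 10, 10), dtype=bool)

targets[0, :5, :] = True      # image 0: large lake, 50 pixels
preds[0] = targets[0]         # predicted perfectly

targets[1, 0, :4] = True      # image 1: small lake, 4 pixels
preds[1, 0, 2:6] = True       # 2 pixels hit, 2 missed, 2 false alarms

pooled_p, pooled_r = pooled_precision_recall(preds, targets)
sample_p, sample_r = per_sample_precision_recall(preds, targets)

print(f"pooled:     precision={pooled_p:.3f} recall={pooled_r:.3f}")
print(f"per-sample: precision={sample_p:.3f} recall={sample_r:.3f}")
```

On this toy batch the pooled scores are about 0.963 while the per-sample averages are 0.75, because pooling lets the large lake dominate the counts, whereas per-sample averaging gives the small, poorly detected lake the same weight as the large one.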