When I am training deep learning models with different architectures, I sometimes have to change the batch size to avoid a Resource Exhausted Error (running out of GPU memory). Is it a problem to compare the performance of models trained with different batch sizes, or not?

I see a lot of research papers where the batch size is kept fixed.
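For context, this is roughly what I do when I hit the error. The model and data here are just placeholders (my real architectures are larger), so treat it as an illustrative sketch rather than my exact setup:

```python
import tensorflow as tf

def build_model():
    # Stand-in for one of the architectures I compare (hypothetical).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

def train_with_fallback(x, y, batch_size=512, min_batch_size=32):
    # Halve the batch size until training fits in GPU memory.
    while batch_size >= min_batch_size:
        try:
            model = build_model()
            model.compile(optimizer="adam",
                          loss="sparse_categorical_crossentropy",
                          metrics=["accuracy"])
            history = model.fit(x, y, batch_size=batch_size,
                                epochs=5, verbose=0)
            return model, history, batch_size
        except tf.errors.ResourceExhaustedError:
            batch_size //= 2  # retry with a smaller batch
    raise RuntimeError("Model does not fit even at the minimum batch size")
```

So in the end, different architectures may have been trained with different batch sizes (512 for one, 128 for another, etc.), and I want to know whether that makes their accuracy/loss numbers unfair to compare.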