I am working on an image classification dataset of roughly 20k images.
I wanted training to be fast, so I used EfficientNet. However, performance keeps dropping as I move to larger versions of the architecture with more parameters: cross-validation accuracy was 0.836 for B0 but only 0.732 for B5.
Can anyone explain why increasing the number of parameters makes the results worse?
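
To put the two model sizes in perspective relative to the dataset, here is a rough sketch. The parameter counts are the approximate figures reported in the EfficientNet paper (about 5.3M for B0 and 30M for B5), not values measured from my runs, and `APPROX_PARAMS` and `params_per_image` are hypothetical names used only for illustration.

```python
# Approximate parameter counts as reported in the EfficientNet paper.
# These are assumptions for illustration, not measured from the actual runs.
APPROX_PARAMS = {"B0": 5_300_000, "B5": 30_000_000}


def params_per_image(version: str, num_images: int) -> float:
    """Return the number of model parameters per training image."""
    return APPROX_PARAMS[version] / num_images


if __name__ == "__main__":
    num_images = 20_000  # roughly the dataset size mentioned above
    for version in ("B0", "B5"):
        ratio = params_per_image(version, num_images)
        print(f"EfficientNet-{version}: ~{ratio:.0f} parameters per image")
```

This only shows how much larger B5 is relative to the amount of data; it does not by itself explain the accuracy gap.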