I've been using pre-trained deep learning models as feature extractors on a project involving chest X-ray images. Following the usual convention, I extract the features from the layer just before the final classification (softmax) layer. With this setup, I consistently find that AlexNet performs much better than ResNet-50. Is it because ResNet-50 has learned features in its deepest layer that are extremely specific to the ImageNet data, and so it struggles with the dataset under study?