I'm wondering how you decide which part of the model to unfreeze. Do you run multiple experiments? Since GPU time is expensive, you must have some guidelines. Note: I know the relationship between the size of the dataset, how close it is to the original dataset, and how much that affects whether or not we train more layers. However, is there a rule involving the depth of the model for picking the approximate layer? Example: try unfreezing the model starting from layer number 169, or layers between 70-100.
How much does one need to know about the specifics of the pretrained model? Can I use it without knowing the architecture? Thank you for your help!
I usually freeze the feature extractor and unfreeze the classifier or the last two or three layers. It depends on your dataset: if you have enough data and computing power, you can unfreeze more layers and retrain them.
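To make this concrete, here is a minimal Keras sketch of that pattern. The MobileNetV2 base, the number of unfrozen layers, and the ten-class head are assumptions for illustration, not a prescription:

```python
import tensorflow as tf

# Pretrained feature extractor without its original classification head.
base = tf.keras.applications.MobileNetV2(
    include_top=False, pooling="avg", input_shape=(224, 224, 3))

# Freeze everything except the last three layers of the base model.
base.trainable = True
for layer in base.layers[:-3]:
    layer.trainable = False

# Attach a new classifier head for the target task (10 classes assumed).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```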
Training is very slow without a GPU, so I wanted to limit the dataset size by choosing only the images of interest and freezing the feature extractor. Once the model with the new classifier has been trained, training can be stopped; any further fine-tuning can then be achieved with a lower learning rate.
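A hedged sketch of that two-phase recipe, reusing the `base` and `model` from the snippet above; `train_ds` and `val_ds` are hypothetical tf.data datasets, and the epoch counts, learning rates, and freeze boundary are illustrative:

```python
# Phase 1: feature extractor fully frozen; only the new head learns.
base.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

# Phase 2: unfreeze the top of the base and continue with a much lower
# learning rate so the pretrained weights are only gently adjusted.
base.trainable = True
for layer in base.layers[:100]:  # keep the early layers frozen
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)
```

Note that in Keras you must recompile after changing `trainable` flags for the change to take effect.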
It helps to know the architecture of the pre-trained model, so you know which feature maps to use and which to retrain. Generally, the benefit of using a pre-trained model is that you don't have to start from scratch. Many CNNs, for example, terminate in a dense layer that combines feature maps into a result, whether for classification or regression. In an encoder-decoder system, that dense layer is what shapes the encoding, so it is useful to retrain this layer for your dataset. Retraining some of the last few feature-generating layers (convolution layers) may help as well.
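That said, you don't need to memorize the architecture; you can inspect it programmatically. Here is a small sketch of how one might list a pretrained model's layers to find block boundaries and decide where to cut; the ResNet50 choice and the index 143 are hypothetical examples, not recommendations:

```python
import tensorflow as tf

base = tf.keras.applications.ResNet50(include_top=False)

# Print index, name, and type of every layer to spot block boundaries.
for i, layer in enumerate(base.layers):
    print(i, layer.name, layer.__class__.__name__)

# Unfreeze from a chosen block boundary onward (index picked by eye
# from the printout above).
fine_tune_at = 143
base.trainable = True
for layer in base.layers[:fine_tune_at]:
    layer.trainable = False
```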