The Stanford CNN course states that a pretrained ConvNet can be used for transfer learning. (http://cs231n.github.io/transfer-learning/)
I have seen people use VGGNet for transfer learning. The basic idea is to use the feature maps from the last convolutional layer as features and to retrain all of the densely connected layers on the new training data.
My question is: since the densely connected layers in VGGNet contain the large majority of the parameters, is it practical to train the VGGNet model described above? With so many parameters to retrain, I suspect the model would overfit easily.
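To make the concern concrete, here is a rough parameter count. It is only a sketch and assumes the standard VGG-16 configuration (thirteen 3x3 convolutional layers, a 224x224 RGB input, and three dense layers of 4096, 4096 and 1000 units); other VGG variants or a different number of output classes would change the numbers.

```python
# Rough parameter count for VGG-16 (assumed configuration, see note above).

# (in_channels, out_channels) for each of the thirteen 3x3 conv layers.
conv_layers = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (in_features, out_features) for the three dense layers.
# After five 2x2 poolings a 224x224 input becomes 7x7 with 512 channels.
dense_layers = [
    (7 * 7 * 512, 4096),
    (4096, 4096),
    (4096, 1000),
]

# Each conv layer has 3*3*in*out weights plus one bias per output channel.
conv_params = sum(3 * 3 * c_in * c_out + c_out for c_in, c_out in conv_layers)

# Each dense layer has in*out weights plus one bias per output unit.
dense_params = sum(n_in * n_out + n_out for n_in, n_out in dense_layers)

total_params = conv_params + dense_params
dense_fraction = dense_params / total_params

print(f"conv params:    {conv_params:,}")
print(f"dense params:   {dense_params:,}")
print(f"total params:   {total_params:,}")
print(f"dense fraction: {dense_fraction:.1%}")
```

Under these assumptions the dense layers account for about 124 million of the roughly 138 million parameters, so retraining all of them means fitting close to 90% of the network on the new data.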
If anyone has experience with transfer learning using VGGNet, I would appreciate any advice. Thanks.