The gradient reversal layer (GRL), proposed by Ganin et al. in the paper "Unsupervised Domain Adaptation by Backpropagation", performs well at matching the marginal feature distributions of a labelled source domain and an unlabelled target domain. Classifiers built on those features can then perform well on the target data.
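For readers unfamiliar with the layer, here is a minimal framework-free sketch of what a GRL does: it passes features through unchanged in the forward pass and multiplies the incoming gradient by a negative factor in the backward pass. The class name `GradientReversal`, the parameter `lam`, and the sample numbers are my own illustrative choices, not code from the paper.

```python
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        # lam is an assumed name for the reversal strength (lambda in the paper).
        self.lam = lam

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_output):
        # The gradient flowing back to the feature extractor is sign-flipped and scaled.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 3.0]
# Hypothetical gradient of the domain loss with respect to the features.
domain_grad = [1.0, -2.0, 0.5]

out = grl.forward(features)
grad_to_extractor = grl.backward(domain_grad)
print(out, grad_to_extractor)
```

In the paper's architecture the GRL sits only between the feature extractor and the domain classifier; the label predictor is connected to the feature extractor directly, so its gradient is not reversed.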
Much emphasis has been placed on improving accuracy on the target domain through the GRL and domain classifier. How does the GRL affect accuracy on the source domain? I've observed competitive source-domain performance with or without the GRL. I'd appreciate it if someone could explain the theoretical basis for this behaviour.