It's not obvious, but I suggest visualizing the extracted features with a multi-output network: you choose the layer (or layers) whose features you want to see and expose them as additional outputs, and then you can compare the features extracted at each of them.
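As a rough illustration of the multi-output idea, here is a minimal sketch assuming Keras (via TensorFlow). The toy model, the layer names `hidden_1` and `hidden_2`, and all sizes are made up for the example; with a real network you would use your own layer names.

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy classifier; names and sizes are illustrative only.
inputs = keras.Input(shape=(20,))
h1 = keras.layers.Dense(16, activation="relu", name="hidden_1")(inputs)
h2 = keras.layers.Dense(8, activation="relu", name="hidden_2")(h1)
outputs = keras.layers.Dense(3, activation="softmax", name="predictions")(h2)
model = keras.Model(inputs, outputs)

# A second model that shares the same layers (and weights) but also
# returns the intermediate activations as extra outputs.
feature_model = keras.Model(
    inputs=model.input,
    outputs=[
        model.get_layer("hidden_1").output,
        model.get_layer("hidden_2").output,
        model.output,
    ],
)

# Random stand-in data; replace with real samples.
x = np.random.rand(5, 20).astype("float32")
feats_1, feats_2, preds = feature_model.predict(x, verbose=0)

print(feats_1.shape, feats_2.shape, preds.shape)
```

Because `feature_model` reuses the layers of `model`, training `model` also updates what `feature_model` returns, so you can inspect the features at any point during or after training.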
Several methods can be used to assess the quality of the features extracted by a deep network. You could start by visualising the extracted features with a dimensionality-reduction method (e.g. t-SNE). If your features are good, you should normally see the same structure as in the inputs: with images, for example, images of the same object should lie close to each other. Another way to assess the quality of your embedding is to check how well a simple classifier (SVM, logistic regression, ...) can classify the elements of your dataset from it.
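Both checks can be sketched in a few lines, assuming scikit-learn. Note that the `features` array here is only a placeholder: it is a PCA projection of the scikit-learn digits dataset, standing in for the embedding you would actually extract from your network. The subset size and parameters are arbitrary choices for the example.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score

# Small labelled dataset; y is used only for colouring and for the probe.
X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]

# Placeholder for network features (assumption): swap in your own embedding.
features = PCA(n_components=32, random_state=0).fit_transform(X)

# 1) Visual check: project the features to 2D with t-SNE.
coords = TSNE(n_components=2, random_state=0).fit_transform(features)
# To plot, e.g. with matplotlib:
#   plt.scatter(coords[:, 0], coords[:, 1], c=y, cmap="tab10")

# 2) Quantitative check: cross-validated accuracy of a simple linear classifier.
probe = LogisticRegression(max_iter=2000)
scores = cross_val_score(probe, features, y, cv=5)
print("mean probe accuracy:", scores.mean())
```

If samples of the same class form tight groups in the t-SNE plot and the simple classifier reaches a high accuracy, that is a reasonable sign the features are useful. Keep in mind that t-SNE distances between far-apart clusters are not very meaningful, so the classifier score is the more reliable of the two.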