When training neural networks (e.g. feed-forward networks, Restricted Boltzmann Machines), many regularizers and other tricks can be applied to improve the results. These tricks, as well as different training parameters, can completely change the look of the trained model's filters (i.e. the neurons' weights). In other words, depending on how the network was trained, the data is REPRESENTED differently by the neurons and is also DISTRIBUTED differently over the neurons. We know that sparsity, selectivity, and weight regularization may help, but we conclude this by looking at the results rather than by computing a "measure" from the parameters themselves.
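To make concrete what such a "measure computed from the parameters" could look like, here is a minimal sketch (my own illustration, not from any particular paper) that computes Hoyer's sparseness for each neuron's incoming weight vector; the weight matrix `W` and its shape are hypothetical stand-ins for a trained layer:

    # Sketch: Hoyer (2004) sparseness of each filter, 0 = dense, 1 = one-hot.
    # W is a hypothetical (n_hidden, n_inputs) weight matrix of a trained layer.
    import numpy as np

    def hoyer_sparseness(w, eps=1e-12):
        """Sparseness of a vector w: 0 for perfectly dense, 1 for a single nonzero."""
        n = w.size
        l1 = np.abs(w).sum()
        l2 = np.sqrt((w ** 2).sum()) + eps
        return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

    rng = np.random.default_rng(0)
    W = rng.normal(size=(100, 784))  # random weights standing in for a trained model
    per_neuron = np.array([hoyer_sparseness(w) for w in W])
    print("mean filter sparseness:", per_neuron.mean())

This only summarizes the filters themselves; it says nothing yet about how the data is spread over the neurons, which is what the question below is really after.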

Question: Does anyone know of publications on what makes a "good" representation (computed from the weights and biases over all neurons)? In particular:

- How should the data be distributed over the neurons: is it better if the data is perfectly compressed (most data in the fewest neurons) or perfectly distributed (all neurons used in a balanced way to represent the data)? (See the sketch below for one way to quantify this.)
- Are there other statistical analyses of the filters?
- How should representations look for discriminative tasks vs. generative networks?
- How does the quality of the data relate to "good" representations (e.g. should the model size follow the data's "intrinsic dimensionality")?
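As a sketch of the "compressed vs. distributed" question (again my own illustration, with a hypothetical activation matrix `H` of shape (n_samples, n_hidden)), one could measure the normalized entropy of each neuron's share of the total activation mass:

    # Sketch: how evenly is the representation spread over the hidden units?
    # 1.0 = activity spread evenly over all neurons, near 0 = a few neurons do everything.
    import numpy as np

    def activation_entropy(H, eps=1e-12):
        usage = H.mean(axis=0)              # average activity per neuron over the dataset
        p = usage / (usage.sum() + eps)     # each neuron's share of the total activity
        entropy = -(p * np.log(p + eps)).sum()
        return entropy / np.log(len(p))     # normalize to [0, 1]

    rng = np.random.default_rng(1)
    H_balanced = rng.uniform(size=(1000, 100))           # every neuron participates
    H_compressed = np.zeros((1000, 100))
    H_compressed[:, :5] = rng.uniform(size=(1000, 5))    # only 5 neurons participate
    print(activation_entropy(H_balanced))     # close to 1
    print(activation_entropy(H_compressed))   # much lower

Whether a high or low value is "better" is exactly what I am asking about; this is just one candidate statistic.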

Thank you for your answers!
