I have been exploring machine learning, Markov chains, and statistical mechanics/information theory separately. All these topics seem to be deeply interconnected, but since they come from very different disciplines it is hard to get clear answers.

The concrete question is:

Say you have a phenomenon F with entropy S. Now let M be a "fairly good and sufficiently compact" model of F. How much information does the model contain?

For example, a model could be F ≈ M(x) = x^2. That means x^2 contains information about F, but how would you measure it?
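For concreteness, here is the kind of computation I imagine, though I don't know whether it's the right formalization. It treats "information the model contains" as the reduction in entropy once you know the model's prediction, i.e. something like I(F; M) = H(F) - H(F | M), estimated crudely with histograms. The noise level, bin count, and all the names here are arbitrary choices I made up for the example:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: F is a noisy version of x^2, and M is the model x^2.
x = rng.uniform(-1, 1, 100_000)
f = x**2 + rng.normal(0, 0.1, x.size)   # samples of the phenomenon F
m = x**2                                 # the model's prediction M(x)

def entropy_bits(samples, bins=50):
    """Plug-in (histogram) estimate of differential entropy in bits."""
    p, edges = np.histogram(samples, bins=bins, density=True)
    widths = np.diff(edges)
    mask = p > 0
    return -np.sum(p[mask] * widths[mask] * np.log2(p[mask]))

H_f = entropy_bits(f)          # entropy S of the phenomenon
H_resid = entropy_bits(f - m)  # entropy left over once the model is known
                               # (equals H(F | M) here because the noise is
                               # additive and independent of the prediction)
print(f"H(F) = {H_f:.3f} bits, H(F | M) = {H_resid:.3f} bits")
print(f"information captured by the model = {H_f - H_resid:.3f} bits")

Is mutual information like this the standard way to quantify "the information in a model", or is there a better-suited quantity (description length, KL divergence, something else)?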

I'm trying to understand this. Thanks in advance!
