In the feature selection process I need to rank the attributes using either entropy or mutual information. Could anyone point me to a tutorial covering both entropy and mutual information? Thanks in advance.
Entropy is a concept some find difficult to grasp, but it does not deserve such notoriety. Think of entropy as a road map that connects thermodynamic states. This tutorial aims to shed some light on the subject by approaching it from first principles.
The attached paper may be of interest: it contains problems and worked solutions on entropy, relative entropy, and mutual information. The detailed solutions are quite helpful.
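As a quick, hedged illustration of those three quantities (this is my own minimal sketch, not taken from the paper, and the toy joint distribution is made up for illustration), here is how they can be computed for discrete variables with NumPy:

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(X) = -sum_x p(x) log2 p(x), skipping zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def relative_entropy(p, q):
    """KL divergence D(p || q) = sum_x p(x) log2(p(x) / q(x))."""
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def mutual_information(pxy):
    """I(X;Y) = D(p(x,y) || p(x) p(y)) for a discrete joint distribution."""
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y)
    return relative_entropy(pxy.ravel(), (px * py).ravel())

# Toy joint distribution of two binary variables (illustrative values only).
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
print(entropy(pxy.sum(axis=1)))   # H(X) = 1.0 bit
print(mutual_information(pxy))    # I(X;Y) ~ 0.278 bits
```

Note that mutual information is computed here as the relative entropy between the joint distribution and the product of its marginals, which is its standard definition.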
A nice, recent thesis on information-theoretic feature selection schemes (the same principle/objective function is often referred to under many different names): http://www.cs.man.ac.uk/~pococka4/publications/pocockPhDThesis.pdf
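To connect this back to the original question, here is a minimal sketch of mutual-information-based attribute ranking, assuming scikit-learn is available (the synthetic dataset and all parameter values are illustrative choices of mine, not a recommendation):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Synthetic data: 5 attributes, of which 2 carry class information.
X, y = make_classification(n_samples=500, n_features=5,
                           n_informative=2, n_redundant=0,
                           random_state=0)

# Estimate I(attribute; class) for each attribute, then rank by it.
mi = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(mi)[::-1]
for idx in ranking:
    print(f"attribute {idx}: estimated MI = {mi[idx]:.3f}")
```

mutual_info_classif estimates the mutual information between each attribute and the class label; sorting by that score gives exactly the kind of ranking asked about above.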
For the estimation stage, the ITE (Information Theoretical Estimators) toolbox may be useful: https://bitbucket.org/szzoli/ite/
It is important to know how maximization of relative entropy links to Bayes' theorem. Entropy is a general concept: it can be used to update probability density functions, not only discrete probabilities. Look for the open-access paper by Adom Giffin titled "From Physics to Economics: An Econometric Example Using Maximum Relative Entropy", which works through some examples.
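For reference, here is a sketch of that link in my own notation (an outline of the standard result, not a quotation from Giffin's paper): maximizing the relative entropy of a joint distribution p(x, θ) with respect to a prior q(x, θ), subject to the constraint that the x-marginal equals the observed data x', reproduces Bayes' theorem:

```latex
% Maximize relative entropy subject to the data constraint:
\max_{p}\; S[p, q]
  = -\int dx\, d\theta\; p(x,\theta)\,\ln\frac{p(x,\theta)}{q(x,\theta)}
\quad \text{subject to} \quad p(x) = \delta(x - x').
% The maximum leaves the conditional unchanged, p(\theta \mid x) = q(\theta \mid x),
% so the updated marginal for \theta is
p(\theta) = \int dx\; \delta(x - x')\, q(\theta \mid x)
          = \frac{q(\theta)\, q(x' \mid \theta)}{q(x')},
% which is exactly the posterior of Bayes' theorem.
```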