I need to calculate information-theoretic quantities such as entropy, mutual information, and KL divergence for continuous variables. Where can I find a software package for these calculations?
Thanks, David. I have actually seen some R packages, such as the entropy package, but I need some insight into their strengths and drawbacks. They also seem to use discretization before calculating entropy.
I am curious whether any method/package/software is available to calculate these quantities without discretization, or, if no such software exists, what the recommended discretization technique is. (I know the question title reads differently, but my aim was to find alternative software, so any suggestion is appreciated.)
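For context, the discretization-based approach that packages like R's entropy take can be sketched in a few lines of Python. This is only an illustration of the general idea, not the package's actual implementation; the bin count here is an arbitrary choice, and the result depends on it:

```python
import numpy as np

def entropy_discretized(samples, bins=30):
    """Plug-in Shannon entropy (in nats) of a continuous sample after
    binning it into a histogram. Note this is the entropy of the
    *discretized* variable; it shifts by log(bin width) relative to
    the differential entropy of the underlying continuous variable."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # drop empty bins: 0 * log(0) := 0
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
print(entropy_discretized(x))
```

The bin-count sensitivity is exactly the drawback raised above: different discretizations give different entropy values for the same sample.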
Hiqmet, thanks for the reference. I understand that there is a deep literature on calculating the entropy and mutual information of continuous variables. Do you have any insight into the performance of the different methods, or any other code?
Mehmet, if I understand your comments, you would like to do the computations without using a discrete approximation to the continuous variables? If so, have you tried a symbolic math package, or do I have this all wrong? Best, David. PS: There are packages for both R and Python, as well as commercial ones.
David, yes, that's correct. I suppose it would be preferable to avoid discretization. Can you name those packages and describe how they are used for entropy calculation?
Mehmet, see the links below; the first few entries are the most important, and there are also some articles there that compare packages. Remember, though, that a good approximation (you need ways to check this) can be almost as good as the exact answer, especially when the exact answer is hard or impossible to compute (e.g., not all anti-derivatives exist in closed form). This is the point of numerical analysis. I would also look at a good reference in this area; the next-to-last link points to a favorite of mine. I hope this helps. Good luck and Happy New Year. David. BTW, I would suppose their use in entropy calculations would be for things like evaluating integrals. The last link is to my favorite entropy book.
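As a concrete instance of "evaluating integrals": when the density f is known in closed form, the differential entropy h(X) = -∫ f(x) log f(x) dx can be computed by straightforward numerical quadrature, no discretization of the data required. A minimal numpy-only sketch for a standard normal, whose exact entropy is 0.5·log(2πe) ≈ 1.4189 nats (the grid limits and resolution are arbitrary choices):

```python
import numpy as np

# Differential entropy h(X) = -integral of f(x) log f(x) dx, evaluated
# on a fine grid for the standard normal density. Exact answer:
# 0.5 * log(2 * pi * e) ≈ 1.4189 nats.
x = np.linspace(-10.0, 10.0, 100_001)
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
dx = x[1] - x[0]
h = -np.sum(f * np.log(f)) * dx   # Riemann sum; tails beyond ±10 are negligible
print(h)  # close to 1.4189
```

The catch, of course, is that with real data you do not have f in closed form, which is where density estimation comes in.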
Thanks, David. I guess I need to use numerical integration to find the entropies, etc.
But the more important issue is knowing the underlying distribution of the data at hand, and generally one does not know it beforehand. So it seems we need some way to identify the underlying distribution and then calculate the information-theoretic quantities. There are kernel density estimation techniques for exactly this goal. I am curious:
i) whether this is the right way to calculate these quantities (no discretization, but estimating the distribution and then computing the numerical integral), and
ii) whether there are other ways to estimate the distribution of the sample data.
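For what it's worth, the pipeline described in (i), fit a kernel density estimate and then integrate numerically, can be sketched with numpy alone. This is just one option among several (resubstitution estimators and k-nearest-neighbour estimators such as Kozachenko–Leonenko avoid the explicit integral entirely), and the Gaussian kernel plus Silverman's rule-of-thumb bandwidth used below are standard but arbitrary choices, not recommendations from this thread:

```python
import numpy as np

def kde_entropy(samples, grid_points=2048, pad=4.0):
    """Differential entropy estimate (nats): Gaussian KDE with
    Silverman's rule-of-thumb bandwidth, then h = -sum f log f * dx
    on an evaluation grid extended `pad` bandwidths past the data."""
    s = np.asarray(samples, dtype=float)
    n = s.size
    bw = 1.06 * s.std() * n ** (-1 / 5)          # Silverman's rule
    x = np.linspace(s.min() - pad * bw, s.max() + pad * bw, grid_points)
    # KDE: f(x) = (1/n) * sum_i Normal(x; mean=s_i, sd=bw), on the grid
    f = np.exp(-0.5 * ((x[:, None] - s[None, :]) / bw) ** 2).sum(axis=1)
    f /= n * bw * np.sqrt(2 * np.pi)
    dx = x[1] - x[0]
    mask = f > 0                                 # guard log(0)
    return -np.sum(f[mask] * np.log(f[mask])) * dx

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
print(kde_entropy(x))  # near the true value 0.5*log(2*pi*e) ≈ 1.4189
```

Note that KDE smoothing biases the estimate slightly upward (the smoothed density is a little more spread out than the data), so bandwidth selection matters here just as bin selection does for histograms.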