It is easy to calculate the entropy of discrete numbers or categorical data: it equals minus the sum of (each probability × the log of that probability). But the probability of any single real number is 0, so how can this problem be worked around?
Basically, calculating entropy and information on real numbers involves discretizing the real values into a finite number of bins.
The Methods section in the following paper gives a detailed description of the process:
Richmond BJ, & Optican LM. (1987). Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. III. Information theoretic analysis. Journal of Neurophysiology, 57, 162-178.
Another example can be found in this paper:
Reinagel P, Godwin D, Sherman S M, & Koch C. (1999). Encoding of visual information by LGN bursts. Journal of Neurophysiology, 81, 2558-2569.
A good general resource for this problem is the following book:
Rieke F, Warland D, de Ruyter van Steveninck R, & Bialek W. (1997). Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press.
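To make the binning approach described above concrete, here is a minimal sketch (not taken from the papers cited; the data, bin count, and log base are illustrative choices) of a plug-in entropy estimate over histogram bins:

```python
# Minimal sketch: discrete (plug-in) entropy estimate for real-valued data,
# obtained by binning the values into a finite number of bins.
# The sample data, number of bins, and log base are illustrative choices.
import numpy as np

def binned_entropy(x, n_bins=20, base=2):
    """Entropy (in bits by default) of the empirical bin-occupancy distribution."""
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts / counts.sum()          # empirical probability of each bin
    p = p[p > 0]                       # drop empty bins (0 * log 0 := 0)
    return -np.sum(p * np.log(p)) / np.log(base)

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=10_000)   # example real-valued data
print(binned_entropy(x, n_bins=20))
```

Note that the resulting estimate depends on the number of bins, which is exactly the discretization choice the papers above discuss in detail.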
A system that allows a state to have zero probability is by definition wrong: the "zero-probability element" is not part of the system. So if this can happen, I would first question the system / probability distribution. If it really is the case (for reasons I cannot imagine), then the elements with zero probability should be excluded before calculating the entropy.
Just a rough hint: conceptually, you should treat the data as samples from a continuous parameterised distribution, say a Gaussian, use the samples to estimate the parameters of this distribution (mean and variance), and then compute characteristics of the distribution such as its entropy (which can be done analytically for a Gaussian distribution).
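A minimal sketch of this parametric idea, assuming (purely for illustration) Gaussian data and using the closed-form differential entropy of a Gaussian:

```python
# Minimal sketch of the parametric approach: treat the samples as draws from a
# Gaussian, estimate its parameters, then use the closed-form differential entropy
#   h = 0.5 * ln(2 * pi * e * sigma^2)   (in nats).
import numpy as np

def gaussian_differential_entropy(x):
    sigma2 = np.var(x, ddof=1)                      # unbiased variance estimate
    return 0.5 * np.log(2.0 * np.pi * np.e * sigma2)

rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, scale=2.0, size=5_000)      # example samples
print(gaussian_differential_entropy(x))             # ~ 0.5*ln(2*pi*e*4) ≈ 2.11 nats
```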
Entropy measures the amount of uncertainty, whereas a single real number is deterministic and carries no ambiguity.
The entropy of a data set containing real values (i.e. samples from a distribution) can be calculated by first computing the PDF (probability density function) of that data set.
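A minimal sketch of this PDF-based route, assuming a kernel density estimate stands in for the computed PDF (the data and estimator choice are illustrative):

```python
# Minimal sketch, assuming a kernel density estimate (KDE) as the computed PDF:
# differential entropy is approximated by the resubstitution estimate
#   h ≈ -(1/N) * sum_i log f_hat(x_i).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=5_000)   # example real-valued samples

f_hat = gaussian_kde(x)                      # kernel density estimate of the PDF
h_nats = -np.mean(np.log(f_hat(x)))          # plug the samples back into f_hat
print(h_nats)    # true differential entropy of Exp(1) is 1 nat; the estimate is approximate
```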
Entropy is a measure of the lack of structure in the pdf (or histogram) of a data set. It is not a measure for an individual datum. That would make no sense.
You can use the continuous definition of entropy: basically, you just replace the sum with an integral over your space of system states. To calculate such an entropy you need a probability space (ground set, sigma-algebra and probability measure). If you can identify these objects, you can try to calculate the integral directly (or at least approximately). One more word: of course the probability of a single real number is zero, but not necessarily that of a subset of real numbers (see standard books on probability theory).
A good introduction is the chapter on "Differential entropy" in the book "Elements of Information Theory" by Cover and Thomas, or the book "Abstract Methods in Information Theory" by Kakihara.
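As a minimal numerical illustration of replacing the sum by an integral (using a Gaussian density purely as an example, so the result can be checked against the closed form):

```python
# Minimal sketch of the continuous (differential) entropy as an integral,
#   h(X) = -integral f(x) * log f(x) dx,
# evaluated numerically for a known density (a Gaussian here, as an illustration).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

sigma = 2.0
f = norm(loc=0.0, scale=sigma).pdf

# Guard against underflow in the far tails, where f(x) evaluates to exactly 0.
integrand = lambda x: -f(x) * np.log(f(x)) if f(x) > 0 else 0.0
h_numeric, _ = quad(integrand, -np.inf, np.inf)

h_exact = 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)   # closed form for a Gaussian
print(h_numeric, h_exact)                                # both ≈ 2.112 nats
```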
A real number is an abstraction. In a real experiment you never obtain truly real-valued data, only rational values, because of experimental error. The same is true for a numerical experiment. So in fact, instead of a "real data set" you get a data set that is already distributed among a finite number of bins, with a bin size equal to twice the experimental error. Then you just need to calculate the entropy of discrete data (see the sketch after this answer). :)
A real number is unobtainable, and this point is fundamental. There is no device or sensory unit that could receive/process/perceive real-valued data with zero error, because this would require infinite discrimination ability. That is why the infinite entropy (which theoretically corresponds to a real-valued data set) is unachievable as well.
The question of interest is the discrimination ability (error of measurement) of the system that receives your data in the real world (e.g. how precisely a neuron can distinguish between input firing times). This real discrimination ability determines the amount of information that is actually contained in your data.
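A minimal sketch of this idea, assuming a known (hypothetical) measurement error that sets the bin width:

```python
# Minimal sketch, assuming a known measurement error: bin the data with a bin
# width equal to twice that error (as described above) and compute the discrete
# entropy of the resulting histogram. The error value and data are illustrative.
import numpy as np

def entropy_with_resolution(x, measurement_error, base=2):
    bin_width = 2.0 * measurement_error
    edges = np.arange(x.min(), x.max() + bin_width, bin_width)
    counts, _ = np.histogram(x, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p)) / np.log(base)

rng = np.random.default_rng(3)
x = rng.normal(size=10_000)
print(entropy_with_resolution(x, measurement_error=0.05))  # finer error -> higher entropy
```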
Entropy is relative to the person calculating it, based on how much information they already have about the data in question.
For example, if I already know that the real numbers are sampled from a fixed set (a finite set, to be exact), say {2.2334, 9.23423, 1.22333243}, then the precision or value of the real numbers does not matter - one simply maps these real numbers to the integers {1, 2, 3} and treats them as a finite integer data set (see the sketch after this answer). This works because the real numbers always come from the fixed set, which is the case in some domains.
On the other hand, if the real numbers come from a countably infinite set and we know something about their structure, for example {PI, 2*PI, 3*PI, 4*PI, ...}, where we know the numbers are multiples of PI (3.14159...), then once again they can be treated as an integer set and the entropy calculated as usual. (Entropy does not really care about the "absolute" representation, only the "relative" one.)
The tricky part is when the real numbers come from an uncountable set. Then calculating the entropy essentially boils down to trying to 'differentiate' one real number from another in that uncountable set, which is not directly possible, because for all practical purposes we cannot distinguish real numbers from one another with arbitrary precision the way we do integers. So we fit a function (curve fitting) to this kind of data, essentially mapping the real numbers to integers with a close-enough approximation. This is where the other suggestions of analysing probability density functions etc. come in handy: they act as the mapping functions that bring the real numbers down to integers. Alternatively, you can use regular curve-fitting methods to find a suitable function for these numbers and use that to calculate the entropy (treating the fitted curve as the PDF), so to speak.
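A minimal sketch of the relabelling idea above for a fixed, finite set of real values (the values and counts are made up for illustration):

```python
# Minimal sketch of the "map to integers" idea: when the real values come from a
# fixed, known set, only their relative frequencies matter, so relabel them as
# integers (or any symbols) and compute the ordinary discrete entropy.
from collections import Counter
import math

samples = [2.2334, 9.23423, 1.22333243, 2.2334, 2.2334, 9.23423]  # drawn from a fixed set
counts = Counter(samples)            # relabelling is implicit: each distinct value is a symbol
n = len(samples)
entropy_bits = -sum((c / n) * math.log2(c / n) for c in counts.values())
print(entropy_bits)                  # ≈ 1.459 bits for frequencies 3/6, 2/6, 1/6
```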
Peter Gmeiner has pointed you to the best resource on this topic, Cover and Thomas.
To add to that: if you want to compute differential entropies in practice, or simply want to see an implementation in code, I can point you to my open source Java Information Dynamics Toolkit (JIDT), usable also in Matlab, Python, etc.; see https://code.google.com/p/information-dynamics-toolkit/
Some suggestions can be found in Porta et al, Physiol Meas, 34, 17-33, 2013 and references therein, especially in relation to the challenge posed by short data sequences.
Our young colleague from Ukraine is right, but only from the raw experimental point of view. It must be stressed that the probability theory of continuous variables is well established. The probability of a single real number within a given interval is zero, since its "length" on the real-number axis is zero. This is an elementary and obvious fact, to which some prior answers point. That is why some differential "around" the chosen number must be considered. For the entropy of a distribution of continuous variables to be valid, two conditions must hold: (i) the distribution itself must be valid, i.e. normalized to unity (of course), with the summation replaced by the corresponding integration, and (ii) the integral by which the entropy is calculated must be a finite number. The latter condition may fail even if the former holds, just as for a discrete (countable) but infinite random variable. For details and examples of how to calculate the entropy of a continuous variable, a good textbook should be consulted (including the one cited above).
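In symbols (a minimal rendering of the two conditions stated above, for a density f(x)):

```latex
% Condition (i): the density must be properly normalized
\int_{-\infty}^{\infty} f(x)\,dx = 1
% Condition (ii): the (differential) entropy integral must be a finite number
h(X) = -\int_{-\infty}^{\infty} f(x)\,\ln f(x)\,dx, \qquad |h(X)| < \infty
```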