I am working on histograms because a histogram is a very parsimonious way of storing a distribution of observed values. To get around the problem of choosing the bin width, I devised a method where, given the desired number of bins, the domain is split into bins of different widths. OK, nothing new, just a piecewise interpolation of the distribution function. But I intend to compare the procedure against other methods. Kernel density estimation (KDE) looks like the most competitive one (many references state its superiority in being consistent and having a fast convergence rate to the "true" density). So I ran a test to assess the superiority of KDE over "my histogram". I generated 1 million points from a mixture of two normals. I fitted the KDE with the RBF kernel, storing the estimated distribution function at 500 points. I estimated "my histogram" with just 16 bins. After that, I simulated 10k random queries asking for the probability of an interval of values. To my great surprise, "my histogram" is more accurate (in terms of MSE) than the KDE at predicting these probabilities.
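To make the test concrete, here is a minimal sketch of the histogram side in Python (standard library only) rather than MATLAB. It rests on several assumptions that are not stated above: the variable-width bins are taken to be equal-frequency bins (edges at empirical quantiles), the mixture parameters and query range are made up for illustration, the sample is smaller than 1 million to keep it fast, and all function names are hypothetical. The KDE side is omitted; it would be scored the same way by replacing `hist_cdf` with the KDE's stored distribution function.

```python
import bisect
import math
import random


def sample_mixture(n, rng, w=0.5, mu1=-2.0, s1=1.0, mu2=3.0, s2=0.5):
    """Draw n points from w*N(mu1, s1^2) + (1-w)*N(mu2, s2^2).

    The parameter values are illustrative assumptions.
    """
    return [rng.gauss(mu1, s1) if rng.random() < w else rng.gauss(mu2, s2)
            for _ in range(n)]


def true_cdf(x, w=0.5, mu1=-2.0, s1=1.0, mu2=3.0, s2=0.5):
    """Exact CDF of the same mixture, used as the reference."""
    def phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return w * phi((x - mu1) / s1) + (1.0 - w) * phi((x - mu2) / s2)


def quantile_edges(samples, n_bins):
    """Bin edges at empirical quantiles, so each bin holds about 1/n_bins of the mass.

    Assumption: this equal-frequency rule stands in for the unspecified
    variable-width binning rule.
    """
    xs = sorted(samples)
    n = len(xs)
    return [xs[(k * (n - 1)) // n_bins] for k in range(n_bins + 1)]


def hist_cdf(x, edges):
    """Piecewise-linear CDF implied by the equal-frequency histogram."""
    n_bins = len(edges) - 1
    if x <= edges[0]:
        return 0.0
    if x >= edges[-1]:
        return 1.0
    k = bisect.bisect_right(edges, x) - 1  # edges[k] <= x < edges[k + 1]
    frac = (x - edges[k]) / (edges[k + 1] - edges[k])
    return (k + frac) / n_bins


def interval_mse(edges, rng, n_queries=2000, lo=-6.0, hi=6.0):
    """MSE of estimated vs. true P(a <= X <= b) over random intervals [a, b]."""
    total = 0.0
    for _ in range(n_queries):
        a, b = sorted((rng.uniform(lo, hi), rng.uniform(lo, hi)))
        est = hist_cdf(b, edges) - hist_cdf(a, edges)
        ref = true_cdf(b) - true_cdf(a)
        total += (est - ref) ** 2
    return total / n_queries


def run_experiment(n_samples=20000, n_bins=16, seed=0):
    """Fit the 16-bin histogram and score it on random interval queries."""
    rng = random.Random(seed)
    edges = quantile_edges(sample_mixture(n_samples, rng), n_bins)
    return edges, interval_mse(edges, rng)
```

Calling `edges, mse = run_experiment()` returns the 17 bin edges and the interval-probability MSE. Note that the score is computed entirely from differences of the estimated distribution function, so it measures how well the CDF is approximated, not how well the density is.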
So, my question is: "Is it more correct (and/or useful) to predict the density or the probability?"
If you are curious, I have implemented the procedures in MATLAB.