Almost every article that introduces a feature descriptor compares its performance against a set of other well-known descriptors. For example, see the following article for the kind of performance measure used.
Also, the performance of these feature descriptors depends on the kind of scene you are interested in. For example, some might do better than others on natural scenes but worse on low-light or grayscale images. You need to define clearly what you mean by "reliability" before you can compare descriptors on that measure.
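To make this concrete, here is one possible way to pin down "reliability": the fraction of known correspondences that nearest-neighbour descriptor matching recovers. This is only a sketch under assumptions. The function name `matching_precision`, the toy 2-D descriptors, and the choice of Euclidean distance are all made up for illustration and are not taken from any particular paper or library.

```python
from typing import Dict, List, Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def nearest_neighbour(query: Sequence[float],
                      candidates: List[Sequence[float]]) -> int:
    """Index of the candidate descriptor closest to the query."""
    return min(range(len(candidates)),
               key=lambda j: squared_distance(query, candidates[j]))


def matching_precision(desc_a: List[Sequence[float]],
                       desc_b: List[Sequence[float]],
                       ground_truth: Dict[int, int]) -> float:
    """Fraction of ground-truth pairs (i in A -> j in B) that
    nearest-neighbour matching gets right (hypothetical metric)."""
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one pair")
    correct = sum(
        1 for i, j in ground_truth.items()
        if nearest_neighbour(desc_a[i], desc_b) == j
    )
    return correct / len(ground_truth)


# Toy descriptors (invented values): the same three keypoints described
# in a reference image, a well-lit second view, and a low-light second view.
reference = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
well_lit = [[0.1, 0.0], [1.0, 0.9], [5.1, 4.9]]
low_light = [[0.6, 0.6], [0.4, 0.5], [4.0, 4.2]]
truth = {0: 0, 1: 1, 2: 2}  # keypoint i in the reference is keypoint i in the view

print("well lit :", matching_precision(reference, well_lit, truth))
print("low light:", matching_precision(reference, low_light, truth))
```

With the same descriptor and the same metric, the score changes with the scene conditions, which is why the comparison only means something once both the metric and the test data are fixed. Other definitions (repeatability, recall at a fixed false-positive rate, and so on) can rank the same descriptors differently.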