K&L actually defined what is effectively the "ultimate" sufficient statistic, which, in signal processing lingo, is called the log likelihood ratio (LLR). The LLR is the "ultimate" sufficient statistic because it is precisely the instantaneous information content available from the data bearing on the specified binary decision; that is, the data can tell you nothing about the binary decision of interest that the LLR cannot. The generalized form of SNR (the symmetric form of the KL divergence) is then the associated average information content, the structural equivalent of entropy.

It turns out that this way of measuring information content (using LLRs, which I call "discriminating information") measures the same basic "stuff" that Shannon's "entropic information" does, but on a different measurement scale [like Kelvin rather than Fahrenheit], developed for the context of a binary decision rather than for the context of a discrete communications stream. Discriminating information is much more general (because the binary-decision context itself is more general), and it also provides the LLR as an instantaneous measure (i.e., no ensemble averaging), a critical element missing from entropic information.

If you are interested in references exploring the structure of discriminating information in more detail, feel free to contact me at [email protected].
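To make the relationship concrete, here is a minimal sketch (Python with NumPy, my own choice; none of this code comes from the discussion above) for a binary decision between two equal-variance Gaussian hypotheses. The parameters mu0, mu1, and sigma are illustrative assumptions. The LLR plays the role of the instantaneous statistic, and the symmetric KL divergence reduces in this equal-variance case to the classical deflection SNR, (mu1 - mu0)^2 / sigma^2; averaging the LLR under each hypothesis and differencing recovers the same number.

    import numpy as np

    # Illustrative (assumed) parameters: two Gaussian hypotheses with a
    # common standard deviation.
    mu0, mu1, sigma = 0.0, 1.0, 0.5

    def llr(x):
        # Log likelihood ratio log[p1(x) / p0(x)] for N(mu1, sigma^2) vs N(mu0, sigma^2):
        # the instantaneous "discriminating information" carried by a single sample x.
        return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)

    def symmetric_kl():
        # Symmetric (Jeffreys) KL divergence between the two Gaussians.
        # For equal variances it reduces to (mu1 - mu0)^2 / sigma^2,
        # i.e. the classical deflection SNR.
        d01 = (mu1 - mu0) ** 2 / (2.0 * sigma ** 2)  # KL(p0 || p1)
        d10 = (mu0 - mu1) ** 2 / (2.0 * sigma ** 2)  # KL(p1 || p0)
        return d01 + d10

    # The ensemble-average check: E1[LLR] - E0[LLR] equals the symmetric KL
    # divergence, tying the instantaneous statistic to the average measure.
    rng = np.random.default_rng(0)
    x1 = rng.normal(mu1, sigma, 200_000)  # samples under hypothesis 1
    x0 = rng.normal(mu0, sigma, 200_000)  # samples under hypothesis 0
    empirical = llr(x1).mean() - llr(x0).mean()
    print(f"symmetric KL (analytic):         {symmetric_kl():.4f}")
    print(f"E1[LLR] - E0[LLR] (Monte Carlo): {empirical:.4f}")

With these assumed numbers both quantities come out near 4.0; the point of the sketch is simply that the LLR is available sample by sample, while the generalized SNR is its hypothesis-averaged counterpart.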