K&L actually defined what is effectively the "ultimate" sufficient statistic, which, in signal processing lingo, is called the log likelihood ratio (LLR). The LLR is the "ultimate" sufficient statistic because it is precisely the instantaneous information content available from the data bearing on the specified binary decision; that is, the data can tell you nothing about the binary decision of interest that the LLR cannot. The generalized form of SNR (the symmetric form of the KL divergence) is then the associated average information content, the structural equivalent of entropy.

It turns out that this way of measuring information content (using LLRs, which I call "discriminating information") measures the same basic "stuff" that Shannon's "entropic information" does, but on a different measurement scale [like Kelvin rather than Fahrenheit], developed for the context of a binary decision rather than for the context of a discrete communications stream. Discriminating information is much more general (because the binary-decision context itself is more general), and it also provides the LLR as an instantaneous measure (i.e., no ensemble averaging), a critical element missing from entropic information.

If you are interested in references exploring the structure of discriminating information in more detail, feel free to contact me at [email protected].
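To make the relationship concrete, here is a minimal sketch (Python with NumPy, my own choice; none of this code comes from the discussion above) for a binary decision between two equal-variance Gaussian hypotheses. The parameters mu0, mu1, and sigma are illustrative assumptions. The LLR plays the role of the instantaneous statistic, and the symmetric KL divergence reduces in this equal-variance case to the classical deflection SNR, (mu1 - mu0)^2 / sigma^2; averaging the LLR under each hypothesis and differencing recovers the same number.

    import numpy as np

    # Illustrative (assumed) parameters: two Gaussian hypotheses with a
    # common standard deviation.
    mu0, mu1, sigma = 0.0, 1.0, 0.5

    def llr(x):
        # Log likelihood ratio log[p1(x) / p0(x)] for N(mu1, sigma^2) vs N(mu0, sigma^2):
        # the instantaneous "discriminating information" carried by a single sample x.
        return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)

    def symmetric_kl():
        # Symmetric (Jeffreys) KL divergence between the two Gaussians.
        # For equal variances it reduces to (mu1 - mu0)^2 / sigma^2,
        # i.e. the classical deflection SNR.
        d01 = (mu1 - mu0) ** 2 / (2.0 * sigma ** 2)  # KL(p0 || p1)
        d10 = (mu0 - mu1) ** 2 / (2.0 * sigma ** 2)  # KL(p1 || p0)
        return d01 + d10

    # The ensemble-average check: E1[LLR] - E0[LLR] equals the symmetric KL
    # divergence, tying the instantaneous statistic to the average measure.
    rng = np.random.default_rng(0)
    x1 = rng.normal(mu1, sigma, 200_000)  # samples under hypothesis 1
    x0 = rng.normal(mu0, sigma, 200_000)  # samples under hypothesis 0
    empirical = llr(x1).mean() - llr(x0).mean()
    print(f"symmetric KL (analytic):         {symmetric_kl():.4f}")
    print(f"E1[LLR] - E0[LLR] (Monte Carlo): {empirical:.4f}")

With these assumed numbers both quantities come out near 4.0; the point of the sketch is simply that the LLR is available sample by sample, while the generalized SNR is its hypothesis-averaged counterpart.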