I would like to know how to calculate the entropy of a binary word (I can have words of different sizes: 8, 16, 32, 400 bits). I know about the Shannon entropy, but it is related to a set, not to an individual.
You can calculate the letter-level mean Shannon entropy either independently of the sequence or depending on it. The sequence-independent mean entropy is Sh = SUM[-pi·log2(pi)], where the probability pi of each i-th letter is estimated from the frequency of that letter in the text (genome, message, book, etc.). For the sequence-dependent entropy, or graph entropy (a sequence is a linear graph), you can use a Markov chain approach to calculate the probabilities. Together with Prof. Cristian R Munteanu we have published on this and released the software S2SNet to do both kinds of calculations. Please send me an email if you are further interested in it. See some refs (and a small sketch of the frequency-based calculation after them):
1: Munteanu CR, Magalhães AL, Uriarte E, González-Díaz H. Multi-target QPDR classification model for human breast and colon cancer-related proteins using star graph topological indices. J Theor Biol. 2009 Mar 21;257(2):303-11.
2: Munteanu CR, González-Díaz H, Borges F, de Magalhães AL. Natural/random protein classification models based on star network topological indices. J Theor Biol. 2008 Oct 21;254(4):775-83.
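A minimal sketch of the sequence-independent, frequency-based entropy described above (plain Python, not the S2SNet implementation; function and variable names are illustrative):

from collections import Counter
from math import log2

def shannon_entropy(sequence):
    # Mean Shannon entropy per letter, Sh = SUM[-pi*log2(pi)],
    # with pi estimated from the letter frequencies in the sequence.
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(shannon_entropy("11110000"))  # 1.0 bit per letter: equal numbers of 1s and 0s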
If you have a text with words of variable length, then you can treat each word length as a symbol. If we have N words in the text and I distinct word lengths, then we have I symbols, and we can estimate the probability PI of occurrence of every symbol as its frequency divided by N. The entropy contribution of each word length is then -PI·log2(PI).
By summing over all word lengths occurring in the N-word message we get the overall entropy.
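A minimal sketch of this word-length-as-symbol calculation (assuming whitespace-separated words; names are illustrative):

from collections import Counter
from math import log2

def word_length_entropy(text):
    # Treat each word length as a symbol, estimate PI from frequencies,
    # and sum -PI*log2(PI) over all observed lengths.
    lengths = [len(w) for w in text.split()]
    n = len(lengths)
    counts = Counter(lengths)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(word_length_entropy("11110000 1010 11000011 1010"))  # lengths 8,4,8,4 -> 1.0 bit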
Why do you say nothing about probabilities of the code words or conditional (transition) probabilities from word to word, if they are not independent?
If you don’t have these data, you cannot compute any entropy.
The sentence “Shannon's Entropy is related to a set, not to an individual” is unclear. The entropy of a single individual (particular) message does not exist (it is zero). If the source generates only one message (signal, sign, letter or fingerprint...), its uncertainty and entropy are always zero.
If you have the probabilities of the code words pi (i = 1, …, M), the proposal of colleague González (I did not check whom I am speaking with) is correct, and SUM[-pi·log2(pi)] defines the entropy per word (i.e. the entropy of the source of code words). The entropy per symbol is SUM[-pi·log2(pi)] / SUM[Li·pi], where Li is the length of the i-th code word (in your case M = 4). That is all you can compute.
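A minimal sketch of these two quantities, assuming you know the code-word lengths and their probabilities (the probability values below are purely illustrative):

from math import log2

lengths = [8, 16, 32, 400]        # the four word lengths from the question
probs = [0.4, 0.3, 0.2, 0.1]      # assumed probabilities; replace with real data

entropy_per_word = -sum(p * log2(p) for p in probs)
mean_length = sum(L * p for L, p in zip(lengths, probs))
entropy_per_symbol = entropy_per_word / mean_length

print(entropy_per_word, entropy_per_symbol)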
I will read the works suggested by Humbert G. Díaz
Dear Humbert G. Díaz and Abdelhalim abdelnaby Zekry:
The Shannon entropy is defined for a set (a group of elements), not for a single element. Using it as is cannot take into account the internal diversity of the binary word. For example, the binary words 11110000, 10101010 and 11000011 will all have the same Shannon entropy, whereas their internal order is very different.
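A quick check of this claim with bit frequencies alone (all three words contain four 1s and four 0s, so each gives exactly 1 bit per symbol):

from collections import Counter
from math import log2

for word in ("11110000", "10101010", "11000011"):
    counts = Counter(word)
    n = len(word)
    h = -sum((c / n) * log2(c / n) for c in counts.values())
    print(word, h)  # 1.0 in every case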
Dear Anatoliy Platonov:
I have binary words that represent the fingerprints of molecules, so each word is not probabilistic; it is just a codification of the molecular structure into a set of 1's and 0's. What I need is to compute the entropy of each molecule, in order to know which one is more disorderly. Of course, I could use some modeling tool that computes the "real" entropy of the molecule (MOPAC, GAMESS, Gaussian, etc.), but this would be very expensive for a large number of systems (and for large systems).
I think that this work could help (suggested by David Quesada in another forum):
https://arxiv.org/pdf/1305.0954.pdf.
In this work, the author defines a BiEntropy and a logarithmically weighted BiEntropy that take into account the internal order/disorder of n-bit strings.
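A sketch of how I read the BiEntropy definitions in that paper (a weighted average of the binary entropies of the string and its successive binary derivatives); please check the exact weights against the paper before relying on it:

from math import log2

def binary_derivative(bits):
    # XOR of each adjacent pair of bits; the result is one bit shorter.
    return [a ^ b for a, b in zip(bits, bits[1:])]

def binary_h(p):
    # Binary Shannon entropy of a proportion p of ones.
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bientropy(word, logarithmic=False):
    # Weighted average of binary_h over the string and its derivatives:
    # weights 2**k for BiEntropy, log2(k + 2) for the logarithmic variant.
    bits = [int(b) for b in word]
    total = weight_sum = 0.0
    for k in range(len(word) - 1):
        p = sum(bits) / len(bits)
        w = log2(k + 2) if logarithmic else 2.0 ** k
        total += binary_h(p) * w
        weight_sum += w
        bits = binary_derivative(bits)
    return total / weight_sum

for w in ("11110000", "10101010", "11000011"):
    print(w, round(bientropy(w), 4), round(bientropy(w, logarithmic=True), 4))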
The sequence-dependent Shannon entropy of order k associated with a Markov chain (Shk) that we used is not defined in this way.
In the case of the sequence-based Shk entropies calculated by the S2SNet algorithm, the three sequences 11110000, 10101010 and 11000011 clearly have different entropy values.
In fact, the examples you mention are very similar to SNPs (single nucleotide polymorphisms) or to intra-chromosome gene orientation patterns, for instance. In both cases we have been able to discriminate these kinds of sequences and to use Machine Learning to predict external properties (biological function, etc.) of sequences with the same letter frequencies but different sequence patterns.
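As a generic illustration of how a sequence-dependent (conditional/Markov) entropy separates these strings — this is not the exact Shk definition used in S2SNet, only a first-order conditional entropy sketch:

from collections import Counter
from math import log2

def first_order_conditional_entropy(word):
    # H(X_t | X_{t-1}) estimated from bigram counts:
    # sum over pairs (a, b) of p(a, b) * log2(p(a) / p(a, b)).
    pairs = list(zip(word, word[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (a, b), c in pair_counts.items():
        p_ab = c / n
        p_a = first_counts[a] / n
        h += p_ab * log2(p_a / p_ab)
    return h

for w in ("11110000", "10101010", "11000011"):
    print(w, round(first_order_conditional_entropy(w), 4))
# prints three different values, unlike the frequency-based entropy above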