Hello,

I have two samples of proteins. The data record the presence (1) or absence (0) of each protein in each sample, so I obtain a table like the one in the image below: each protein is present either in one sample only (0 1 or 1 0) or in both samples (1 1). The names represent my proteins and the clubs represent my samples.

My goal is to find a statistical test that would determine whether the samples, each containing hundreds of proteins, are homogeneous (that is, many proteins are present in both samples) or heterogeneous (few or very few proteins in common). Right now the only thing I can do is compute the percentage shared between the samples (for example, 30/142 in common, which is 21%), but I am being asked for a statistical test that gives a probability, between 0 and 1, that the samples are homogeneous or heterogeneous.
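To make the percent share concrete, here is a minimal Python sketch of that calculation. It is only the descriptive ratio (shared proteins divided by all proteins seen in at least one sample), not a statistical test. The function name `shared_fraction` and the example vectors are hypothetical; the vectors are made up so that they reproduce the 30/142 figure.

```python
def shared_fraction(sample_a, sample_b):
    """Return (n_shared, n_total, fraction) for two 0/1 presence vectors.

    n_shared counts proteins present in both samples (1 1);
    n_total counts proteins present in at least one sample.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("both samples must list the same proteins")
    pairs = list(zip(sample_a, sample_b))
    n_shared = sum(1 for a, b in pairs if a == 1 and b == 1)
    n_total = sum(1 for a, b in pairs if a == 1 or b == 1)
    if n_total == 0:
        raise ValueError("no protein is present in either sample")
    return n_shared, n_total, n_shared / n_total


# Hypothetical data: 30 proteins in both samples, 60 only in the
# first sample, 52 only in the second (142 proteins in total).
sample_a = [1] * 30 + [1] * 60 + [0] * 52
sample_b = [1] * 30 + [0] * 60 + [1] * 52

shared, total, frac = shared_fraction(sample_a, sample_b)
print(f"{shared}/{total} in common ({frac:.0%})")  # prints "30/142 in common (21%)"
```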

Thanks!
