Hi,

Let's use a concrete example to explain my question: suppose we have a universe of genes that includes both oncogenes (OG) and tumor suppressors (TS), and the goal is to test whether a method that selects some of these genes does so non-randomly, i.e. whether the selected set is enriched for a given class. Here, the "method" refers to a highly selective approach that is capable of identifying cancerous behavior: it tries to pick up as many oncogenes as possible while avoiding TS. One possible way to assess the predictive ability of this method is the hypergeometric p-value; for oncogenes, this works perfectly. For TS, however, it is a bit misleading (at least to me), because as the "number of successes in the sample" drops, the p-value increases. Hence, a low number of TS in the sample (which clearly reflects the high predictive power of the method) wrongly yields an insignificant hypergeometric p-value.

So, my question is: how can this situation be reconciled, and how should the test be modified? Is it reasonable to report 1 - calculated_p_value?
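
To make the quantities in question concrete, here is a minimal sketch in plain Python. All numbers (universe size, number of TS, sample size, observed count) are made up for illustration, and the helper names are hypothetical. It computes the usual enrichment p-value P(X >= k), the lower-tail (depletion) p-value P(X <= k), and 1 - P(X >= k). As far as I understand, 1 - P(X >= k) equals P(X <= k - 1), so it leaves out the probability of the observed count itself and is therefore not the same as the lower tail.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k marked genes in a sample of n, drawn without
    replacement from N genes of which K are marked."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def upper_tail(k, N, K, n):
    """Enrichment p-value P(X >= k)."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))

def lower_tail(k, N, K, n):
    """Depletion p-value P(X <= k)."""
    lowest = max(0, n - (N - K))  # smallest count that is possible at all
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))

# Hypothetical numbers, for illustration only.
N = 1000     # genes in the universe
K_ts = 100   # tumor suppressors in the universe
n = 50       # genes selected by the method
k_ts = 1     # tumor suppressors among the selected genes

p_enrich = upper_tail(k_ts, N, K_ts, n)    # large when TS are avoided
p_deplete = lower_tail(k_ts, N, K_ts, n)   # small when TS are avoided
one_minus = 1 - p_enrich                   # equals P(X <= k_ts - 1)

print(f"P(X >= k)     = {p_enrich:.4g}")
print(f"P(X <= k)     = {p_deplete:.4g}")
print(f"1 - P(X >= k) = {one_minus:.4g}")
```

With these made-up numbers the expected TS count under random selection is 5, so observing only 1 gives a large upper-tail p-value and a small lower-tail one. The gap between the last two printed values is exactly P(X = k).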

Thanks in advance.
