I read that we cannot compare the p-values of different studies: as long as they are below 0.05, their exact value does not matter. However, this seems to contradict the classification of results as significant, highly significant, and very highly significant. Any ideas?
Instead of comparing p-values, you should compare the effect sizes. For correlations, for example, May and Hittner (1997) compare different tests: http://www.tandfonline.com/doi/abs/10.1080/00220973.1997.9943458#.VLFh1nv3I-o
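As a rough illustration, here is a minimal Python/SciPy sketch of one standard way to compare two correlations from independent samples (Fisher's z-transformation). The numbers are made up, and this is only one of several possible tests, not necessarily one of those evaluated in the cited paper:

```python
import numpy as np
from scipy import stats

def compare_independent_correlations(r1, n1, r2, n2):
    """Test whether two correlations from independent samples differ,
    using Fisher's z-transformation."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)        # Fisher z-transform
    se = np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = 2 * stats.norm.sf(abs(z))                  # two-tailed p-value
    return z, p

# Made-up example: r = .40 (n = 100) vs. r = .25 (n = 120)
z, p = compare_independent_correlations(0.40, 100, 0.25, 120)
print(f"z = {z:.2f}, p = {p:.3f}")
```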
I have my doubts whether the classification of p-values makes any sense at all. p-values depend on sample size. I would rather look at a combination of power and effect size to evaluate whether your results are "highly significant".
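To illustrate the sample-size dependence, here is a small Python/SciPy sketch (my own illustration, not part of the answer above): the same observed effect size, Cohen's d = 0.3, yields very different p-values depending only on how many subjects were collected.

```python
import numpy as np
from scipy import stats

# Fixed observed effect size (Cohen's d = 0.3), two equal groups of size n:
# the two-tailed p-value of the t-test changes with n alone.
d = 0.3
for n in (20, 50, 100, 500):
    t = d * np.sqrt(n / 2.0)        # t statistic for two equal groups
    df = 2 * n - 2
    p = 2 * stats.t.sf(abs(t), df)  # two-tailed p-value
    print(f"n per group = {n:4d}   p = {p:.4f}")
```

With d fixed at 0.3, p goes from roughly 0.35 at n = 20 per group to well below 0.001 at n = 500.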
Heba, I think that is a very interesting question. I do not think I can give a clear answer, but I may be able to sharpen the problem, which may, I hope, help with the answer. As others have correctly pointed out, the P-value is a function of the sample size. On the other hand, that does not answer a question like: "If the P-values from two independent studies that used identical sample sizes are (say) 0.01 and 0.002, respectively, can we say the result of the latter study is MORE significant?" (I do not know whether your question includes what I have asked, but in any case it seems a valid question.) In this case, if we convert each P-value to an effect size, the latter effect size is larger than the former. Hence, we seem to be comparing the P-values "indirectly".
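As a rough sketch of the kind of conversion meant here, assuming two-tailed p-values, identical sample sizes of N = 100 (a made-up number), and the common meta-analytic approximation r = z / sqrt(N):

```python
import numpy as np
from scipy import stats

def p_to_r(p_two_tailed, n):
    """Convert a two-tailed p-value to an approximate effect size r
    via r = z / sqrt(N) (assumes a roughly normal test statistic)."""
    z = stats.norm.isf(p_two_tailed / 2.0)  # z matching the two-tailed p
    return z / np.sqrt(n)

for p in (0.01, 0.002):
    print(f"p = {p:.3f}  ->  r ~ {p_to_r(p, 100):.2f}")
# p = 0.010  ->  r ~ 0.26
# p = 0.002  ->  r ~ 0.31
```

With the sample size held fixed, the smaller p-value corresponds to the larger effect size, which is the sense in which the comparison is "indirect".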
The problem, in my opinion, is related to the meaning of the P-value, namely P(observation at least this extreme | H0 is true). Though we repeat this definition over and over again, we still (somewhere deep within) believe that the P-value is approximately P(H0 is true | observation). Hence, when the P-value is below a threshold (call it 0.05, 0.01, anything), we use the second (wrong) definition to justify believing that H0 is false (and H1 true), and comparing two P-values that are both below the threshold (the falseness of H0 being taken as already "proved" in both cases) would then be meaningless. However, if we do not adopt this view, then comparing P-values (for the same sample size in each comparison; thanks to James for this correction) seems plausible to me.
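To make the distinction between the two conditional probabilities concrete, here is a toy Bayes-rule calculation in Python; the prior and power figures are arbitrary assumptions chosen purely for illustration:

```python
# Toy Bayes-rule calculation: P(observation | H0) is not P(H0 | observation).
alpha = 0.05        # P(significant result | H0 true)
power = 0.80        # P(significant result | H0 false)
prior_h0 = 0.90     # assumed prior probability that H0 is true (arbitrary)

p_significant = alpha * prior_h0 + power * (1 - prior_h0)
posterior_h0 = alpha * prior_h0 / p_significant
print(f"P(H0 | significant result) = {posterior_h0:.2f}")  # ~ 0.36, not 0.05
```

Even with a "significant" result at the 0.05 level, the probability that H0 is true given the observation can remain far from 0.05, which is exactly why the two definitions must not be conflated.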