I’m plotting GO terms for my proteomic dataset and am having trouble understanding the p-values reported by DAVID, so I hope someone can help. Briefly, we treated cells with a reagent and looked for a specific PTM, but since we couldn’t enrich for the PTM due to a lack of established enrichment protocols, we ended up with a set of only ~50 modified proteins. I put this set of proteins into DAVID, set the p-value threshold to 0.05, and obtained a list of GO terms.

When I plot this, I’m following the convention of using -log(p.adjust). From my understanding, p.adjust here is the Benjamini-corrected p-value, so I used that. However, most of my -log(p.adjust) values are very low (between 0 and 1). I know that this is most likely due to the small number of proteins in the set. So my question is: are the GO terms that passed the 0.05 threshold statistically significant (since they made the cutoff)? If not, how important is -log(p.adjust) in this case, and how high should these values be to be considered statistically significant?
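
To make the numbers concrete, here is a minimal sketch of how I understand the relationship between the raw p-value, the Benjamini-corrected value, and -log(p.adjust). It assumes the log is base 10 and that the 0.05 threshold in DAVID was applied to the unadjusted p-value rather than the corrected one; the p-values below are made up for illustration and are not from my dataset, and the function names are my own.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def neg_log10(p):
    """-log10(p), the quantity usually plotted on the enrichment axis."""
    return -math.log10(p)

# Hypothetical raw p-values for 8 tested GO terms (illustration only).
raw = [0.001, 0.02, 0.04, 0.3, 0.8, 0.9, 0.5, 0.6]
adj = benjamini_hochberg(raw)

# An adjusted p-value of 0.05 corresponds to about 1.3 on the -log10 scale.
threshold = neg_log10(0.05)

for p, q in zip(raw, adj):
    status = "passes" if neg_log10(q) > threshold else "fails"
    print(f"raw={p:.3f}  adjusted={q:.3f}  -log10(adj)={neg_log10(q):.2f}  {status}")
```

In this toy example the terms with raw p = 0.02 and 0.04 pass an unadjusted 0.05 cutoff but end up below 1.3 on the -log10(p.adjust) scale, which looks like the situation I am seeing. Under the base-10 assumption, a value between 0 and 1 corresponds to an adjusted p-value between 0.1 and 1.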

I know that the GO terms are not an endpoint and should be followed up with experimental data to confirm any connections. I’m just trying to figure out whether the GO terms that made the 0.05 threshold mean anything on their own, or whether they also need a high -log(p.adjust) (I’m unsure how high counts as good enough).

Thank you in advance!
