I am trying to analyse the similarity of short protein sequences (30-150 aa) to characterized proteins using NCBI BLAST. Typical E-values for my queries range between 0.0001 and 0.97. Following the E-value definition on the NCBI website (see the link), would it be reasonable to consider all queries with an E-value below 1 as potentially relevant to some biological function? It seems that fewer than one hit per query expected by chance alone is an acceptable situation, especially when a query with a top E-value of e.g. 0.3 has more than one hit. If not, what threshold would you recommend? The goal is to shortlist biologically meaningful protein models. Thank you.
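
To make the "hits by chance" reading concrete, here is a minimal sketch of how an E-value maps to the probability of seeing at least one chance hit with an equal or better score. It assumes the Poisson relation P = 1 - exp(-E) from the BLAST statistics documentation; the function name `evalue_to_pvalue` is my own and is not part of any BLAST tool.

```python
import math


def evalue_to_pvalue(evalue: float) -> float:
    """Probability of at least one chance hit this good, assuming P = 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-evalue)


# The E-values mentioned in the question (illustrative only).
for e in (0.0001, 0.3, 0.97):
    print(f"E = {e:g} -> P = {evalue_to_pvalue(e):.4f}")
```

Under this assumption, E and P are nearly identical for small values, but an E-value of 0.3 already corresponds to roughly a one-in-four chance of at least one such hit arising at random.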
