It is important to state the sequence coverage of an identified protein and the matching peptide details, e.g., experimental vs. calculated peptide masses with the delta mass in ppm. You also need to state the p value threshold set in the search parameters. If you have an FDR setting, you should also state that. The size and details of the database used for searching the data are also necessary, since they affect the false discovery rate of the search results. Since sequence coverage depends on the size of the identified protein, it is more important to state these details than to look for an acceptable value.
One of the top proteomics journals, Molecular and Cellular Proteomics, has guidelines for publishing protein and peptide identification data. See the link below; I have also included the link to the checklist.
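For reference, the delta mass in ppm and the sequence coverage mentioned above can be worked out with a few lines of Python. This is only a minimal sketch: the function names, the toy protein sequence and the peptide list are made up for illustration and do not come from any search engine output.

```python
def ppm_error(experimental_mass, calculated_mass):
    """Delta mass in parts per million, relative to the calculated mass."""
    return (experimental_mass - calculated_mass) / calculated_mass * 1e6


def sequence_coverage(protein, peptides):
    """Percentage of protein residues covered by at least one matched peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy example (hypothetical sequence and peptides, not real data)
protein = "ACDEFGHIKLMNPQRSTVWY"
matched = ["ACDEFGHIK", "STVWY"]

print(round(ppm_error(1000.002, 1000.000), 2), "ppm")
print(sequence_coverage(protein, matched), "% coverage")
```

Because coverage is a percentage of the full protein length, the same number of matched peptides gives a much lower value for a large protein than for a small one, which is why reporting the underlying details matters more than hitting a particular figure.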
When you perform peptide mass fingerprinting using MALDI-MS, is the machine you use capable of performing tandem MS (MS/MS fragmentation), or are you basing your identification and sequence coverage solely on peptide masses?
In the latter case, the mass accuracy (for example, after calibration using tryptic autodigest fragments?) and the number of different peptides matched are important, I guess. MS/MS fragmentation is of great added value and greatly increases the confidence of the peptide/protein ID.
@Eef Dirksen, I'm referring to the second case. Actually, I have yet to ask about the machine used (as we just sent samples to a different lab), but as far as I know identification was based only on peptide masses, and they used the MASCOT server for PMF.
I have some additional questions:
1) Is it really necessary that the first match on the list, having the highest Mascot score, is the correct protein?
2) What if there's more than one protein match (different proteins), all having the same score and the same number of peptides matched? Which one should be chosen, and on what basis?
The protein(s) at the top of the list are the most probable identifications based on the peptide mass data you provided.
Protein homologs or isoforms can be identified based on the peptides that they have in common. It is also good to check which organism they originate from, as you probably know what source you used for your samples. These will all end up in a single protein hit in Mascot, just like you describe.
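To see why homologs or isoforms can tie, it can help to group the candidate proteins by the exact set of peptides that support them. The snippet below is just an illustrative sketch with hypothetical accessions and peptides; it is not how Mascot itself groups hits.

```python
from collections import defaultdict


def group_by_matched_peptides(hits):
    """Group protein accessions that are supported by exactly the same peptides."""
    groups = defaultdict(list)
    for accession, peptides in hits.items():
        # frozenset ignores peptide order, so identical evidence ends up together
        groups[frozenset(peptides)].append(accession)
    return [sorted(accessions) for accessions in groups.values()]


# Hypothetical search result: accession -> matched peptides
hits = {
    "PROT_A_HUMAN": ["LVNELTEFAK", "YLYEIAR"],
    "PROT_A_BOVIN": ["YLYEIAR", "LVNELTEFAK"],
    "PROT_B_HUMAN": ["AEFVEVTK"],
}

for group in group_by_matched_peptides(hits):
    print(group)
```

Proteins that land in the same group cannot be told apart from the peptide data alone, so the choice between them has to rest on other information such as the source organism.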
If possible, you could share your Mascot search results; maybe that would help with explaining some of the details.
An important consideration for your question is whether the sample you are analyzing comes from a model organism or a non-model organism, because this greatly changes the game, especially as you are running automated algorithms for peptide matching against databases.
Does your organism have a good genomic/transcriptomic/proteomic database?
In order to improve confidence, you may additionally perform digestions with different proteases (if you have enough protein to start with), e.g. first digestion with trypsin, second with Lys-N or Glu-C (or others). It is also possible to perform a chymotryptic digestion first (acquire a first MS spectrum with this!) and then, with the same sample, a tryptic digestion (second MS spectrum). This can decrease peptide size, reducing problems with long peptides (other combinations of proteases in such a sequential approach are of course possible, too).
All this may help you to increase the sequence coverage and, if you are lucky, can give you the chance to differentiate between different isoforms, find protein modifications, etc.
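A quick in silico digestion shows how a second protease can add coverage. The sketch below makes several simplifying assumptions: trypsin cleaves after K or R unless the next residue is P, Glu-C cleaves only after E, there are no missed cleavages, and only peptides of 6 to 30 residues are counted as detectable. The protein sequence is arbitrary and all names are hypothetical.

```python
import re

# Simplified cleavage rules (assumptions for illustration only)
RULES = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K/R, not before P
    "glu_c": r"(?<=E)",            # after E
}


def digest(protein, enzyme, min_len=6, max_len=30):
    """Fully cleaved peptides (no missed cleavages) within a length window."""
    pieces = re.split(RULES[enzyme], protein)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def covered_positions(protein, peptides):
    """Set of residue indices covered by at least one peptide."""
    positions = set()
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            positions.update(range(start, start + len(pep)))
            start = protein.find(pep, start + 1)
    return positions


# Arbitrary example sequence, not a real protein
protein = "MSTEKLVAGRPDEFWQKHLNEIYRGTSAEVKDLMPRCEWFTK"

tryptic = covered_positions(protein, digest(protein, "trypsin"))
glu_c = covered_positions(protein, digest(protein, "glu_c"))
combined = tryptic | glu_c

for label, positions in [("trypsin", tryptic), ("Glu-C", glu_c), ("combined", combined)]:
    print(f"{label}: {100.0 * len(positions) / len(protein):.1f}% coverage")
```

Peptides that are too short or too long under one enzyme are often split differently by the other, so the combined coverage can exceed either digestion alone.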
In rare cases it can also help to change the matrix used for MALDI MS, e.g. using 2,5-dihydroxybenzoic acid rather than the usually employed CHCA. Very good alternative: Chloro-CHCA!
The hints given by Hediye in the first answer are definitely something you should take into account!