Dear Colleagues,

I am seeking guidance on a specific issue regarding the output of a BLAST analysis. In the results, I frequently encounter multiple hits whose IDs share the same base identifier and differ only in a numeric suffix, such as ".1", ".2", etc. For example, I may see both Solyc09G000462|Solyc09T000462.1 and Solyc09G000462|Solyc09T000462.2.

After examining these gene IDs, I have observed that they possess similar protein sequences. My question is: which gene ID should I consider as the primary or representative sequence for further analysis? Should I retain all instances or prioritize only one?
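
To make the question concrete, here is a minimal Python sketch of one option I could apply: group the hits by the part of the ID before the "|" and keep a single entry per group. The rule used here (keep the lowest numeric suffix) is only an assumption for illustration, not an established convention, and the function names are hypothetical.

```python
from collections import defaultdict


def group_by_gene(hit_ids):
    """Group hit IDs of the form 'GENE|TRANSCRIPT.N' by the part before '|'."""
    groups = defaultdict(list)
    for hit_id in hit_ids:
        gene_id = hit_id.split("|", 1)[0]
        groups[gene_id].append(hit_id)
    return dict(groups)


def pick_lowest_suffix(hit_ids):
    """Assumed rule: per gene, keep the hit with the smallest numeric suffix."""
    return {
        gene: min(ids, key=lambda h: int(h.rsplit(".", 1)[1]))
        for gene, ids in group_by_gene(hit_ids).items()
    }


# The two example IDs from my BLAST output.
hits = [
    "Solyc09G000462|Solyc09T000462.2",
    "Solyc09G000462|Solyc09T000462.1",
]
representatives = pick_lowest_suffix(hits)
print(representatives)
```

Whether the lowest suffix is a sensible choice, or whether another criterion (or keeping all entries) would be more appropriate, is exactly what I am unsure about.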

Your insight on this matter would be greatly appreciated.
