I think you ran a long genomic (DNA) sequence through GeneMark, and it produced the 'data.lst' file with the predicted proteins. If so, you should extract those protein sequences from the file in FASTA format and then run BLASTp with them against the 'nr' database to find homologues of each predicted protein.
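As a rough sketch of those two steps in Python: the layout of 'data.lst' is not shown here, so the code below does not try to parse it and instead assumes you have already pulled out (identifier, sequence) pairs. The function names, the example gene IDs and sequences, and the output file names are all made up for illustration. The BLAST part assumes the NCBI BLAST+ command-line tools are installed; `-remote` sends the search to NCBI's servers, so a local copy of nr is not needed.

```python
import shlex


def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text."""
    lines = []
    for identifier, sequence in records:
        # Remove internal whitespace and a trailing stop marker, if any.
        cleaned = "".join(sequence.split()).rstrip("*")
        if not cleaned:
            continue  # skip empty predictions
        lines.append(f">{identifier}")
        # Wrap the sequence into fixed-width lines.
        lines.extend(cleaned[i:i + width] for i in range(0, len(cleaned), width))
    return "".join(line + "\n" for line in lines)


def blastp_command(query_path, out_path, evalue=1e-5):
    """Build a BLAST+ blastp command that searches nr remotely."""
    return [
        "blastp",
        "-query", query_path,
        "-db", "nr",
        "-remote",              # run the search on NCBI servers
        "-evalue", str(evalue),
        "-outfmt", "6",         # tabular output, one hit per line
        "-out", out_path,
    ]


# Hypothetical records standing in for what you extract from 'data.lst'.
example_records = [("gene_1", "MKTAYIAKQR"), ("gene_2", "MSLLTEVETY*")]
print(to_fasta(example_records), end="")
print(shlex.join(blastp_command("predicted_proteins.faa", "blastp_hits.tsv")))
```

You would write the FASTA text to a file such as `predicted_proteins.faa` and then run the printed command in a shell, or pass the list to `subprocess.run`. With many predicted proteins, a remote search can be slow, so a local nr database or the BLAST web interface may be more practical.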