I want to detect terpene synthase (TPS) genes in Eleusine coracana (finger millet), whose genome size is 1195.99 Mb, using bioinformatics tools, as they have not been reported yet. The problem I am facing is that the whole-genome sequence is given as numbered scaffolds, and the file starts from scaffold154380. Scaffold 1 is also present. So where does the genome sequence actually start, and why are the scaffolds not in the order scaffold 1, 2, 3, 4, ...?
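To make the ordering question concrete, here is a minimal sketch that lists the scaffold names found in a FASTA file and sorts them by their trailing number. It assumes the headers look like `>scaffold154380` (name first, optional description after a space); the function names are hypothetical. As far as I understand, the order of records in an assembly FASTA file usually carries no biological meaning, so sorting like this only makes the file easier to browse.

```python
import re

def scaffold_number(name):
    """Return the trailing integer of a scaffold name, e.g. 'scaffold12' -> 12."""
    match = re.search(r"(\d+)$", name)
    # Names without a trailing number are placed at the end of the ordering.
    return int(match.group(1)) if match else float("inf")

def sorted_scaffold_names(lines):
    """Collect FASTA header names from an iterable of lines and sort them numerically."""
    names = [line[1:].split()[0] for line in lines if line.startswith(">")]
    return sorted(names, key=scaffold_number)
```

Passing an open file handle (for example a hypothetical `genome.fa`) as `lines` would return the scaffold names in the order scaffold1, scaffold2, and so on.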
As I want to find the motifs (Pfam), I have been translating the genome sequence scaffold by scaffold with the ExPASy tool, which gives me three reading frames. Which reading frame should I take for further analysis?
Moreover, there are 525,627 scaffolds in the sequence. Translating each scaffold and then searching it for the desired motifs would be a tedious job. Is there any tool or server that can do this so that we don't have to paste each scaffold manually?
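To illustrate what batch processing could look like, here is a minimal Python sketch (standard library only) that reads a multi-FASTA file and translates every scaffold in all six reading frames, since double-stranded DNA has three frames on each strand. It assumes the standard genetic code and plain FASTA input; the function names and file paths are hypothetical, and this is only an illustration, not a replacement for a proper gene-prediction pipeline.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def translate(seq):
    """Translate from position 0; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frames(seq):
    """Yield (frame_label, protein) for the three frames on each strand."""
    reverse = seq.translate(COMPLEMENT)[::-1]
    for strand, s in (("+", seq), ("-", reverse)):
        for offset in range(3):
            yield f"{strand}{offset + 1}", translate(s[offset:])

def translate_fasta(in_path, out_path):
    """Write all six-frame translations of every record to a protein FASTA file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for name, seq in read_fasta(src):
            for frame, protein in six_frames(seq):
                dst.write(f">{name}_frame{frame}\n{protein}\n")
```

A call such as `translate_fasta("genome.fa", "genome_6frame.faa")` (both paths are placeholders) would process every scaffold in one pass. The resulting protein file could then be given to a domain-search program such as HMMER with Pfam profiles instead of pasting sequences by hand.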
Please guide me on how to do this work, as my bioinformatics basics are not very strong. I would be much obliged if anyone could help me.