I am running a multiple sequence alignment with Clustal Omega on 600 COVID DNA sequences, each about 30,574 characters long. I am running it from the Windows cmd prompt. I set --maxseqlen 37000 on the command line; however, every run produces output in which each sequence is about 75,000 characters long, roughly double the input length.

clustalo.exe -i Allseq.fasta --is-profile --use-kimura --seqtype DNA --maxseqlen 37000 --threads 8 -o myclustalv3.fasta

How can I solve this problem, and which input parameters should I set? The problem only occurs with the full set of 600 sequences. In a test with 40 sequences the output length was normal, around 32,000 characters per sequence, but when I align all 600 sequences together I get this doubled length.
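
To narrow this down, it may help to check whether the extra length is gap characters or actual duplicated bases. In an aligned FASTA file every row is padded with "-" to the same number of columns, so the aligned length can be larger than the input length. Below is a minimal Python sketch for that check. The function names (read_fasta, length_summary, report) are my own and not part of Clustal Omega, and it assumes plain FASTA files where "-" marks a gap.

```python
import sys
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def length_summary(records):
    """Return (header, total length, length without '-' gaps) per record."""
    rows = []
    for header, seq in records:
        gaps = seq.count("-")  # assumption: '-' is the only gap character
        rows.append((header, len(seq), len(seq) - gaps))
    return rows


def report(path):
    """Print the range of total and ungapped lengths found in one file."""
    rows = length_summary(read_fasta(Path(path).read_text()))
    if not rows:
        print(f"{path}: no FASTA records found")
        return
    totals = [total for _, total, _ in rows]
    ungapped = [bases for _, _, bases in rows]
    print(f"{path}: {len(rows)} sequences")
    print(f"  total length    min={min(totals)} max={max(totals)}")
    print(f"  ungapped length min={min(ungapped)} max={max(ungapped)}")


if __name__ == "__main__":
    # Example: python check_lengths.py Allseq.fasta myclustalv3.fasta
    for fasta_path in sys.argv[1:]:
        report(fasta_path)
```

If the ungapped lengths in myclustalv3.fasta still match the input (around 30,000) while the total length is around 75,000, the extra characters are gap columns rather than repeated sequence, which would suggest that some of the 600 sequences align poorly with the rest.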
