I have a FASTA file with a few hundred dissimilar protein sequences. They all come from homologous proteins and share a conserved domain, but outside of that domain they are dissimilar. I was wondering whether those dissimilarities could have arisen from frameshifts in the underlying DNA sequences. I could probably find this out by running a bunch of tblastn queries, but is there an easier way to check for pairs of sequences that are dissimilar at the protein level yet would have similar, but frame-shifted, DNA sequences?
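To make the idea concrete, here is a rough Python sketch of the kind of check I have in mind. It assumes the coding DNA for each protein is available (my file only has the protein sequences), uses the standard genetic code, looks only at the two shifted forward frames, and scores similarity with a crude string ratio rather than a proper alignment. The function names `translate` and `frameshift_similarity` are made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a DNA string starting at offset `frame` (0, 1 or 2).

    Unknown codons (e.g. containing N) become 'X'; a trailing partial
    codon is ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def frameshift_similarity(protein, dna):
    """Compare `protein` with the +1 and +2 frame translations of `dna`.

    Returns {frame: ratio}, where ratio is difflib's similarity in [0, 1].
    This is a hypothetical helper, not a substitute for a real alignment.
    """
    return {
        frame: SequenceMatcher(
            None, protein, translate(dna, frame), autojunk=False
        ).ratio()
        for frame in (1, 2)
    }


if __name__ == "__main__":
    # Toy example: frame 0 encodes one peptide, frame 1 a different one.
    dna_a = "ATGGCCAAAGGTTAA"
    print(translate(dna_a, 0))
    print(frameshift_similarity("WPKV", dna_a))
```

A high ratio in frame 1 or 2 for a pair whose in-frame proteins look unrelated would flag that pair for a closer look with a real alignment tool.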
