I am currently trying to find homologues of a protein I am working with, but BLAST has been giving me nothing usable. I have now found a dataset of 1500 protein sequences of potential candidates that I want to align to my reference sequence. I have tried Clustal, MEGA, MUSCLE, MAFFT and pretty much everything else under the sun, but with this many sequences and only limited experience I am having trouble getting it to work: the programs simply crash or lock up after a few minutes.

Instead of the traditional multiple sequence alignment, where every sequence gets aligned to every other sequence with multiple iterations, I want all of the sequences from the dataset to only be aligned to my one reference sequence. Think of it as doing 1500 pairwise alignments only. What would be the best way to perform this kind of alignment?
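
To make the idea concrete, here is a rough sketch in plain Python of the "one reference versus everything" approach. It is only an illustration under several assumptions: the function names (`local_alignment_score`, `rank_against_reference`) and the toy sequences are made up, and the scoring (match +2, mismatch -1, linear gap -2) is a placeholder rather than a real substitution matrix such as BLOSUM62. It computes a Smith-Waterman local alignment score for each candidate against the reference and sorts the candidates by that score.

```python
def local_alignment_score(ref, seq, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not a protein
    substitution matrix.
    """
    prev = [0] * (len(seq) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(ref) + 1):
        curr = [0] * (len(seq) + 1)
        for j in range(1, len(seq) + 1):
            diag = prev[j - 1] + (match if ref[i - 1] == seq[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best = curr[j]
        prev = curr
    return best


def rank_against_reference(reference, candidates):
    """Score each candidate against the reference only (no all-vs-all step)."""
    scores = [
        (name, local_alignment_score(reference, seq))
        for name, seq in candidates.items()
    ]
    # Highest-scoring candidates first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data standing in for the reference and the 1500 candidates.
reference = "MKTAYIAKQR"
candidates = {
    "cand_1": "MKTAYIAKQR",
    "cand_2": "MKTAYLAKQR",
    "cand_3": "GGGGGGGG",
}

ranking = rank_against_reference(reference, candidates)
for name, score in ranking:
    print(name, score)
```

Because each candidate is compared to the reference independently, the work grows linearly with the number of sequences instead of quadratically. Pure Python like this would probably be slow for 1500 full-length proteins, so in practice a compiled implementation (for example Biopython's `Bio.Align.PairwiseAligner` with a proper substitution matrix) seems like the more realistic route.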
