Because of the large amount of sequence data deposited in the databanks, when I perform a BLAST search I often end up with a list of proteins whose sequences are very similar to one another. I'm wondering if there is a way to reduce the number of very similar sequences. For example, when there is a group of 10 sequences sharing 98% amino acid identity, I would like to automatically keep just one of them and set aside the other 9. This would leave more room in the result list for more (evolutionarily) distant proteins, resulting in broader and less crowded phylogenetic trees.
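To make the kind of filtering I have in mind concrete, here is a minimal Python sketch of a greedy "keep one representative per group" pass. It is only an illustration under some assumptions: the sequences are already aligned to the same length with `-` as the gap character, identity is counted over columns where at least one of the two sequences has a residue, and the function names `percent_identity` and `pick_representatives` are made up for this example. For real datasets a dedicated clustering tool would probably be the better choice.

```python
def percent_identity(a, b):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap; columns where both sequences have a gap
    are ignored, all other columns count towards the denominator.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def pick_representatives(seqs, threshold=98.0):
    """Greedily keep a sequence only if it is below `threshold` percent
    identity to every representative kept so far.

    `seqs` maps a sequence name to its aligned sequence; the first member
    encountered in each group of near-identical sequences is the one kept.
    """
    representatives = {}
    for name, seq in seqs.items():
        if all(percent_identity(seq, rep) < threshold
               for rep in representatives.values()):
            representatives[name] = seq
    return representatives


if __name__ == "__main__":
    # Toy, made-up sequences purely for illustration.
    hits = {
        "hit1": "MKTAYIAKQR",
        "hit2": "MKTAYIAKQR",  # identical to hit1, so it is dropped
        "hit3": "MKTAYIAKQK",  # one difference in ten residues
        "hit4": "AAAAAAAAAA",  # distant sequence, kept
    }
    print(list(pick_representatives(hits, threshold=98.0)))
```

Note that the result of a greedy pass like this depends on the input order, so sorting the hits first (for example by length or by BLAST score) would decide which member of each group survives.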
