I have downloaded a dataset of 2086 proteins from RCSB using certain criteria, and I want to remove the redundant protein sequences. My aim is to remove the PDB IDs whose protein sequences are almost identical (differing by only two or three mutations). Does anyone know how to do this?
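
To make the criterion concrete, here is a minimal sketch of one way such a filter could look. It is only an illustration under several assumptions: the sequences are already loaded as (PDB ID, sequence) pairs, "two or three mutations" means point substitutions only (so only sequences of equal length are compared, and insertions or deletions are not detected), and the first entry seen in each near-identical group is the one kept. The function names, the `max_mismatches` parameter, and the toy records are hypothetical. For a real dataset, dedicated sequence clustering tools such as CD-HIT or MMseqs2 are commonly used for this kind of redundancy reduction.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(records, max_mismatches=3):
    """Greedy filter over (pdb_id, sequence) pairs.

    A record is dropped if an already-kept sequence of the same length
    differs from it at max_mismatches positions or fewer. Sequences of
    different lengths are never treated as duplicates (assumption:
    substitutions only, no insertions or deletions).
    """
    kept = []
    for pdb_id, seq in records:
        is_redundant = any(
            len(seq) == len(kept_seq)
            and hamming(seq, kept_seq) <= max_mismatches
            for _, kept_seq in kept
        )
        if not is_redundant:
            kept.append((pdb_id, seq))
    return kept


# Hypothetical toy records, not real PDB entries.
records = [
    ("1ABC", "MKTAYIAKQR"),
    ("2DEF", "MKTAYIAKQK"),   # one substitution relative to 1ABC
    ("3GHI", "GSHMLEDPVA"),   # unrelated sequence of the same length
    ("4JKL", "MKTAYIAKQRQ"),  # different length, so not compared
]

unique = remove_near_duplicates(records)
unique_ids = [pdb_id for pdb_id, _ in unique]
print(unique_ids)
```

With about 2000 sequences the all-against-all comparison in this sketch is still cheap, but it scales quadratically, which is one reason the clustering tools mentioned above are usually preferred for larger sets.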
