My dataset consists of highly similar protein sequences. I only want to remove duplicate sequences, if any, from the dataset. What threshold should I set in CD-HIT?
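
To make "duplicate" concrete: removing only exact duplicates means keeping one copy of every sequence that is 100% identical to another, which corresponds to an identity threshold of 1.0 (CD-HIT's `-c` option sets the sequence identity threshold). Note that at that threshold CD-HIT may also absorb shorter sequences that exactly match part of a longer one, so it is worth checking the length and coverage options if only full-length duplicates should go. The sketch below is plain Python rather than CD-HIT itself, and it assumes FASTA input and case-insensitive comparison; the function names `read_fasta` and `remove_exact_duplicates` are hypothetical, chosen for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def remove_exact_duplicates(records):
    """Keep the first record for each distinct sequence (assumption: case-insensitive)."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Small made-up example: p2 is an exact copy of p1, p3 differs by one residue.
example = """>p1
MKTAYIAK
>p2
MKTAYIAK
>p3
MKTAYIAR
""".splitlines()

unique = remove_exact_duplicates(read_fasta(example))
for header, seq in unique:
    print(f">{header}\n{seq}")
```

Here p2 is dropped as an exact copy of p1, while p3 is kept because it differs at one position, even though it is highly similar. Any threshold below 100% identity would start merging near-identical sequences such as p1 and p3.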
