I don't think sequence similarity is the most important criterion in this case. Given that the goal is to perform HPLC, you should pay more attention to the physicochemical similarity between these proteins, e.g. solubility, solvent-accessible surface area, hydrophobicity, etc.
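
As a rough illustration of comparing proteins by hydrophobicity rather than by alignment, here is a minimal sketch that computes the GRAVY score (the mean Kyte-Doolittle hydropathy over all residues) for each sequence. The function name `gravy` and the two toy sequences are made up for the example, and GRAVY is only one crude proxy; it will not by itself predict retention on a given column.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy (GRAVY) of a protein sequence.

    Only the 20 standard one-letter amino acid codes are accepted.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    try:
        return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None


# Hypothetical toy sequences, for illustration only.
proteins = {
    "protein_A": "MKTLLVAGAV",
    "protein_B": "MDEKRRSNQE",
}

for name, seq in proteins.items():
    print(f"{name}: GRAVY = {gravy(seq):+.2f}")
```

A higher (more positive) GRAVY means a more hydrophobic sequence on average. Two proteins with similar scores are not guaranteed to behave alike in HPLC, but a large difference is a hint that they may not.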