It is obvious that when you use a dynamic docking algorithm, such a thing will eventually happen. My question is: how can I justify the results and/or avoid this?
More detail, less confidence: the finer the structural detail you read from a docking model, the less you should trust it. In protein-protein docking, I would focus primarily on domain-domain and subdomain-subdomain interactions and on identifying the interfacial amino acid residues. I would then assess the reliability of the model against experimental (mutagenesis/structure/function) data on mutants of those interfacial residues.
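To make the interface-identification step concrete, here is a minimal sketch in Python that lists the residues of two chains that have any pair of atoms within a distance cutoff. The 5 Å cutoff, the chain IDs, and the function names `parse_atoms` and `interface_residues` are my own assumptions for illustration; this is not part of HADDOCK, and it reads only standard fixed-column `ATOM` records.

```python
import math


def parse_atoms(pdb_text):
    """Return (chain, resname, resseq, x, y, z) for every ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # Fixed-column fields of the PDB ATOM record (0-based slices).
            resname = line[17:20].strip()
            chain = line[21]
            resseq = int(line[22:26])
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            atoms.append((chain, resname, resseq, x, y, z))
    return atoms


def interface_residues(pdb_text, chain_a, chain_b, cutoff=5.0):
    """Residues of chain_a and chain_b with any atom pair within `cutoff` (in Å).

    The default cutoff of 5.0 Å is an assumed, illustrative value.
    """
    atoms = parse_atoms(pdb_text)
    side_a = [a for a in atoms if a[0] == chain_a]
    side_b = [b for b in atoms if b[0] == chain_b]
    found = set()
    # Brute-force all-pairs search; adequate for a single docking model.
    for a in side_a:
        for b in side_b:
            if math.dist(a[3:], b[3:]) <= cutoff:
                found.add(a[:3])
                found.add(b[:3])
    return sorted(found)
```

Calling `interface_residues(open("model.pdb").read(), "A", "B")` (with a hypothetical file name) returns a sorted list of `(chain, residue name, residue number)` tuples, which you can then compare with the residues for which mutagenesis data exist.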
As for finer structural details, e.g., amino acid residue rotamers, the question is whether HADDOCK can fully relax the protein-protein complex. Probably not, and your observation supports this. In that case, molecular dynamics simulations may help (i) to relax the complex and (ii) to evaluate the binding strength and the contributions of individual amino acid residues to the binding.