I have never used QM/MM. The basis set to use depends on the phenomenon you want to study. In my last article, I simply used DFT with the 6-31G* basis set. In other articles I've used other basis sets.
In general, the 6-31G(d) basis set is used for biological systems, and LANL-type pseudopotentials are used for transition metals. Moreover, for charged molecules it is necessary to add diffuse functions.
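As a rough illustration of that rule of thumb, here is a minimal Python sketch that assigns a basis set per element. The function name `suggest_basis`, the choice of LANL2DZ as the LANL-type set, and 6-31+G(d) as the diffuse-augmented set are my assumptions for the example, not a recommendation for every system, and the metal list is deliberately incomplete.

```python
# Hypothetical helper: maps each element to a basis-set label following the
# rule of thumb above. The labels are common Pople/LANL names chosen for
# illustration; check them against what your QM program actually supports.

# Assumption: only a handful of transition metals are listed here.
TRANSITION_METALS = {
    "Sc", "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Mo", "Ru", "Rh", "Pd", "Ag", "W", "Pt", "Au",
}


def suggest_basis(elements, total_charge):
    """Return {element: basis label} for the QM region."""
    # Charged species get diffuse functions on the main-group atoms.
    main_group = "6-31+G(d)" if total_charge != 0 else "6-31G(d)"
    basis = {}
    for element in elements:
        if element in TRANSITION_METALS:
            basis[element] = "LANL2DZ"  # pseudopotential plus matching basis
        else:
            basis[element] = main_group
    return basis


if __name__ == "__main__":
    print(suggest_basis(["C", "H", "O", "Fe"], total_charge=-1))
```

The point of keeping this as a lookup is only to make the choice explicit and reproducible in your input-generation scripts; the actual basis set should still be validated for the property you care about.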
When performing QM/MM calculations, it is necessary to place the boundary between the quantum region and the molecular mechanics region away from the reaction site (> 5 bonds) if you use link atoms.
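One way to check this before running anything is to count bonds between the reaction site and each QM/MM cut. The sketch below is only an illustration: the function names, the representation of the topology as a list of atom-index pairs, and the convention of measuring the distance to the QM-side atom of each cut bond are all my assumptions.

```python
from collections import deque


def bond_distances(bonds, sources):
    """Breadth-first search over the bond graph.

    Returns {atom: number of bonds to the nearest source atom} for every
    atom reachable from the sources.
    """
    neighbours = {}
    for a, b in bonds:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        atom = queue.popleft()
        for nxt in neighbours.get(atom, ()):
            if nxt not in dist:
                dist[nxt] = dist[atom] + 1
                queue.append(nxt)
    return dist


def boundary_too_close(bonds, reaction_atoms, cut_bonds, min_bonds=5):
    """List cut bonds whose QM-side atom is min_bonds or fewer bonds from
    the reaction site. Each cut bond is given as (qm_atom, mm_atom)."""
    dist = bond_distances(bonds, reaction_atoms)
    too_close = []
    for qm_atom, mm_atom in cut_bonds:
        d = dist.get(qm_atom)  # None if not bonded to the reaction site at all
        if d is not None and d <= min_bonds:
            too_close.append((qm_atom, mm_atom, d))
    return too_close


if __name__ == "__main__":
    # Toy linear chain 0-1-2-...-9 with the reaction at atom 0.
    chain = [(i, i + 1) for i in range(9)]
    print(boundary_too_close(chain, [0], [(3, 4), (6, 7)]))
```

In the toy chain, the cut at bond 3-4 is flagged because atom 3 is only three bonds from the reaction site, while the cut at 6-7 passes.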
I think GROMACS had an interface with ORCA, but I don't know how user-friendly it is. I've always used ORCA with pDynamo. http://www.pdynamo.org
A double-zeta basis set is a reasonable compromise between accuracy and speed. You can later perform single-point calculations with a larger basis set. But remember that other settings can influence the result to a much larger extent than the basis set or the DFT functional: the size of the QM region and the starting snapshot come to mind.
To interface GROMACS with Gaussian, you will need the Gaussian source code so that you can modify and compile it accordingly. Look at the bottom of Gerrit Groenhof's web page (http://wwwuser.gwdg.de/~ggroenh/qmmm.html); he shows how to do this.