An amazing variety of "Boussinesq" approximations have been developed over the years.
In the application of Boussinesq-type models to non-constant depth (particularly to shoaling waves on beaches), the most common citations include
Peregrine, D.H., 1967. Long Waves on a Beach. J Fluid Mech, 27, 815-827. (See DOI link below).
A general family of approximations that unites and extends several earlier models is:
Nwogu, O. 1993. Alternative Form of Boussinesq Equations for Nearshore Wave Propagation. Journal of Waterway, Port, Coastal and Ocean Engineering, ASCE, 119(6), 618-638. (See link to Nwogu's home page below.)
This paper includes useful citations for approximations that are not included in the family of Taylor series expansions that Nwogu uses.
Nonhydrostatic effects are important when the vertical acceleration is non-negligible. Boussinesq (1872) gave a mathematical description of this kind of equation, in which the vertical velocity component is included in the integrated equations of motion. Contributions for barotropic, homogeneous water have been published over the last 50 years (e.g. Peregrine, 1967; Abbott et al., 1978; Hauguel, 1980; Shapper, 1986).
Boussinesq J., 1872. Théorie des ondes et des remous qui se propagent le long d'un canal rectangulaire horizontal, en communiquant au liquide contenu dans ce canal des vitesses sensiblement pareilles de la surface au fond. J. Math. Pures Appl., vol. 17, pp. 55-108.
Abbott M.B., Petersen H.M., Skovgaard, 1978. On the Numerical Modelling of Short Waves in Shallow Water. Journal of Hydraulic Research, vol. 16, pp. 173-203.
Peregrine D.H., 1967. Long waves on a beach. Journal of Fluid Mechanics, vol. 27, pp. 815-827.
Shapper H., 1986. Ein Beitrag zur numerischen Berechnung von nichtlinearen kurzen Flachwasserwellen mit verbesserten Differenzenverfahren. Report No. 21, Institut fuer Stroemungsmechanik und Elektronisches Rechnen im Bauwesen, Universitaet Hannover.
Jager (2006) is a nice reference, with a historical description and a comparison between the Boussinesq and KdV works. For the history, which is stressed in the question, it is a good reference.
But I do agree that the central reference when talking about dispersion and the derivation of the Boussinesq model is Peregrine (1967). The references on dispersion and dispersive models are countless, so I will mention just three concerning improvements to Peregrine's model that have appeared over the last two decades.
Madsen, Murray and Sorensen (1991). A new form of the Boussinesq equations with improved linear dispersive characteristics. Coast Eng, 15, 371-388.
Bellotti and Brocchini (2002). On using Boussinesq-type equations near the shoreline: A note of caution. Ocean Eng., 29: 1569-1575.
Antuono, Liapidevskii and Brocchini (2009). Dispersive non-linear shallow-water equations. Studies in Applied Mathematics, 122: 1-28.
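To make "improved linear dispersive characteristics" concrete, here is a minimal sketch comparing linear phase speeds. It assumes the commonly quoted forms of the linear dispersion relations: c^2/(gh) = tanh(kh)/(kh) for linear wave theory, 1/(1 + (kh)^2/3) for Peregrine's depth-averaged equations, and (1 + B(kh)^2)/(1 + (B + 1/3)(kh)^2) with B = 1/15 for the Madsen et al. (1991) form. The function names are illustrative only, and the formulas should be checked against the papers before being relied on.

```python
import math

def c2_exact(kh):
    """Squared phase speed c^2/(g*h) from linear wave theory."""
    return math.tanh(kh) / kh

def c2_peregrine(kh):
    """Assumed linearised Peregrine (1967) form, depth-averaged velocity."""
    return 1.0 / (1.0 + kh**2 / 3.0)

def c2_madsen(kh, B=1.0 / 15.0):
    """Assumed linearised Madsen et al. (1991) form; B = 1/15 is the
    value usually quoted as giving a Pade-type match to tanh(kh)/kh."""
    return (1.0 + B * kh**2) / (1.0 + (B + 1.0 / 3.0) * kh**2)

if __name__ == "__main__":
    # kh is the dimensionless depth (wavenumber times still-water depth).
    for kh in (0.5, 1.0, 2.0, 3.0):
        exact = c2_exact(kh)
        err_p = abs(c2_peregrine(kh) - exact) / exact
        err_m = abs(c2_madsen(kh) - exact) / exact
        print(f"kh={kh:3.1f}  Peregrine rel. error={err_p:.4f}  "
              f"Madsen rel. error={err_m:.4f}")
```

Under these assumed forms, both approximations agree with linear theory for small kh, and the difference between them grows as kh increases toward intermediate depth.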