Your question is related to the magnification property of the SOM. Exact results are only available for the one-dimensional case, i.e. a neuron chain with one-dimensional inputs. There, as explained by Helge Ritter and later more precisely by Dersch & Tavan, the density of the weights is proportional to the data density raised to the power alpha, where alpha is the magnification exponent (or factor). It should be 1 for optimum information transfer (optimal coding). For the case described above this value lies between 1/3 and 2/3, depending on the (final) neighborhood range used during learning. You can control this behaviour either by the DeSieno method (mentioned in a previous comment by A. Dekker) or by local learning rates for each neuron (see the paper on magnification control by Bauer, Der, and Herrmann). For the higher-dimensional case only general experimental results are known (see Merenyi et al., 'Explicit Magnification Control of Self-Organizing Maps for "Forbidden" Data', IEEE Trans. Neural Networks, 2007).
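For illustration, here is a minimal numpy sketch (not taken from the papers cited above) that trains a 1-D neuron chain on data with a known density and then fits the magnification exponent from the spacing of the trained weights; the data distribution, decay schedules, and number of units are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data with known non-uniform density p(x) = 2x on [0, 1]
# (sampled by inverse-CDF: x = sqrt(u) for uniform u)
data = np.sqrt(rng.random(20000))

# Plain 1-D SOM (neuron chain) with a Gaussian neighbourhood
n = 50
w = np.sort(rng.random(n))                 # initial codebook
idx = np.arange(n)
for t, x in enumerate(data):
    frac  = t / len(data)
    lr    = 0.5 * (0.01 / 0.5) ** frac     # learning-rate decay
    sigma = 5.0 * (0.5 / 5.0) ** frac      # neighbourhood-range decay
    bmu = np.argmin(np.abs(w - x))
    h = np.exp(-0.5 * ((idx - bmu) / sigma) ** 2)
    w += lr * h * (x - w)

# Local weight density ~ 1 / spacing of neighbouring weights; compare it
# with the data density p(x) = 2x at the midpoints and fit the exponent.
ws = np.sort(w)
mid = 0.5 * (ws[1:] + ws[:-1])
w_density = 1.0 / np.diff(ws)
alpha = np.polyfit(np.log(2.0 * mid), np.log(w_density), 1)[0]
print(f"estimated magnification exponent alpha ~ {alpha:.2f}")
```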
If you are not interested in visualization but simply in clustering, another option is the neural gas vector quantizer introduced by T. Martinetz. Its magnification exponent is known in general to be alpha = d/(d+2), where d is the (effective/intrinsic) data dimension (i.e. the Hausdorff dimension, which can be estimated with the Grassberger-Procaccia algorithm). Hence, for larger d this vector quantizer has the desired property. For lower d, DeSieno's method or local learning rates work again (see 'Magnification Control in Self-Organizing Maps and Neural Gas', Neural Computation, 2006).
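A minimal sketch of such a neural gas quantizer (rank-based soft updates; the decay schedules and parameter values below are arbitrary choices, not taken from Martinetz's paper):

```python
import numpy as np

def neural_gas(data, n_units=30, n_iter=20000, seed=0):
    """Minimal online neural gas: rank-based updates of all units."""
    rng = np.random.default_rng(seed)
    w = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        frac = t / n_iter
        lr  = 0.5  * (0.005 / 0.5) ** frac    # learning-rate decay
        lam = 10.0 * (0.5 / 10.0) ** frac     # neighbourhood-range decay
        # Rank all units by distance to x; closer ranks get stronger updates.
        dist = np.linalg.norm(w - x, axis=1)
        rank = np.argsort(np.argsort(dist))
        w += lr * np.exp(-rank / lam)[:, None] * (x - w)
    return w

# e.g. codebook = neural_gas(np.random.default_rng(1).normal(size=(5000, 3)))
```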
That's the standard behaviour for SOMs with the modifications due to DeSieno (see p. 69 of Robert Hecht-Nielsen's book Neurocomputing, Addison-Wesley, 1990).
Since the SOM is a kind of non-linear projection, it already tends to distribute the neurons fairly uniformly (at least in comparison to PCA and similar methods). In my experience it is quite unusual for node counts to differ by more than a ratio of about 2 on average. If you want an even more uniform (or identical) distribution, you can add an additional constraint to the best-matching-unit calculation, as sketched below. But take care: this won't improve the 'quality' of the new SOM, so keep the quality trade-off in mind.
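As an illustration of such a constraint, here is a hypothetical DeSieno-style 'conscience' bias on the BMU selection; the bias factor and frequency-update rate are arbitrary choices:

```python
import numpy as np

def biased_bmu(x, weights, win_freq, conscience=10.0):
    """BMU selection with a 'conscience' bias (a sketch): units that win
    more often than average are penalised, which pushes the per-unit hit
    counts toward a more uniform distribution."""
    n_units = len(weights)
    dist = np.linalg.norm(weights - x, axis=1)
    bias = conscience * (1.0 / n_units - win_freq)  # negative for over-used units
    return np.argmin(dist - bias)

# Inside the training loop the win frequencies would be tracked roughly as
#   win_freq *= (1.0 - beta); win_freq[bmu] += beta     # with beta ~ 1e-4
```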
In the original SOM the number of instances assigned to each unit is not uniformly distributed, and the naive algorithm does not enforce such a constraint (in contrast to size-constrained variants of the k-means algorithm). There are, however, modified SOM algorithms with such constraints. You can check this with some toy datasets, as in the sketch below: the SOM divides the instance space into regions, and one region may contain a crowd of instances while another holds only a couple.
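A quick way to see this on a toy dataset, assuming the third-party minisom package is available (any SOM implementation would do); the blob sizes and map dimensions are arbitrary:

```python
import numpy as np
from minisom import MiniSom   # third-party package, used here only as an example

rng = np.random.default_rng(0)
# Toy data: one dense blob and one sparse blob in 2-D
data = np.vstack([rng.normal(0.0, 0.3, size=(1900, 2)),
                  rng.normal(3.0, 0.3, size=(100, 2))])

som = MiniSom(5, 5, 2, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(data, 10000)

# Count how many instances each unit wins; the counts are far from uniform.
counts = np.zeros((5, 5), dtype=int)
for x in data:
    counts[som.winner(x)] += 1
print(counts)
```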