For example, when approximating a 2-D function (a 2-D Gaussian distribution), is a neural net with 2 hidden layers of 3 neurons each a much better approximator than a neural net with a single hidden layer of 30 neurons (at least for the function I attached)?
The shape of the function is attached.
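
To make the comparison concrete, here is a minimal sketch of the two architectures in terms of size only. It assumes fully connected layers with biases, 2 inputs and 1 output, and an unnormalized isotropic Gaussian as the target; the names `gaussian_2d`, `param_count`, and the `sigma` parameter are my own placeholders, not the exact function from the attachment.

```python
import math

def gaussian_2d(x, y, sigma=1.0):
    """Unnormalized isotropic 2-D Gaussian centered at the origin (assumed target)."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))

def param_count(layer_sizes):
    """Weights plus biases of a fully connected net with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

deep = [2, 3, 3, 1]    # 2 inputs, two hidden layers of 3 neurons, 1 output
shallow = [2, 30, 1]   # 2 inputs, one hidden layer of 30 neurons, 1 output

print(param_count(deep))     # 25
print(param_count(shallow))  # 121
```

So under these assumptions the deeper net has roughly a fifth of the trainable parameters of the shallow one, which is part of why the comparison seems worth asking about.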