Hello, could you please share any interesting research explaining how to choose the number of hidden layers and the number of nodes per layer for regression problems with an ANN? Thank you, any help would be much appreciated.
Typically, you would use an automated procedure such as grid search cross-validation or randomized search cross-validation to tune the hyperparameters of your regressor (in this case, the number of hidden layers, the number of neurons in each hidden layer, the activation function of the neurons, etc.).
Have a look at the documentation for grid search cross-validation in Scikit Learn:
At the bottom of the page you can find several examples on how to apply this procedure to different datasets and classifiers / regressors.
Basically, you pass the function your feature matrix and target, together with the range of values you want to evaluate for each hyperparameter. The function then evaluates each combination of hyperparameters via cross-validation and returns the best-performing set.
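Here is a minimal sketch of what that could look like for a neural network regressor. It assumes scikit-learn's GridSearchCV and MLPRegressor; the synthetic dataset, the candidate layer sizes, the 3-fold split, and the scoring metric are placeholder choices for illustration, not recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for your own feature matrix X and target y (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# Scale the inputs inside the pipeline so each CV fold is scaled on its own training part.
pipeline = make_pipeline(
    StandardScaler(),
    MLPRegressor(max_iter=500, random_state=0),
)

# Candidate values per hyperparameter (illustrative only).
# Each tuple lists the number of neurons in each hidden layer,
# so (10, 10) means two hidden layers with 10 neurons each.
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(10,), (50,), (10, 10), (50, 50)],
    "mlpregressor__activation": ["relu", "tanh"],
}

# Evaluate every combination with 3-fold cross-validation.
search = GridSearchCV(
    pipeline,
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best CV score (negative MSE):", search.best_score_)
```

With 4 layer configurations and 2 activation functions, this fits 8 combinations on each of the 3 folds. For larger grids, RandomizedSearchCV takes a similar set of arguments but samples only a fixed number of combinations.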
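See Kolmogorov's theorem (the Kolmogorov-Arnold representation theorem). It states that any continuous function of several variables on a bounded domain can be written as a finite superposition of continuous one-variable functions and addition, arranged in two nested levels. This is often compared to a neural network with two hidden layers. Note that it is an existence result: it does not tell you how to find those functions or how many nodes a trained network needs in practice.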
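For reference, the theorem is commonly stated as follows for a continuous function f on the unit cube [0,1]^n. The symbols \Phi_q (outer functions) and \phi_{q,p} (inner functions) follow the usual textbook notation rather than anything specific to this thread:

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

The inner sums play the role of a first layer and the outer functions a second one, which is where the analogy with a two-level network comes from.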