How do we design a neural network and decide on hyperparameters such as the depth and the number of neurons in each layer? And how do we decide, precisely, which activation function to use in each layer?
1. You can use rules of thumb and start with the simplest architecture, for example a single hidden layer with half as many neurons as the input layer. Different authors use different rules of thumb to keep the number of parameters small.
2. Another methodology is based on cross-validation and grid search, where you evaluate several hyperparameter settings and compare the resulting performance.
3. There are also alternatives based on optimization strategies, where you define an objective function and search for the parameters that minimize it. For example, this work employs evolutionary algorithms for that purpose: https://scholar.google.ca/citations?view_op=view_citation&hl=th&user=95AliYAAAAAJ&citation_for_view=95AliYAAAAAJ:WF5omc3nYNoC
4. There are now many tools available for automatic tuning, such as MATLAB, which report the results for different configurations.
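The cross-validation and grid-search approach can be sketched with scikit-learn, which lets you search over architectures and activation functions at once. The dataset, layer sizes, and activations below are illustrative assumptions, not recommendations:

```python
# Sketch: grid search over depth, width, and activation with cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Toy dataset standing in for your real data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

param_grid = {
    # Each tuple is one architecture: (10,) = one hidden layer of 10 neurons,
    # (10, 5) = two hidden layers, etc.
    "hidden_layer_sizes": [(5,), (10,), (10, 5)],
    "activation": ["relu", "tanh"],
}

search = GridSearchCV(
    MLPClassifier(max_iter=2000, random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation for every configuration
)
search.fit(X, y)
print(search.best_params_)  # the architecture/activation with the best CV score
```

The same pattern extends to other hyperparameters (learning rate, regularization strength) by adding keys to `param_grid`, at the cost of a combinatorial growth in configurations to evaluate.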
When designing a neural network, first focus on what kind of data you have and what results you expect from that data.
A neural network architecture is basically a structure consisting of input, hidden, and output layers. After the input, you can observe what kinds of patterns the hidden layers capture, and the output is built on those hidden representations.
Next, you can do network surgery: simply add or remove layers from the existing architecture to find the best-performing configuration. Try it on a benchmark dataset first; if the performance is competitive with the state-of-the-art (SOTA) results, it should work on your data as well.
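Network surgery is easiest when the architecture is parameterized, so adding or removing a layer is a one-line change. A minimal PyTorch sketch, where the layer widths are illustrative assumptions:

```python
# Sketch: build an MLP from a list of widths so that adding or removing a
# hidden layer ("network surgery") only changes the list, not the code.
import torch.nn as nn

def build_mlp(widths, activation=nn.ReLU):
    """widths = [input_dim, hidden..., output_dim]."""
    layers = []
    for i in range(len(widths) - 1):
        layers.append(nn.Linear(widths[i], widths[i + 1]))
        if i < len(widths) - 2:  # no activation after the output layer
            layers.append(activation())
    return nn.Sequential(*layers)

base = build_mlp([10, 32, 2])        # one hidden layer of 32 neurons
deeper = build_mlp([10, 32, 16, 2])  # surgery: one extra hidden layer added
print(base)
print(deeper)
```

You would then train each variant on the benchmark dataset and keep the one with the best validation performance.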
Choosing an activation function makes the neural network non-linear, which is essential for propagating useful gradients during training and for learning complex functions.
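The gradient behavior is what distinguishes the common choices in practice. A small NumPy sketch of two standard activations and their derivatives (the sample inputs are arbitrary):

```python
# Sketch: why the activation choice matters for gradients during training.
# Sigmoid saturates for large |x| (vanishing gradient); ReLU passes the
# gradient through unchanged for x > 0.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # at most 0.25, tiny for large |x|

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 0 or 1

x = np.array([-5.0, 0.0, 5.0])
print(sigmoid_grad(x))  # near zero at the extremes: gradients vanish
print(relu_grad(x))     # 1 for positive inputs: gradients flow unchanged
```

This is one reason ReLU-family activations are the usual default for hidden layers, while the output activation is chosen to match the task (e.g. sigmoid or softmax for classification, identity for regression).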
You might consider reading several books to gain a deeper understanding, such as:
Machine Learning: A Probabilistic Perspective by Kevin Patrick Murphy.
Deep Learning by Ian Goodfellow.
Deep Learning for Computer Vision by Rajalingappaa Shanmugamani.