I would say the hyper-parameters are among the most important parts of the model: for example, the number of hidden layers, the number of units per hidden layer, and the regularization strength. These parameters decide whether the model overfits or underfits a specific data set.
An extreme example: if we use a single hidden layer with one hidden unit and set the regularization parameter to 10 million, the model will learn nothing useful.
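To make that extreme case concrete, here is a minimal pure-Python sketch. It uses a one-weight ridge regression as a stand-in for a tiny, heavily regularized network (an assumption for illustration, not a real deep net): a huge L2 penalty drives the learned weight to essentially zero, so the model predicts almost nothing.

```python
# Toy stand-in for a one-unit, heavily regularized model: fit y ~ w * x
# with an L2 penalty lam on the single weight w.
def fit_ridge(xs, ys, lam):
    # Closed-form minimizer of sum((w*x - y)^2) + lam * w^2
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # perfectly linear data, y = 2x

w_plain = fit_ridge(xs, ys, lam=0.0)    # recovers w = 2.0
w_huge  = fit_ridge(xs, ys, lam=1e7)    # w ~ 0: the model outputs almost nothing

print(w_plain, w_huge)
```

With `lam=0` the true slope is recovered exactly; with `lam=1e7` the weight collapses toward zero and every prediction is near zero, i.e. severe underfitting.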
What I believe is that the central challenge in training deep architectures is dealing with the strong dependencies that exist during training between the parameters of different layers (or levels). To address this difficulty we must do two things simultaneously:
- adapt the lower layers so that they provide adequate information for the final adjustment (end of training) of the layers above,
- adapt the layers above so that they make good use of the final adjustment (end of training) of the lower layers.
But this is not easy: the parameters still have to be chosen, and this is only one way of looking at it.
The question is unclear to me: do you want to know the importance of hyperparameters in general, or the role of each hyperparameter? More or less all of them are important, but their impact also depends on the loss or objective function you use. For instance, in image denoising, choosing the right receptive field is very important, but when dealing with action recognition you can start with almost any value and change it trivially as you design the layers. An optimizer such as Adam works better in image denoising and single-image super-resolution, but in most action-recognition cases the SGD or AdaDelta optimizers dominate. There are a lot of other parameters that depend on the network you design, and tuning each of them is necessary in order to achieve optimal results. I would recommend reading the Deep Learning book by Ian Goodfellow et al.; the parameters are explained in detail, be it the use of the dropout ratio or early stopping, and in an easily readable manner.
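To make the optimizer point concrete, here is a minimal pure-Python sketch comparing plain SGD with a hand-rolled Adam update on a toy 1-D quadratic loss. The loss, starting point, and learning rates are illustrative assumptions, not a real vision task; the sketch only shows the update mechanics, not which optimizer is better.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3
    return 2.0 * (w - 3.0)

def sgd(w, lr=0.1, steps=2000):
    # Plain gradient descent: step against the gradient at a fixed rate
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w, lr=0.1, steps=2000, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-parameter step scaled by running moment estimates
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment (uncentered variance)
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_sgd, w_adam = sgd(0.0), adam(0.0)
print(w_sgd, w_adam)   # both end near the minimum at 3.0
```

On a loss this simple both optimizers land near the minimum; the differences the answer describes only show up on the curved, noisy loss surfaces of real networks.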
Hyperparameters are an essential part of any deep network and help you optimize its quality. Changing them helps you get the desired results, or a solution to your problem.
Typical deep learning libraries and frameworks instantiate algorithms with default hyperparameters, so it is always recommended to run a hyperparameter search to find the optimal combination. This is important because fitting an algorithm to different datasets requires different parameter values in order to converge fully and get as close as possible to the global minimum.
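A minimal sketch of such a search, using pure Python and a hypothetical one-weight ridge model with a hand-picked grid (real searches would sweep many hyperparameters of a full network, e.g. with grid or random search): try each candidate value, score it on held-out validation data, and keep the best.

```python
# Hypothetical toy hyper-parameter search: choose the L2 strength lam
# that gives the lowest error on a held-out validation set.
def fit_ridge(xs, ys, lam):
    # Closed-form minimizer of sum((w*x - y)^2) + lam * w^2
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def val_error(w, xs, ys):
    # Sum of squared errors of predictions w * x on held-out data
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))

# Slightly noisy training data (the true relation is y = 2x) and clean
# validation data the search never fits on.
x_tr, y_tr = [1.0, 2.0, 3.0, 4.0], [2.2, 4.1, 6.3, 8.2]
x_va, y_va = [5.0, 6.0], [10.0, 12.0]

grid = [0.0, 0.1, 1.0, 3.0, 10.0]
best_lam = min(grid,
               key=lambda lam: val_error(fit_ridge(x_tr, y_tr, lam), x_va, y_va))
print(best_lam)
```

Here neither the default of no regularization nor a heavy penalty wins: a moderate value in the middle of the grid generalizes best, which is exactly why searching rather than trusting defaults pays off.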