Usually the number of hidden layers, the optimizer, the activation functions, etc. are all considered hyper-parameters (HPs) of a specific neural network model. For a given model architecture (sequential or non-sequential), one can search for the best values of these HPs through some HP tuning mechanism, evaluated against one or more accuracy measures.
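To make the idea of an "HP tuning mechanism" concrete, here is a minimal sketch of an exhaustive grid search, which is only one of several possible mechanisms. Everything named in it is an assumption for illustration: the `search_space` values are hypothetical, and `validation_score` is a made-up stand-in for "train the model with this configuration and measure accuracy on held-out data", so that the sketch runs without any deep learning library.

```python
from itertools import product

# Hypothetical search space: the names and candidate values are illustrative only.
search_space = {
    "hidden_layers": [1, 2, 3],
    "optimizer": ["sgd", "adam"],
    "activation": ["relu", "tanh"],
}


def validation_score(config):
    """Stand-in for training a model with `config` and returning its
    validation accuracy. The numbers are invented for this sketch."""
    score = 0.70
    score += 0.05 * (2 - abs(config["hidden_layers"] - 2))
    score += 0.03 if config["optimizer"] == "adam" else 0.0
    score += 0.02 if config["activation"] == "relu" else 0.0
    return score


def grid_search(space, score_fn):
    """Evaluate every combination in `space` and keep the best-scoring one."""
    names = list(space)
    best_config, best_score = None, float("-inf")
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = grid_search(search_space, validation_score)
print(best_config, round(best_score, 3))
```

In practice `validation_score` would be replaced by a real train-and-evaluate routine, and the grid could be swapped for random or Bayesian search when the space is too large to enumerate.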