Sigmoid and Tanh functions are rarely used for hidden layers today because of how they behave at the extremes of their range. Both functions saturate near their upper and lower bounds, so their gradients approach zero there; this is the vanishing gradient problem, which hinders effective learning and can make training infeasible in deep networks. Instead, functions such as ReLU are commonly used.
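To make the saturation concrete, here is a minimal sketch (plain NumPy, with illustrative helper names) that evaluates the derivatives of sigmoid, tanh, and ReLU at increasingly large pre-activation values: the sigmoid and tanh gradients collapse toward zero, while the ReLU gradient stays at 1 for positive inputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # peaks at 0.25 when x = 0

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2    # peaks at 1.0 when x = 0

def relu_grad(x):
    return 1.0 if x > 0 else 0.0    # constant 1 for any positive input

# Sigmoid/tanh gradients shrink toward zero as |x| grows; ReLU's does not.
for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}   sigmoid' = {sigmoid_grad(x):.6f}   "
          f"tanh' = {tanh_grad(x):.6f}   relu' = {relu_grad(x):.1f}")
```

During backpropagation these per-layer factors multiply together, so even a handful of saturated sigmoid or tanh layers can shrink the gradient exponentially as it flows toward the early layers.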
"Sigmoid/Logistic and Tanh functions should not be used in hidden layers as they make the model more susceptible to problems during training (due to vanishing gradients). Swish function is used in neural networks having a depth greater than 40 layers."