Normalization maps the data to a uniform scale. For instance, when the inputs to an ANN are on widely different scales, normalization is typically applied so that every input feature falls within the same range of values. Several standard normalization techniques are available for this, such as min-max, softmax, z-score, decimal scaling, and Box-Cox (this list is not exhaustive; many more techniques are in use). As far as I know, the min-max technique preserves all the relationships among the original data values exactly. Z-score is often used when variables are measured on scales of different magnitude. But both techniques are sensitive to outliers in the data.
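For concreteness, here is a minimal NumPy sketch of the two techniques I mentioned (the helper names, target ranges, and the toy data with an outlier are my own, just for illustration):

```python
import numpy as np

def min_max(x, lo=0.0, hi=1.0):
    # Linearly rescale x to [lo, hi]; preserves the relative
    # spacing of the original values exactly.
    # (Assumes x is not constant, so max - min != 0.)
    x = np.asarray(x, dtype=float)
    return lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

def z_score(x):
    # Center x at mean 0 and scale to unit standard deviation.
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

data = np.array([2.0, 4.0, 6.0, 8.0, 100.0])  # note the outlier
print(min_max(data))             # squashed into [0, 1] by the outlier
print(min_max(data, -1.0, 1.0))  # e.g. to match a [-1, 1] activation range
print(z_score(data))             # mean and std are also pulled by the outlier
```

Running this on the toy data shows what I mean about outliers: the single large value compresses the other min-max-scaled points near the bottom of the range, and it also inflates the mean and standard deviation used by z-score.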
Is there a general guideline for choosing the appropriate normalization technique for a particular application? Should the method be determined solely by the range of the input features (i.e., to remove scaling effects)? Does it also depend on the choice of activation function (logsig [0, 1], tansig [-1, 1], etc.)? And does it depend on the type of problem being solved (classification, function approximation, prediction, time-series forecasting, etc.)?