A universal answer probably does not exist. The z-score works well for data that follow a normal distribution; you can test for normality with the Kolmogorov-Smirnov test (KS test), though it is very strict. Median-based normalization is arguably universal for non-parametric datasets, but outliers can bias the training. Nonlinear normalization (by a sigmoid etc.) solves that problem, yet important information may lie in the outliers themselves; in that case, basic min-max normalization can be the better choice. So I prefer to adjust/transform the input data toward a normal distribution and then use the z-score.
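For the nonlinear option mentioned above, here is a minimal sketch (the helper name and sample values are mine, assuming a logistic squashing of the standardized data, sometimes called softmax scaling):

```python
import numpy as np

def sigmoid_normalize(x):
    """Squash standardized values into (0, 1) with a logistic function.
    Values near the mean map almost linearly; extremes saturate smoothly."""
    z = (x - x.mean()) / x.std()
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([2.0, 3.0, 2.5, 3.1, 2.8, 250.0])   # one extreme value
print(np.round(sigmoid_normalize(x), 3))          # the outlier lands near 0.9, not at a hard boundary
```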
1. Z-score: standard scores are also called z-values, z-scores, normal scores, and standardized variables. The letter Z is used because the normal distribution is also known as the Z-distribution. They are most frequently used to compare a sample to a standard normal deviate, though they can be defined without any assumption of normality. The z-score also underlies the Z-test, as used in standardized testing.
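As a minimal sketch (the sample values are mine for illustration), the z-score is simply z = (x - mean) / std:

```python
import numpy as np

x = np.array([4.0, 7.0, 9.0, 10.0, 15.0])
z = (x - x.mean()) / x.std()   # z = (x - mean) / std
print(np.round(z, 3))          # standardized values
print(z.mean(), z.std())       # -> ~0.0 and 1.0
```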
In statistics, the K-S test is a nonparametric test of the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution (one-sample test) or to compare two samples. The K-S statistic quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The two-sample K-S test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples.
In the case of testing for normality of the distribution, samples are standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using these to define the specific reference distribution changes the null distribution of the test statistic.
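A minimal SciPy sketch of that procedure (the sample data are mine): standardize with the sample estimates, then compare against N(0, 1). Because the reference distribution is fitted from the same sample, the plain KS p-value is only approximate here; the Lilliefors test is the corrected variant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=200)

# Standardize with the sample estimates, then compare to a standard normal.
z = (sample - sample.mean()) / sample.std(ddof=1)
stat, p = stats.kstest(z, "norm")
print(f"KS statistic = {stat:.3f}, p-value = {p:.3f}")
# Caveat: estimating mean/std from the sample changes the null distribution,
# so this p-value is optimistic (see the Lilliefors test for a correction).
```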
Unfortunately, and as Radek Janca said here before, there is neither a general answer nor an "always-to-be-applied" method.
Moreover, normalization is highly dependent on the original data and might not always improve the performance of your classifier or ANN compared to using the original, "raw" data.
My recommendation is to test both options (try some of the methods discussed by other repliers, or other ones), then compare the results and keep whichever gives the best or most realistic outcome.
Normalization means casting a data set to a specific range such as [0, 1] or [-1, +1]. Why do we do that? The answer is to eliminate the influence of one factor (feature) over another. For example, suppose the amount of olives produced lies between 5,000 and 90,000 tons, so its range is [5000, 90000], while temperature ranges from -15 to 49 °C, i.e. [-15, 49]. These two features are not on the same scale, so you have to cast both of them into the same range, say [-1, +1]; this eliminates the influence of production over temperature and gives both of them an equal chance.
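A minimal sketch of that mapping (the helper name is mine): a linear min-max rescaling into [-1, +1], applied to the two example features above:

```python
import numpy as np

def minmax_scale(x, lo=-1.0, hi=1.0):
    """Linearly map x from [x.min(), x.max()] into [lo, hi]."""
    return lo + (hi - lo) * (x - x.min()) / (x.max() - x.min())

olives = np.array([5000.0, 20000.0, 55000.0, 90000.0])   # tons
temps  = np.array([-15.0, 0.0, 25.0, 49.0])              # degrees Celsius

print(minmax_scale(olives))   # both features now share the range [-1, 1]
print(minmax_scale(temps))
```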
On the other hand, gradient descent, the optimization algorithm behind backpropagation in neural networks, converges faster with normalized data.
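A small illustrative sketch of that effect (the synthetic data and setup are my own): fitting a linear model by batch gradient descent on raw versus min-max-scaled features. With one feature on a huge scale, the step size must be tiny for the iteration to stay stable, so after the same number of steps the raw fit is still poor:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0.0, 1.0, n)           # small-scale feature
x2 = rng.uniform(5000.0, 90000.0, n)    # large-scale feature (e.g. tons)
y = 3.0 * x1 + 0.0005 * x2 + rng.normal(0.0, 0.1, n)

def gd_mse(X, y, lr, steps=2000):
    """Batch gradient descent on mean squared error; returns the final MSE."""
    Xb = np.column_stack([np.ones(len(X)), X])   # bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        grad = 2.0 * Xb.T @ (Xb @ w - y) / len(y)
        w -= lr * grad
    return np.mean((Xb @ w - y) ** 2)

X_raw = np.column_stack([x1, x2])
X_scl = (X_raw - X_raw.min(axis=0)) / (X_raw.max(axis=0) - X_raw.min(axis=0))

# The raw x2 scale forces a tiny learning rate (larger ones diverge),
# so the model barely learns within the step budget.
print("raw MSE   :", gd_mse(X_raw, y, lr=1e-10))
print("scaled MSE:", gd_mse(X_scl, y, lr=0.3))
```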
If all features already lie in the same range, then no normalization is required. One drawback of normalization is that when the data contain outliers (anomalies), most of the data get squeezed into a very small range while only the outliers lie near the boundaries (see the sketch after the next paragraph).
The z-score is a standardization method also used for scaling data, and it is useful when the data contain outliers. It transforms the data to have zero mean and a standard deviation of 1.
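A small sketch of both points (the data values are mine): with one outlier, min-max confines everything to [0, 1] and piles the bulk of the data up near 0, while z-scores are unbounded and keep the bulk at an interpretable distance from the mean:

```python
import numpy as np

x = np.array([10.0, 12.0, 11.5, 9.0, 13.0, 500.0])   # one outlier

minmax = (x - x.min()) / (x.max() - x.min())          # into [0, 1]
zscore = (x - x.mean()) / x.std()                     # zero mean, unit std

print("min-max:", np.round(minmax, 3))   # bulk squeezed into ~[0, 0.008]
print("z-score:", np.round(zscore, 3))   # bulk near -0.45, outlier at ~2.24
```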