In my opinion, Z-score. This method preserves the range (maximum and minimum) and introduces the dispersion of the series (standard deviation / variance). If your data follow a Gaussian distribution, they are converted into an N(0,1) distribution, and comparison between series (probability calculations) becomes easier.
Please refer to: http://www.thinkmind.org/index.php?view=article&articleid=sensorcomm_2011_5_40_10238. This paper gives an idea of how z-score normalization is applied to biometric features.
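Just to illustrate the idea in code (not from the thread or the paper), here is a minimal z-score sketch assuming NumPy and invented example values:

```python
import numpy as np

def z_score(x):
    """Standardize a 1-D series to zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=0)

# Two series on very different scales become directly comparable after standardization.
a = np.array([10.0, 12.0, 14.0, 18.0])
b = np.array([1000.0, 1100.0, 1300.0, 900.0])
print(z_score(a))
print(z_score(b))
```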
I've tried Z-score and Min-Max, and Z-score is beneficial for my application. I think the problem is that my data matrix is sparse and lots of fields are zero. I'm wondering whether there is another normalization method, preferably one that produces the normalized data in a predefined range.
The form of the discriminant function will also be important. A binary classification tree employing univariate splits has no need of normalization at all. The need for, and type of, normalization depends entirely on the regression, that is, the optimization procedure, that you use to fit the discriminant function.
I found a suitable method experimentally: I've used Z-score and Min-Max in two different applications. Dear James, you are right, and I found appropriate normalization methods for different discriminant functions; in one application I found Min-Max more effective, and in another I found Z-score beneficial.
The type of normalization depends on the model that the data is fed to, so there is no universally best approach.
TF-IDF is a common attribute weighting scheme in text mining applications, as well as in image analysis (when dealing with bags of visual words, for instance).
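As a quick illustration only (the thread does not mention any library), TF-IDF weighting can be computed with scikit-learn's TfidfVectorizer; the toy documents below are made up:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "normalization of sparse text features",
    "min max normalization versus z score",
    "tf idf weighting for text mining",
]

vectorizer = TfidfVectorizer()          # default settings: l2-normalized tf-idf rows
X = vectorizer.fit_transform(docs)      # sparse matrix, one row per document
print(vectorizer.get_feature_names_out())
print(X.toarray().round(2))
```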
As for 0-1, Z-score and the like - we have recently observed that they can either amplify or reduce some negative aspects of the curse of dimensionality (with regard to distance concentration and related phenomena) - so you should always take great care when applying them to (sparse) high-dimensional data sets.
Thank you, Nenad. Yes, I implemented my algorithm without these normalizations and got bad results, so I decided to perform normalization. But your comment is interesting, because in some cases, for example when I use Min-Max normalization and then apply a dimensionality reduction technique (like PCA), the final data are unreasonable: just the first 2 dimensions (the original data were 40-D) account for nearly 99 percent of the variance!
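One way to inspect this kind of behaviour is a sketch like the one below, assuming scikit-learn (not mentioned in the thread) and random placeholder data, which simply compares the variance captured by the first two PCA components under Min-Max versus z-score scaling:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
X = rng.random((200, 40))               # placeholder 40-dimensional data

for name, scaler in [("min-max", MinMaxScaler()), ("z-score", StandardScaler())]:
    Xs = scaler.fit_transform(X)
    pca = PCA(n_components=2).fit(Xs)
    print(name, "first 2 components explain",
          round(pca.explained_variance_ratio_.sum() * 100, 1), "% of variance")
```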
If you have a PHYSICALLY NECESSARY MAXIMUM (as in the number of voters for different parties in an election, which cannot exceed the total number of voters), the best normalization method is dividing by that MAX, so that your data span the 0-1 interval. But it is mandatory that you have a REAL AND STABLE MAXIMUM (the same considerations hold for Min-Max); otherwise this procedure is very dangerous, given the hyperbolic character of the ratio (having at the denominator something that can change, even slightly, without control is a curse!).
In all the other cases, z-scores clearly depend on the choice of an appropriate reference set to determine the mean and standard deviation; but once this reference set is in your hands, they allow you to judge immediately the relevance of a single observation (wow, it is more than 3!). Moreover, the use of z-scores allows for a very straightforward elimination of systematic errors and drifts (by the way, if you have an uneliminable 'day effect' in your experimentation, no problem: you plan the same proportion of 'treated' and 'control' samples for each day and normalize by the mean and SD of that day).
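A minimal sketch of that 'day effect' idea, i.e. per-day standardization, assuming pandas and invented column names (`day`, `value`) and numbers:

```python
import pandas as pd

df = pd.DataFrame({
    "day":   ["mon", "mon", "mon", "tue", "tue", "tue"],
    "value": [10.0, 12.0, 14.0, 100.0, 110.0, 120.0],
})

# Normalize each measurement by the mean and SD of its own day,
# so day-to-day drift is removed before pooling the data.
df["z_within_day"] = df.groupby("day")["value"].transform(
    lambda v: (v - v.mean()) / v.std(ddof=0)
)
print(df)
```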
It depends on the objectives of the task. For example, for neural networks, Min-Max normalization is recommended for the activation functions; to avoid saturation, Basheer & Hajmeer (2000) recommend the range 0.1 to 0.9.
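A small sketch of Min-Max scaling into an arbitrary interval such as [0.1, 0.9], using plain NumPy (the helper name and bounds below are just illustrative):

```python
import numpy as np

def min_max_scale(x, lo=0.1, hi=0.9):
    """Linearly map x from its observed [min, max] onto the interval [lo, hi]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return lo + (x - x_min) * (hi - lo) / (x_max - x_min)

print(min_max_scale([3.0, 7.0, 11.0, 15.0]))   # values fall between 0.1 and 0.9
```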
Another possibility is to use the Box-Cox transformation plus a constant to avoid the problem of the zeros.
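For instance, with SciPy (a sketch only; the shift constant below is arbitrary and would need to be chosen for the data at hand):

```python
import numpy as np
from scipy import stats

x = np.array([0.0, 0.0, 1.5, 3.2, 7.8, 12.1])   # sparse series containing zeros

shift = 1.0                                      # constant added so all values are > 0
x_bc, lam = stats.boxcox(x + shift)              # Box-Cox requires strictly positive input
print("estimated lambda:", round(lam, 3))
print(x_bc.round(3))
```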
Thank you all for your helpful and supportive comments. I have finally decided to use Min-Max normalization because of some other issues related to my research. In addition, I find it more beneficial in practice.
In fact, I need some clarification about this: I intend to work with a very large data set for which I would prefer the min-max normalization technique, but I still want to be doubly sure that it is better suited to this kind of situation.
It depends on the aims of the study and the nature of the data. In the presence of a physical, inescapable maximum limit (e.g. in an election the number of voters for a candidate cannot exceed the total number of voters), the best (and most natural) normalization method is to divide by that physical maximum (so we can compare election results in terms of relative frequency between a metropolis and a small village).
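As a toy sketch of normalizing by a known physical maximum (the vote counts below are invented):

```python
# Votes for one candidate and total registered voters in two places.
votes      = {"metropolis": 420_000, "village": 310}
electorate = {"metropolis": 1_000_000, "village": 500}

# Dividing by the physical maximum (total voters) gives comparable 0-1 relative frequencies.
for place in votes:
    share = votes[place] / electorate[place]
    print(place, round(share, 3))
```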
Similar considerations hold for dividing by a signal we consider invariant with respect to our goals and whose eventual variability comes from experimental or otherwise external reasons (e.g. dividing H1-NMR peaks by the signal relative to water, or P-NMR peaks by the free-phosphorus signal). This works well in the above cases (water exceeds by many orders of magnitude the concentration of any other molecule) but generates a lot of (unpredictable and potentially very harmful) errors in less straightforward cases (e.g. dividing by the signal of a 'housekeeping' gene: are you so sure it CANNOT CHANGE FOR BIOLOGICALLY RELEVANT REASONS?)...
Classical z-scores are very good and very powerful and (apparently) do not require any hard decision about the physical (or semantic) nature of the system, but they rely heavily on the ability to sample A TRULY RANDOM (or control, or healthy, depending on the case) reference population to use as the basis for normalization, which in many cases is a controversial task...
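A minimal sketch of z-scoring new observations against a reference (control) population rather than against each sample's own statistics, assuming NumPy and invented numbers:

```python
import numpy as np

reference = np.array([4.8, 5.1, 5.0, 4.9, 5.2, 5.0])   # control/'healthy' reference set
new_obs   = np.array([5.1, 6.4, 3.2])                   # new observations to judge

mu, sigma = reference.mean(), reference.std(ddof=1)     # statistics of the reference set only
z = (new_obs - mu) / sigma
print(z.round(2))            # |z| > 3 would flag a clearly unusual observation
```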
Choosing a proper data normalization prior to analysis can be quite difficult. We suggest employing analysis tools that are independent of the chosen normalization, which means they do not need any data normalization at all. Have a look at our zero-sum regression publications.
Article Reference point insensitive molecular data analysis
Article Scale-Invariant Biomarker Discovery in Urine and Plasma Meta...
You might want to consider the Proportion of Maximum Scaling (POMS) method. Here is a link to a good resource: https://www.frontiersin.org/articles/10.3389/fpsyg.2015.01389/full. The article is attached.
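POMS rescales by the theoretically possible minimum and maximum of the measurement scale rather than the observed ones; a short sketch (the 1-to-5 scale bounds below are just an example):

```python
import numpy as np

def poms(x, scale_min, scale_max):
    """Proportion of Maximum Scaling: map scores onto [0, 1] using the
    theoretical bounds of the measurement scale, not the observed min/max."""
    x = np.asarray(x, dtype=float)
    return (x - scale_min) / (scale_max - scale_min)

# Items answered on a 1-5 Likert scale.
print(poms([1, 3, 4, 5], scale_min=1, scale_max=5))   # -> [0.   0.5  0.75 1.  ]
```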