In one article, the authors say that classifier accuracy should improve after applying data scaling. Why doesn't that hold for my experimental results?
Sometimes that happens, and there are many hypotheses you could consider. But before that, I would suggest a couple of checks.
1) Check feature scaling parameters
Basically, there are two common types of feature scaling:
mean normalization: [X - mean(X)] / [max(X) - min(X)]
standardization: [X - mean(X)] / std(X)
where X should be the whole data set. Both usually work if they are applied to the whole data set and not to the train set / test set separately. Some people compute the scaling parameters independently on the train set and on the test set: that doesn't work, because mean(X) != mean(Xtrain) != mean(Xtest) and std(X) != std(Xtrain) != std(Xtest), so the two sets end up on different scales.
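A minimal sketch of the difference, assuming scikit-learn and made-up data (the same logic applies to any scaling implementation): the key point is that train and test must share the same scaling parameters, otherwise the same raw value maps to different scaled values.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(loc=50, scale=10, size=(100, 1))
X_test = rng.normal(loc=52, scale=12, size=(20, 1))

# Consistent: one set of parameters (mean, std) reused for both sets.
scaler = StandardScaler().fit(X_train)
X_train_ok = scaler.transform(X_train)
X_test_ok = scaler.transform(X_test)

# Inconsistent: each set gets its own mean/std, so the scales don't match
# and the classifier sees test points shifted relative to training points.
X_train_bad = StandardScaler().fit_transform(X_train)
X_test_bad = StandardScaler().fit_transform(X_test)

print(scaler.mean_, StandardScaler().fit(X_test).mean_)  # different parameters
```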
2) Check you're not dealing with categorical features
Feature scaling is a transformation usually applied to continuous features (e.g. apartment square meters). If you're dealing with categorical features (e.g. male/female), feature scaling is not recommended. In that case I would suggest a categorical encoding (e.g. one-hot encoding) instead.
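A sketch of mixing the two, assuming pandas + scikit-learn and hypothetical column names: the continuous column is scaled while the categorical one is one-hot encoded.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

df = pd.DataFrame({
    "square_meters": [45.0, 80.0, 120.0, 60.0],   # continuous -> scale
    "sex": ["male", "female", "female", "male"],  # categorical -> encode
})

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["square_meters"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])

X = preprocess.fit_transform(df)
print(X)
```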
3) Check whether feature scaling is recommended for the model you're using
Feature scaling is not beneficial for every model. Typical models for which it is recommended are the gradient-descent-based ones (logistic regression, linear regression, ...), KNN (especially with Euclidean distance), SVM (especially with a radial kernel) and neural networks. But this is not the case for all models: for boosting and additive trees, for example, feature scaling and/or more general transformations make no difference, and those models are also immune to the effects of predictor outliers.
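A quick way to see this yourself, sketched with scikit-learn on one of its bundled datasets (any scale-sensitive model vs. a tree ensemble would do): put the scaler inside a Pipeline for KNN and compare against the same model without scaling and against gradient boosting, which typically doesn't care.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

models = {
    "KNN + scaling": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "KNN raw": KNeighborsClassifier(),
    "Gradient boosting": GradientBoostingClassifier(),  # scaling usually irrelevant here
}

for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```

Putting the scaler in a Pipeline also guarantees the point from check 1: the scaling parameters are fit only on the training folds and then reused on the validation fold, so both sides always share the same parameters.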