If we take a data set, what are the primary steps for checking the normality assumption and for handling outliers? Suppose a huge number of outliers occur in the data set. How do we determine the limits for identifying outliers?
Well, based on common statistical practice, for samples of more than 30 observations a formal normality test is not required; instead, normality can be assessed graphically, for example with a normal probability plot. Also, in cases similar to your question, the most extreme outliers can be removed from the dataset; data more than 3-4 times the average of the dataset can be ignored.
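For what it's worth, here is a minimal Python sketch of that rule of thumb. It assumes positive-valued data (a multiply-the-average cutoff makes little sense otherwise), and the factor k = 3 and the example values are arbitrary illustrations; deviation-based rules are more common in practice:

```python
import numpy as np

def drop_beyond_mean_multiple(x, k=3.0):
    """Drop values larger than k times the dataset average.

    Mirrors the rule of thumb above; assumes positive-valued data,
    and k = 3 to 4 is a convention, not a universal standard.
    """
    x = np.asarray(x, dtype=float)
    cutoff = k * x.mean()
    return x[x <= cutoff], x[x > cutoff]

# Hypothetical example: one wild value among otherwise moderate data
data = np.array([10.0, 11.0, 9.0, 10.5, 9.5, 60.0])
kept, dropped = drop_beyond_mean_multiple(data, k=3.0)
print("kept:", kept)        # [10.  11.   9.  10.5  9.5]
print("dropped:", dropped)  # [60.]
```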
This depends upon your purpose, and how you determine what you consider to be an outlier. An outlier is either a datum that includes an unacceptable amount of measurement error, or one that was out of scope, that is, it did not belong to the data population with which you intended to work. If you look for cases in the tails of your distribution, you may incorrectly assume good data are outliers, and leave in 'true' outliers that are not so far out in the tails, but that may be unavoidable to some extent. There is no substitute for careful data collection, but if, say, a number were obviously recorded in the wrong units, then that one 'bad' data point may overwhelm any results, and one could hardly let it stand.
Note that normality most commonly arises when the central limit theorem comes into play, or when looking at estimated residuals from regression, but even then it is probably not often crucial. Many different distributions occur naturally; it depends upon your application.
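As a concrete illustration of checking residuals rather than raw data, here is a small Python sketch using simulated data; the model, seed, and sample size are all arbitrary assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated regression: it is the residuals, not the raw y, that we
# usually examine for normality (hypothetical data for illustration).
x = np.linspace(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Shapiro-Wilk on the residuals; a normal probability (Q-Q) plot is
# often more informative than the p-value alone.
stat, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")

(osm, osr), (qq_slope, qq_int, r) = stats.probplot(residuals, dist="norm")
print(f"Q-Q plot correlation r = {r:.3f}")  # near 1 suggests normality
```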
In the finite-population energy establishment surveys on which I worked, if data editing, using various methods including scatterplots to compare data sets, turned up very suspicious data, the establishment was contacted and asked to confirm. If "good" data could not be obtained, then imputation or reweighting was important; otherwise one would effectively be substituting a zero, which would bias estimated totals downward. I suppose that in a laboratory experiment drawing from an effectively infinite population, removing an outlier would be like doing one less experiment, but it may have been run under a unique condition. In any case, the real sample size is reduced by one, standard errors of parameter estimates grow, and population standard deviations, though constant, are estimated less accurately. Also, if a data point considered an outlier was collected badly under conditions such that you cannot consider it missing at random, then discarding it will bias results; perhaps there was a reason that datum was hard to observe. Stratifying or categorizing your data by characteristics that are common within each group, but different from others, can dampen the impact of such bias.
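To make the zero-substitution point concrete, here is a small Python sketch with hypothetical numbers; the sample, population size, and the simple weight adjustment are illustrative assumptions, not the survey's actual method:

```python
import numpy as np

# Hypothetical establishment sample: n = 5 drawn from N = 50 units,
# equal weights N/n = 10. One response (value 400) is judged unusable.
N = 50
sample = np.array([120.0, 95.0, 130.0, 110.0, 400.0])
base_weight = N / sample.size

est_full = base_weight * sample.sum()

# Dropping the unit but keeping the old weights effectively substitutes
# a zero for it, biasing the estimated total downward.
kept = sample[:-1]
est_zero_substitute = base_weight * kept.sum()

# Reweighting: spread the dropped unit's weight over the remaining
# respondents (one simple adjustment; not the only possibility).
adjusted_weight = N / kept.size
est_reweighted = adjusted_weight * kept.sum()

print(est_full, est_zero_substitute, est_reweighted)
# 8550.0  4550.0  5687.5  -- the zero-substitute total is most deflated
```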
Basically, if outliers occur at random, and can be effectively recognized, the only problem remaining is your smaller sample size. So you are "limited" in that your sample size is smaller, and you have to assume outliers occur at random, or else group data more homogeneously. You may want to research "response propensity" groups.
You could try an experiment: Identify outliers at three levels based upon how suspicious they are, even if all you can do is to look in the tails at three different cuts. See how much difference it makes to whatever you do with your data. (Report this information in an appendix, perhaps.) This may give you some basis upon which to judge.
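A rough Python sketch of that experiment might look like the following; the simulated data and the three percentile cuts are arbitrary assumptions, chosen only to show how the summary statistics move as the cut tightens:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data with a contaminated right tail
data = np.concatenate([rng.normal(50, 5, 200), rng.normal(120, 10, 6)])

# Three increasingly strict tail cuts (percentile levels are arbitrary
# choices for the experiment, not recommended defaults).
for pct in (99.5, 97.5, 95.0):
    lo, hi = np.percentile(data, [100 - pct, pct])
    trimmed = data[(data >= lo) & (data <= hi)]
    print(f"cut at {pct:5.1f}th pct: n={trimmed.size:3d}, "
          f"mean={trimmed.mean():6.2f}, sd={trimmed.std(ddof=1):5.2f}")
```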
Quite a difficult question. I think even a small number of outliers can mislead you into fitting the wrong probability distribution, so the question is not only about normality but also about correctly determining how the data are distributed. If you are suspicious about data quality, and you expect a particular type of probability distribution but suddenly find extreme-value distributions as the best fit, I recommend running an outlier detection test first, e.g., a lag plot, the Median Absolute Deviation test, or Grubbs's test. We developed one that uses the Kullback-Leibler distance as well.
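For example, here is a minimal Python sketch of the Median Absolute Deviation approach, using the common modified z-score; the 0.6745 scaling and the 3.5 cutoff follow the usual Iglewicz-Hoaglin recommendation, which is an assumption you may wish to tune:

```python
import numpy as np

def mad_outliers(x, threshold=3.5):
    """Flag points whose modified z-score exceeds `threshold`.

    Modified z-score: 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation. The 3.5 cutoff is the conventional
    Iglewicz-Hoaglin choice, not a universal constant.
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:
        return np.zeros(x.size, dtype=bool)  # degenerate case: no spread
    modified_z = 0.6745 * (x - med) / mad
    return np.abs(modified_z) > threshold

data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0])
print(data[mad_outliers(data)])  # -> [25.]
```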