My data failed the Shapiro-Wilk test. Skewness appears normal, but kurtosis does not. Please let me know if I can transform the data in some way so that I can run a parametric test like ANOVA.
Non-parametric tests are the ones you can use when your data are not normal. You can run both a parametric and a non-parametric test on the same data and compare them; sometimes they give the same results.
ANOVA is more sensitive to skew, but relatively robust with regard to kurtosis. If your data meet the equality of distributions assumptions, then I would go ahead and use ANOVA.
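To see how far skew and kurtosis actually are from the normal reference values (0 for both, when kurtosis is expressed as excess kurtosis), they can be computed directly. This is only a minimal sketch using the simple moment formulas; the function name `sample_moments` is made up for illustration, and statistical packages often apply small-sample corrections that give slightly different numbers.

```python
def sample_moments(values):
    """Return (skewness, excess_kurtosis) from the uncorrected moment formulas."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (divided by n, no bias correction).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    # A normal distribution has kurtosis 3, so subtract 3 to get excess kurtosis.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Hypothetical, perfectly symmetric data: skewness is zero, kurtosis is not.
skew, kurt = sample_moments([1, 2, 3, 4, 5])
print(skew, kurt)
```

A symmetric but flat sample like this one has zero skew and negative excess kurtosis, which is the kind of departure ANOVA tolerates comparatively well.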
If the data are not normally distributed, there are several solutions, including transformations or nonparametric analysis, which does not require the data to be normally distributed.
First, consider which test you use to assess normality; the available criteria vary with the statistical package. For example, Statistics uses Kolmogorov-Smirnov-Lilliefors (KS-L) and Shapiro-Wilk (SW). KS-L is less demanding than SW, so depending on the needs and the level of rigor of your study you can use either criterion; the two are scarcely used. You must also consider the dispersion statistics from the descriptive statistics, where the CV, SD and SE carry more weight than the rest. For the CV, the recommendation depends on the origin of the data (percentages specifically): if the variables are not percentages, the CV should be below 30%; if they are percentages, it should fall between 30 and 70%. So:
If the statistics are outside the mentioned ranges and the chosen test shows that the data are not normally distributed, then we must transform.
If the chosen test shows a normal distribution but the statistics are outside the mentioned ranges, we must transform.
If the statistics are within the ranges but the test does not show a normal distribution, then proceeding with the ANOVA on the basis of the descriptive statistics could be considered.
In all cases, one must run a homoscedasticity test (Bartlett) and confirm homoscedasticity.
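The dispersion statistics and the Bartlett check described above can be sketched roughly as follows. This assumes the usual textbook definitions (sample SD with n - 1, SE = SD / sqrt(n), CV = 100 * SD / mean) and the standard Bartlett statistic; the function names are hypothetical, and only the statistic and its degrees of freedom are returned, so the result still has to be compared against a chi-square critical value or handed to a statistics package for a p-value.

```python
import math
import statistics


def dispersion_summary(values):
    """Return SD, SE and CV (%) for one variable. Assumes a non-zero mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)             # standard error of the mean
    cv = 100.0 * sd / mean             # coefficient of variation in percent
    return {"sd": sd, "se": se, "cv_percent": cv}


def bartlett_statistic(groups):
    """Return (statistic, degrees_of_freedom) for Bartlett's test.

    Assumes every group has at least two values and a non-zero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [statistics.variance(g) for g in groups]
    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(s2) for n, s2 in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction, k - 1


# Hypothetical example data: three groups with identical spread.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(dispersion_summary(groups[0]))
print(bartlett_statistic(groups))
```

A statistic close to zero means the group variances are nearly equal; large values relative to the chi-square distribution with k - 1 degrees of freedom point to heteroscedasticity.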
The transformation is chosen according to the type of values: for continuous values the square root, sqrt(x), is used; for extreme values (e.g. bacteria counts) the natural logarithm, ln(x); and for values expressed as percentages the arcsine.
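For illustration, the three transformations could look like this in Python. The function names are made up, and one assumption needs flagging: for percentages this sketch uses the common arcsine square-root form, asin(sqrt(p / 100)), whereas the post above only says "arcsine".

```python
import math


def sqrt_transform(values):
    """Square-root transform; values must be non-negative."""
    return [math.sqrt(v) for v in values]


def log_transform(values):
    """Natural-log transform; values must be strictly positive."""
    return [math.log(v) for v in values]


def arcsine_transform(percentages):
    """Arcsine square-root transform for values given in percent (0 to 100).

    Assumption: the arcsine is applied to the square root of the proportion.
    """
    return [math.asin(math.sqrt(p / 100.0)) for p in percentages]


# Hypothetical example values.
print(sqrt_transform([4, 9, 16]))
print(log_transform([1.0, 10.0, 100.0]))
print(arcsine_transform([0, 25, 50, 100]))
```

After transforming, the normality and homoscedasticity checks should be run again on the transformed values, not the raw ones.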
Apologies for the length of the explanation, but it is a very interesting topic with several aspects to consider in order to obtain reliable results ... the last resort is the use of non-parametric statistics.
The Shapiro-Wilk test assesses whether the population your sample comes from is normal, not the sample itself. Try taking the log or square root of the data set. If the transformed data still fail the test, use a non-parametric test such as the Friedman or Kruskal-Wallis H test, after checking their assumptions. If skewness is still normal, just move on. Best!!
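As a rough idea of what the Kruskal-Wallis H test does under the hood, here is a minimal sketch: all observations are ranked together, and H measures how far the group rank sums are from what equal distributions would give. The helper names are hypothetical; the sketch includes the usual tie correction but returns only H and its degrees of freedom, so a p-value still needs a chi-square table or a statistics package.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, degrees_of_freedom). Assumes not all values are identical."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied runs.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1.0 - ties / (n ** 3 - n)
    return h, len(groups) - 1


# Hypothetical example data: three clearly separated groups.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Because the test works on ranks only, it gives the same H for the raw data and for any monotone transformation of it (log, square root), which is why it is a natural fallback when transformations do not rescue normality.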
If your data has failed the normality test, you can transform the data using a suitable function, such as log, square root, or cube root, to make it more normally distributed. You can also perform a nonparametric test, which does not assume a specific distribution for the population. If the sample is large enough, you can proceed with the analysis as many hypothesis tests are robust to non-normality with large samples. You may also look at the nonparametric version of the test you are interested in running.