It was rather successful, precisely because it was significant. It tells you that you do in fact have enough data to confidently conclude that the residuals were not sampled from a normal distribution. That is what this test is for. It would be less useful if the test were not significant, in which case the only "conclusion" would be that the data are insufficient to show that the residuals are not samples from a normal distribution.
However, I assume you used the test for something it cannot do: to see whether the residuals were sampled from a normal distribution (and, if so, to justify an analysis that assumes normally distributed errors). It does not make sense to check this with a hypothesis test. Such a test will either reject H0 or not, and if it does not, the only information you have is that your data are inconclusive regarding H0. Non-significant results must not be interpreted as support for H0, and claiming that the errors are normal because the test was not significant is exactly such a no-go.
You should understand the character of the response (is it a proportion, a count, an amount or concentration, a rate, ...) and think of a sensible distribution model. You should have comprehensible logical and scientific arguments for it. And you must assume (and argue) that the functional model is flexible enough to approximate the data reasonably well; if you assume a linear relationship but the "true" relationship is non-linear, the model may fit so poorly that the distribution of the residuals is meaningless anyway.
If you then see that the carefully chosen model does not fit the data well and/or the residuals clearly do not behave as expected, you should go back and rethink the model (functional and/or stochastic part).
My data are the number of tourists, and the model is a SARIMA model.
When I checked the residuals of the model (with qqnorm and qqline) everything looked fine, except that the skewness was 0.2092527 and the Shapiro-Wilk normality test gave p-value = 6.657e-07.
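Just to make that skewness number concrete, here is a small Python sketch of how a sample skewness like 0.209 is computed. The residuals below are simulated stand-ins, not the poster's actual SARIMA residuals:

```python
import random

def sample_skewness(x):
    """Biased sample skewness g1 = m3 / m2**1.5 (third standardized moment)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5

random.seed(1)
# Hypothetical stand-in for SARIMA residuals: normal noise with a
# small exponential component, giving a mild right skew.
residuals = [random.gauss(0, 1) + 0.1 * random.expovariate(1) for _ in range(300)]
print(f"sample skewness: {sample_skewness(residuals):.3f}")
```

A skewness near 0.2 is a fairly mild asymmetry; the point of the thread is that whether it matters depends on the analysis, not on the Shapiro-Wilk p-value alone.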
Why would you even expect that the number of tourists could be normally distributed? Counts are discrete and bounded below at 0; the normal distribution is unbounded over all real values (it has non-zero probability for negative values). There is considerable theory on distribution models for count variables. Rather than examining whether you have enough data to reject the null hypothesis that the variable is normally distributed, it would be much more interesting to get an idea of the mean/variance relationship (and possibly whether there is zero-inflation).
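To illustrate what the mean/variance relationship looks like in practice, here is a small Python sketch using made-up monthly tourist counts (purely hypothetical numbers, not the poster's data). A dispersion index near 1 would be consistent with a Poisson model; a much larger value points to overdispersion (e.g. a negative binomial model):

```python
from statistics import mean, pvariance

# Hypothetical monthly tourist counts with a strong seasonal pattern.
counts = [120, 95, 210, 480, 950, 1400, 1600, 1550, 700, 300, 150, 110]

m = mean(counts)
v = pvariance(counts)
dispersion = v / m  # ~1: Poisson-like; >> 1: overdispersed
zero_fraction = sum(c == 0 for c in counts) / len(counts)

print(f"mean={m:.1f}, variance={v:.1f}, dispersion index={dispersion:.1f}")
print(f"fraction of zeros: {zero_fraction:.2f}")
```

Checking the zero fraction alongside the dispersion index is a crude first look at whether a zero-inflated model might be worth considering.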
The Shapiro-Wilk test is a statistical test used to assess whether a given dataset follows a normal distribution. If the Shapiro-Wilk test indicates a failure to meet the assumption of normality, it suggests that the data significantly deviates from a normal distribution.
When the normality assumption is violated, there are a few options to consider depending on the specific context and goals of your analysis:
1. Assess the sample size: The Shapiro-Wilk test tends to be more sensitive to deviations from normality with smaller sample sizes. If your sample size is small, it's worth considering whether the departure from normality is substantial enough to affect your analysis. In some cases, deviations may not have a significant impact.
2. Consider the nature of your data: Normality assumptions are more crucial in certain statistical analyses, such as parametric tests like t-tests or ANOVA. However, some analyses, like non-parametric tests or robust statistical methods, are more robust to deviations from normality. If your research question allows for alternative analysis approaches, you may consider exploring these options.
3. Transform the data: If the deviations from normality are not extreme, you might consider applying data transformations to make the data more normally distributed. Common transformations include logarithmic, square root, or reciprocal transformations. Transformations can help stabilize variances or make the data conform more closely to a normal distribution. However, keep in mind that the interpretation of results will be based on the transformed scale.
4. Utilize non-parametric tests: If the normality assumption cannot be met or if you prefer to avoid data transformations, non-parametric tests can be used. Non-parametric tests, such as the Mann-Whitney U test or the Kruskal-Wallis test, do not rely on the assumption of normality and are suitable for analyzing data that does not meet this assumption.
5. Seek expert advice: If you are unsure about how to proceed or need guidance specific to your research or analysis, consulting with a statistician or an expert in your field can be beneficial. They can provide tailored recommendations based on the nature of your data and research question.
Remember that statistical tests are tools to aid in data analysis, and the significance of deviations from normality should be considered in the context of your specific study design, sample size, and research objectives.
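To make point 3 (transformations) concrete, here is a small Python sketch with simulated, strongly right-skewed positive data (lognormal draws, chosen purely for illustration), showing how a log transformation can pull the distribution toward symmetry:

```python
import math
import random

def skew(x):
    """Biased sample skewness g1 = m3 / m2**1.5."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5

random.seed(42)
# Strongly right-skewed positive data, as one might see for amounts or concentrations.
raw = [random.lognormvariate(0, 1) for _ in range(2000)]
logged = [math.log(v) for v in raw]

print(f"skewness raw:    {skew(raw):.2f}")
print(f"skewness logged: {skew(logged):.2f}")
```

As the list item notes, any inference is then on the transformed (here: log) scale, which changes the interpretation of the results.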
Assess the sample size: The Shapiro-Wilk test tends to be more sensitive to deviations from normality with smaller sample sizes. If your sample size is small, it's worth considering whether the departure from normality is substantial enough to affect your analysis. In some cases, deviations may not have a significant impact.
It's the other way around: Tests of normality become too sensitive to small departures from normality as the sample size gets larger.
Bruce Weaver, you are correct in pointing this out. But I think Chuck A Arize meant it in a different way, like:
"The Shapiro-Wilk test tends to be more sensitive to deviations from normality with smaller sample sizes than other normality tests."
I added the last part ("than other normality tests"). This is the usual statement about the Shapiro-Wilk test: it is more powerful than other normality tests, and this is particularly relevant when the sample size is small.
After all, using hypothesis tests to check the suitability of assumptions is flawed and illogical. The purpose of a test is not to find out whether or not H0 is true, but to find out whether the sample size is large enough to interpret the direction of the difference between the estimate and the hypothesized value of the (test) statistic. Being able to recognize a difference does not automatically imply that the (still unknown!) true difference is relevant for the problem at hand. And being unable to recognize a difference does not automatically imply that the (still unknown!) true difference is irrelevant.
This leads to your final statement: small samples may not allow you to see relevant departures from normality, and large samples will always allow you to state that the sample is drawn from a non-normal distribution with high confidence, but with no indication if the kind and amount of departure is relevant. A classical lose-lose situation, so to say :)
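This lose-lose situation is easy to demonstrate. The Python sketch below uses a crude skewness-based normality check (z = g1 / sqrt(6/n), where sqrt(6/n) is the approximate large-sample standard error of the sample skewness under normality) rather than Shapiro-Wilk, but the point is the same: a fixed, mildly skewed population looks unremarkable at small n and is rejected with ease at large n:

```python
import math
import random

def skew(x):
    """Biased sample skewness g1 = m3 / m2**1.5."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5

def draw(n, rng):
    # Fixed, mildly right-skewed population: normal noise plus a small
    # exponential component (population skewness is roughly 0.18).
    return [rng.gauss(0, 1) + 0.5 * rng.expovariate(1) for _ in range(n)]

rng = random.Random(0)
zs = {}
for n in (30, 300, 30000):
    g1 = skew(draw(n, rng))
    zs[n] = g1 / math.sqrt(6 / n)  # rough z-score for departure from symmetry
    print(f"n={n:>6}: skewness={g1:+.3f}, z={zs[n]:+.1f}")
```

The amount of skewness in the population never changes; only the sample size does. Whether a skewness of ~0.2 is *relevant* is a question the test cannot answer.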
Bruce Weaver, I have no clue whether Chuck was really making this distinction :) I just think that it can be, and probably was, meant that way. I hope my nitpicking motivates people to be very careful in using correct expressions/wordings (ha, ha: how is this phrased in English correctly?). It's so damn easy to be misunderstood and to unwillingly promote misconceptions in statistics.
Jochen Wilhelm and Bruce Weaver, I think working out what Chuck A Arize might have meant could be a lost cause. I put his answer into https://gptzero.me/ and it replied "Your text is likely to be written entirely by AI." Maybe he can clarify this.
You'd hope that people who use LLMs would at least recognize how often they make errors, proofread the output (and, of course, disclose that they used them), but I suspect people often use them precisely for topics they do not understand.