Kindly guide me on how I can check the normality of my data. I collected data using a 60-item scale (responses recorded on a 5-point Likert scale) from 350 respondents. Now I am confused about whether I have to check the normality of the data for each item or not.
Hi Sadia, some tests (most parametric tests, e.g. analyses of variance, linear regressions, etc.) assume a Gaussian distribution of continuous data. However, most of them are quite robust to violations of this assumption.
A very conservative approach would be to apply the Kolmogorov-Smirnov test. For this, you open "Analyze" --> "Nonparametric Tests" --> "Legacy Dialogs" --> One-Sample Kolmogorov-Smirnov Test ("1-Sample K-S") and enter the variables you want to assess for a Gaussian (normal) distribution. As test distribution, you tick "Normal" and press "OK". If, in the test statistics (output file --> "NPar Tests"), the Asymp. Sig. is statistically significant (p<0.05), the test rejects the hypothesis that your data follow a Gaussian distribution. So you need at least a non-significant result (p>0.05) to treat a variable as normally distributed. This is the simplest way to check for normal distribution, although it is very conservative: the deviation from a Gaussian distribution can be quite large and the test may still not reject normality. So if you check your variable by drawing a histogram ("Graphs" --> "Legacy Dialogs" --> "Histogram"), you might see that your variable is sometimes clearly skewed (to the left or to the right) although your K-S test is non-significant. However, most of the statistical tests that require a normal distribution (parametric tests) are very robust to violations of this assumption.
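For anyone who wants to see what the K-S test computes outside SPSS, here is a minimal Python sketch using only the standard library. It assumes the normal distribution is fitted with the sample's own mean and standard deviation and uses the asymptotic Kolmogorov formula for the p-value; the function names and the example item counts are made up for illustration, and the numbers need not match SPSS output exactly.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_normal(data):
    """K-S statistic D against a normal with the sample's own mean and SD."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        f = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF just before and at x.
        d = max(d, f - i / n, (i + 1) / n - f)
    return d

def ks_asymptotic_p(d, n, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n) * d
    if lam < 0.27:
        # The series converges badly here and the p-value is essentially 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))

# Hypothetical 5-point Likert item answered by 350 respondents.
item = [1] * 40 + [2] * 90 + [3] * 120 + [4] * 70 + [5] * 30
d = ks_normal(item)
p = ks_asymptotic_p(d, len(item))
print(f"D = {d:.3f}, asymptotic p = {p:.4g}")
```

Note that a single 5-point item can only take five values, so its step-shaped distribution will be far from any continuous normal curve and the test rejects normality even for a nicely symmetric item.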
Another, more visual way would be using P-P plots ("Analyze" --> "Descriptive Statistics" --> "P-P Plots"). You enter your variable, check that "Normal" is chosen under "Test Distribution" and press OK. The plots can be interpreted as follows: a variable is considered to follow a Gaussian distribution if, in the "Normal P-P Plot of [variable name]", the dots align (approximately) linearly at a 45° angle (along the continuous line drawn from 0/0 to 1/1). In contrast, in the "Detrended Normal P-P Plot of [variable]", no systematic pattern (for example a U-shape) should be visible. However, there is no test statistic for this approach.
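To make the idea behind the P-P plot concrete, here is a rough Python sketch that computes the plotted points: the observed cumulative proportion of each case against the cumulative probability expected under a fitted normal curve. I am assuming Blom's formula for the observed proportions and ignoring special handling of ties, so treat it as an illustration rather than a reproduction of SPSS; all names and example data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def pp_points(data):
    """Return (observed, expected) cumulative proportions for a normal P-P plot."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    points = []
    for rank, x in enumerate(xs, start=1):
        observed = (rank - 0.375) / (n + 0.25)  # Blom's plotting position (assumed)
        expected = fitted.cdf(x)                # proportion expected under normality
        points.append((observed, expected))
    return points

def max_deviation(points):
    """Largest vertical distance from the 45-degree line (the detrended view)."""
    return max(abs(obs - exp) for obs, exp in points)

# Roughly normal scores versus strongly right-skewed scores (both made up).
n = 100
normal_like = [NormalDist(50, 10).inv_cdf((i - 0.375) / (n + 0.25))
               for i in range(1, n + 1)]
skewed = [i ** 3 for i in range(1, n + 1)]

print(round(max_deviation(pp_points(normal_like)), 3))
print(round(max_deviation(pp_points(skewed)), 3))
```

The first value should be close to zero (the dots hug the diagonal) and the second clearly larger, which is what a systematic bend in the plot looks like in numbers.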
Using SPSS Frequencies you can ask for skewness and kurtosis. You will get a value for the statistic itself and also its standard error. Thus you can compute whether the difference from 0 is significant or not: divide the statistic by its standard error and compare the result with roughly ±1.96. The difference between mean and median is also an indicator of skewness. Graphically, you can overlay a normal curve on your frequency histogram.
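The skewness and kurtosis check can be sketched in Python as follows. I am assuming the usual bias-corrected sample formulas and their standard errors, which as far as I know are the ones SPSS reports; the function name and the example item are hypothetical.

```python
import math
from statistics import mean, stdev

def skew_kurtosis_z(data):
    """Sample skewness and excess kurtosis with standard errors and z ratios."""
    n = len(data)
    m, s = mean(data), stdev(data)
    z3 = sum(((x - m) / s) ** 3 for x in data)
    z4 = sum(((x - m) / s) ** 4 for x in data)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return {
        "skewness": skew, "se_skewness": se_skew, "z_skewness": skew / se_skew,
        "kurtosis": kurt, "se_kurtosis": se_kurt, "z_kurtosis": kurt / se_kurt,
    }

# Hypothetical item where most of the 350 respondents chose the lowest category.
item = [1] * 200 + [2] * 80 + [3] * 40 + [4] * 20 + [5] * 10
result = skew_kurtosis_z(item)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A z ratio beyond about ±1.96 suggests the skewness or kurtosis differs from 0 at the 5% level. With 350 respondents the standard errors are small, so even modest skew will come out as significant.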
Rather than checking each item, you should assemble the scale you want to analyze and assess that. Are you familiar with Cronbach's alpha? If so, you can access it under Scales: Reliability in SPSS.
Alpha only assumes normality to the extent that it relies on correlations among your items, so unless some of your variables are highly skewed, there shouldn't be any problem.
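If it helps to see what the reliability procedure computes, here is a minimal Python sketch of Cronbach's alpha from raw item scores, assuming the standard formula based on item variances and the variance of the total score. The function name and the toy data are made up; in practice there would be one row per respondent with all 60 items.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of the same length."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Toy example: 6 respondents, 4 items on a 5-point scale.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(responses), 3))
```

The summed score per respondent (`sum(row)`) is the assembled scale, and that total is what you would examine for normality rather than each of the 60 items.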
This is my second go at this because people are (somewhat naively) trying to address the issue of non-normality rather than scale construction and doing so without really considering that these are Likert scales that are being used.
In fact you want each item to have a non-normal distribution and you do not want to get rid of it. Imagine you are trying to get a good measure of disability and you have a 5-point Likert scale for each of (say) 20 measures - you want a range of easy tasks (to differentiate between people with low levels of disability) and also hard tasks (to differentiate at the top end) - these items will be desirably skewed to the left and right - this is what you want.
There is a specific form of analysis that deals with this sort of problem - item response models, in which the items are seen as nested within an individual. In my previous post I pointed to a gentle introduction - these books take it further:
Embretson, Susan E.; Reise, Steven P. (2000). Item Response Theory for Psychologists. Psychology Press. ISBN 978-0-8058-2819-1.
de Boeck, Paul; Wilson, Mark (2004). Explanatory Item Response Models: A Generalized Linear and Nonlinear Approach. Springer. ISBN 978-0-387-40275-8.
Fox, Jean-Paul (2010). Bayesian Item Response Modeling: Theory and Applications. Springer. ISBN 978-1-4419-0741-7.
These become progressively more demanding. Finally, here is a very gentle introduction to a simple form of the model - the so-called Rasch model (a 1-parameter model)
To reiterate - skewed responses are good: they are not something that needs to be cured but something to be exploited, and there are sophisticated and powerful procedures to do this.
In terms of free software, R has lots of capabilities in this area.
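As a rough illustration of why skewed items are expected, here is a small Python sketch of the Rasch (1-parameter) model for a yes/no item, where the probability of endorsing an item depends only on the difference between the person's level and the item's difficulty. The ability and difficulty values are made up, and a 5-point Likert item would need a polytomous extension (for example a rating scale or partial credit model), which this sketch does not cover.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of endorsing a dichotomous item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical person levels and two items at opposite ends of the scale.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]
for label, difficulty in [("easy item", -2.0), ("hard item", 2.0)]:
    probs = [round(rasch_probability(a, difficulty), 2) for a in abilities]
    print(label, probs)
```

Most people endorse the easy item and few endorse the hard one, so both items are skewed, yet together they separate people at the low and the high end of the trait.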
I don't think that combining Likert-scored items into a scale using Cronbach's alpha is inherently "naive" any more than it is always the case that using Item Response Theory to form a simple scale is "using a steam shovel to empty a sandbox."
Instead, the standard I would propose is to see what others in your field are doing, and if Cronbach's alpha is the most widely used approach to creating scales, then do what your peer reviewers will understand best. Alternatively, if people expect the full complexity of Item Response Theory, then you will need to master that literature.
I have no problem with keeping things simple and exploring relationships between items using techniques such as Cronbach's alpha. My comments about naivety refer to those suggesting that the issue was just normality and ignoring that Likert scales and multiple items are being used.
I also think that people need to be at least aware that there are specific techniques available for these sorts of problems and that IRT does reveal interesting things such as item difficulty/severity and discrimination as well as dimensionality.
Setting up practical guidelines on how to analyze these data is a very useful thing to do.
Dear friend Sadia, a normality test compares the shape of your sample distribution to the shape of the normal curve. The Shapiro-Wilk test is much more commonly used, and the choice depends on sample size: for small and medium samples of up to 2000 we use the S-W test, and if the sample size exceeds 2000 we use the Kolmogorov-Smirnov test. It is worth mentioning that W is the test statistic of the S-W test. To interpret the results, after choosing the test appropriate to your sample size, look at "Sig.", which stands for the significance of the test, in other words the p-value. If Sig. > 0.05, you do not reject the null hypothesis H0 (normality), which means the test is non-significant; put simply, your sample distribution is consistent with a normal curve.
There are two ways of testing normality: numerically (statistics) and graphically. Statistical testing can help you make an objective judgement, whereas graphical testing can help you make a subjective judgement. For statistical testing there are two normality tests: the Shapiro-Wilk test, most appropriate for small and medium-sized data sets, and the Kolmogorov-Smirnov test for large data sets. To do the analysis:
Click on Analyze - Descriptive Statistics - Explore and then Plots. In the Explore: Plots dialog, tick Histogram for graphical testing and "Normality plots with tests" for numerical testing. Moreover, in Explore click Statistics and check Descriptives. Therefore, if the Sig. value for Shapiro-Wilk is
Graphical assessment of normality: You can add a normal curve to the histogram produced with Frequencies. Just tick the box under Histogram. Computational tests: You can also do a statistical test, which compares the distribution of the scores obtained to a normally distributed set of scores with the same mean and standard deviation; the null hypothesis is that the "sample distribution is normal". The most used tests for assessing normality are the Kolmogorov-Smirnov (K-S) test and the Shapiro-Wilk test. They can be found in the SPSS Explore procedure (Analyze → Descriptive Statistics → Explore → Plots → Normality plots with tests).