I'll examine three groups of people's perceptions through 5-point Likert scales. My level of measurement is ordinal in nature, and I want to test whether there are significant differences across the three groups. What is the most suitable statistical test?
Just to be clear, Atikhom, a Likert scale is not just the 1-5 rating scale: it is the sum, or average, of several 1-5 rating scales. The idea is that a random overstatement on one agree-disagree question is likely to be compensated by a random understatement on another, related, agree-disagree question. If you average 4 to 6 items (questions/statements) that are all variations on the same underlying construct, you get reasonably consistent results.
While, technically, the 1-5 rating scale is an ordered scale, the average of several such items (a Likert scale) gives you a scale that "approaches" interval-scale properties. So, for all practical purposes, you can use regular parametric statistics: mean, standard deviation, etc. In your case, with three groups, you'd run ANOVA.
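As a rough illustration of the averaged-scale approach, here is a minimal sketch using SciPy. The data are simulated (the group sizes, shifts and noise levels are my own assumptions, not anything from the question): each respondent answers five related 1-5 items, the items are averaged into one scale score, and the three groups are compared with one-way ANOVA.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_group(n, shift):
    # Hypothetical respondents: five related items driven by one latent
    # attitude, each rounded and clipped to the 1-5 response range.
    latent = rng.normal(3 + shift, 0.8, size=(n, 1))
    items = np.clip(np.rint(latent + rng.normal(0, 0.7, size=(n, 5))), 1, 5)
    return items.mean(axis=1)  # Likert scale score = mean of the items

group_a = simulate_group(40, 0.0)
group_b = simulate_group(40, 0.3)
group_c = simulate_group(40, 0.6)

# One-way ANOVA on the averaged scale scores
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

With real data you would replace `simulate_group` with the per-respondent means of your own items.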
If you need to compare the 5-point scales one at a time, then non-parametric statistics are more appropriate.
To compare two groups use the Mann-Whitney U test.
To compare three or more groups use the Kruskal–Wallis H test.
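For a single 5-point item, both tests are available in SciPy. The sketch below uses made-up response probabilities (an assumption purely for illustration) to generate one item's answers for three groups.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
levels = np.arange(1, 6)  # responses coded 1..5

# Hypothetical single-item responses for three groups
g1 = rng.choice(levels, size=50, p=[0.30, 0.30, 0.20, 0.10, 0.10])
g2 = rng.choice(levels, size=50, p=[0.20, 0.20, 0.20, 0.20, 0.20])
g3 = rng.choice(levels, size=50, p=[0.10, 0.10, 0.20, 0.30, 0.30])

# Two groups: Mann-Whitney U test
u_stat, p_two = stats.mannwhitneyu(g1, g2, alternative="two-sided")

# Three or more groups: Kruskal-Wallis H test
h_stat, p_three = stats.kruskal(g1, g2, g3)

print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_two:.4f}")
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_three:.4f}")
```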
But then again, you almost certainly will find that your conclusions from the Kruskal-Wallis test will be the same as if you ran the more straightforward one-way ANOVA. See for example:
de Winter, J. and D. Dodou (2010), Five-Point Likert Items: t test versus Mann-Whitney-Wilcoxon, Practical Assessment, Research & Evaluation, Vol 15, No 11.
Just to add something small to Hume's excellent answer:
If you wish to treat several questions as measuring the same construct you need to validate them first. The best technique for this is Principal Component Analysis (you cannot simply take the average of several items and expect your construct to be reliable). If this analysis is successful, the scale we recommend using is the regression model, not the average.
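A minimal sketch of what such a check might look like, using plain NumPy on simulated items (the data and the four-item design are assumptions). It runs a principal component analysis on the item correlation matrix, reports how much variance the first component explains, and computes first-component scores as one possible alternative to the simple average. Component scores are only an approximation to regression-method factor scores, so treat this as an illustration rather than the recommended procedure itself.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 200 respondents, 4 items sharing one latent construct
latent = rng.normal(0, 1, size=(200, 1))
items = np.clip(np.rint(3 + latent + rng.normal(0, 0.8, size=(200, 4))), 1, 5)

# PCA via eigen-decomposition of the item correlation matrix
corr = np.corrcoef(items, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]            # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()          # proportion of variance per component
loadings = eigvecs[:, 0] * np.sqrt(eigvals[0])  # item loadings on component 1

# First-component scores computed from the standardized items
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
scores = z @ eigvecs[:, 0]

print("variance explained:", np.round(explained, 3))
print("loadings on PC1:", np.round(loadings, 3))
```

If the first component dominates and all items load on it with the same sign, that is at least consistent with the items measuring one construct.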
Comparing three groups using such a validated scale takes you into an ANOVA analysis or a non-parametric equivalent (such as the Kruskal-Wallis test).
Comparing three groups with individual 5 point questions will not work very well with the Kruskal-Wallis test because of the coarseness of the response variables. The fall-back test for this is chi-squared but you would still need to check it for validity. Often, lower frequency Likert values can be combined together into a single category (such as "Strongly Disagree to Agree") in order to satisfy the validity requirements.
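To make the validity check concrete, here is a small sketch with `scipy.stats.chi2_contingency`. The counts are invented for illustration. It inspects the expected frequencies and then merges the two lowest response categories when some expected counts fall below the usual rule of thumb of 5.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = 3 groups, columns = responses 1..5
table = np.array([
    [2, 5, 12, 18, 13],
    [3, 8, 15, 14, 10],
    [1, 4, 10, 20, 15],
])

chi2, p, dof, expected = chi2_contingency(table)
print("smallest expected count:", expected.min())

# Merge responses 1 and 2 into a single category, keep 3, 4, 5 as they are
collapsed = np.column_stack([table[:, :2].sum(axis=1), table[:, 2:]])
chi2_c, p_c, dof_c, expected_c = chi2_contingency(collapsed)
print("smallest expected count after merging:", expected_c.min())
print(f"chi2 = {chi2_c:.2f}, dof = {dof_c}, p = {p_c:.4f}")
```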
Please see my study guides for more information on all the above.
In the case of @PeterSamuels's point #3, that is, tests on responses from individual Likert items: probably the best approach is to use a technique that respects the ordinal nature of the dependent variable.
Ordinal regression is a great option. This is as flexible as common (OLS) multiple regression, but uses a dependent variable which is ordinal in nature (as Likert response data is).
Also there is a Cochran-Armitage test that is used for one ordinal variable and one nominal variable. In the original implementation it can be used only when there are only two levels in the nominal variable, but some software implementations allow you to use more than two groups.
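For the two-group case, the Cochran-Armitage trend statistic is simple enough to compute by hand. The function below is my own sketch (not a library routine) of the common textbook form with equally spaced scores 1..k and a normal approximation. The counts are invented.

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage(table, scores=None):
    """Trend test for a 2 x k table (2 groups, k ordered categories)."""
    table = np.asarray(table, dtype=float)
    k = table.shape[1]
    s = np.arange(1, k + 1, dtype=float) if scores is None else np.asarray(scores, dtype=float)
    r1, r2 = table.sum(axis=1)       # group totals
    col = table.sum(axis=0)          # category totals
    n = table.sum()
    # Score-weighted difference between the two groups
    t = np.sum(s * (table[0] * r2 - table[1] * r1))
    # Variance under the null hypothesis of no trend
    var = (r1 * r2 / n) * (n * np.sum(s**2 * col) - np.sum(s * col) ** 2)
    z = t / np.sqrt(var)
    p = 2 * norm.sf(abs(z))          # two-sided p-value
    return z, p

# Hypothetical counts of responses 1..5 in two groups
counts = [[12, 15, 10, 8, 5],
          [5, 8, 10, 15, 12]]
z, p = cochran_armitage(counts)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A negative z here means the first group tends toward the lower response categories.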
In general, I don't recommend using a chi-square test of association for Likert data, because it loses the ordinal nature of the responses.* Collapsing the data into categories like SA/A vs. N/D/SD also loses data, and so is not the best approach.
__________
* That is, a chi-square test knows that 1 is different than 2 is different than 3, but ignores that 1 < 2 < 3.
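A quick way to see this, sketched with invented counts: shuffling the order of the response columns leaves the chi-square statistic unchanged, while a rank-based test such as Kruskal-Wallis gives a different answer because it uses the ordering.

```python
import numpy as np
from scipy.stats import chi2_contingency, kruskal

# Hypothetical counts of responses 1..5 in two groups
table = np.array([[20, 15, 10, 5, 2],
                  [5, 10, 12, 15, 10]])
shuffled = table[:, [3, 0, 4, 2, 1]]   # same counts, category order scrambled

chi2_a = chi2_contingency(table)[0]
chi2_b = chi2_contingency(shuffled)[0]

labels = np.arange(1, 6)

def expand(row):
    # Turn counts per category back into individual responses 1..5
    return np.repeat(labels, row)

h_a = kruskal(expand(table[0]), expand(table[1]))[0]
h_b = kruskal(expand(shuffled[0]), expand(shuffled[1]))[0]

print(f"chi-square: {chi2_a:.3f} vs {chi2_b:.3f}")
print(f"Kruskal-Wallis H: {h_a:.3f} vs {h_b:.3f}")
```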
I was wondering if you have any good sources for the idea that traditional nonparametric tests (e.g. Mann-Whitney, or K-W in this case) aren't suitable for use with ordinal data with many ties, like the common case of Likert item responses ("Coarse" data as you put it).
The argument I've seen for these tests being not suitable is that the original Mann-Whitney test was based on a continuous dependent variable.
Another argument is that ties are likely to be very common in this case, and the test won't handle this well.
On the other hand, it's pretty common for relatively reputable or knowledgeable people/sources to recommend using e.g. M-W for e.g. Likert item responses.
These sources usually simply assert that the dependent variable must be at least ordinal in nature. A typical case might be the description of the test in Wikipedia (https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test#Assumptions_and_formal_statement_of_hypotheses).
I don't know that I've seen the concern about many ties addressed. All I can say is that, playing around with toy data, these tests don't seem too upset about many tied values, and the results aren't too far from more established tests (Cochran-Armitage or ordinal regression).
For variables measured on a Likert scale, the ideal choice would be a non-parametric test such as the Mann-Whitney U test (a comparison of medians) when you have two groups. To compare medians across more than two groups, the Kruskal-Wallis test can be used, followed by pairwise Mann-Whitney tests to see which groups differ (analogous to post hoc tests after ANOVA).
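A sketch of that two-step procedure on simulated single-item data. The response probabilities are assumptions, and the Bonferroni correction is my own choice of adjustment for the pairwise step (the post above does not name one).

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
levels = np.arange(1, 6)

# Hypothetical single-item responses for three groups
groups = {
    "A": rng.choice(levels, size=60, p=[0.30, 0.30, 0.20, 0.10, 0.10]),
    "B": rng.choice(levels, size=60, p=[0.20, 0.20, 0.20, 0.20, 0.20]),
    "C": rng.choice(levels, size=60, p=[0.10, 0.10, 0.20, 0.30, 0.30]),
}

# Step 1: overall Kruskal-Wallis test
h_stat, p_overall = stats.kruskal(*groups.values())
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_overall:.4f}")

# Step 2: pairwise Mann-Whitney tests with Bonferroni-adjusted p-values
pairs = list(combinations(groups, 2))
adjusted = {}
for a, b in pairs:
    _, p_raw = stats.mannwhitneyu(groups[a], groups[b], alternative="two-sided")
    adjusted[(a, b)] = min(1.0, p_raw * len(pairs))
    print(f"{a} vs {b}: adjusted p = {adjusted[(a, b)]:.4f}")
```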
Dear all, I found this discussion very interesting.
I also had a query related to such data as
I have assessed "Cardiovascular Diseases (CVD) Risk score" in a sample of participants by using two different tools (FRS and WHO). I divided their CVD risk into four ordinal categories, i.e. 30%. I would like to compare the proportion of participants in the four CVD risk categories (as described above) between the two tools (FRS and WHO). The data table is attached herein.
Which test would be most appropriate for this case?
Sir, for comparing proportions of participants in four CVD risk categories, Chi square test for proportions should be employed. This would give four p-values column wise.
Dear Ravi, for clarification: the two tools have been applied to the same population, and the tools measured the risk in terms of scores. Later the scores were classified into categories. So, I think, the real questions are: a) should we classify the population into categories at all; b) considering that there are more than 2 categories, should we go for Cochran's Q test; c) the literature also suggests the Stuart–Maxwell test or Liddell's exact test; d) to be more conservative, should we go for the Wilcoxon signed-rank test for paired observations?
The most suitable statistical tests for ordinal data (e.g., Likert scale) are non-parametric tests, such as the Mann-Whitney U test (two independent groups), the Wilcoxon signed-rank test (two paired measurements), and the Kruskal-Wallis test (three or more independent groups). None of these assumes a normal distribution.
Hi, guys. Are you aware of the difference between scale data and ordinal data? ANOVA or the Kruskal-Wallis test can deal properly with scale data and nominal data with multiple groups, but for ordinal data those tests may not be suitable.
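For option c), the Stuart-Maxwell test of marginal homogeneity can be sketched in a few lines of NumPy. This is my own implementation of the standard formula, and the 4x4 table is invented (it is not the attached data). Rows are the category from one tool, columns the category from the other, and each cell counts participants.

```python
import numpy as np
from scipy.stats import chi2

def stuart_maxwell(table):
    """Stuart-Maxwell test for a square k x k table of paired classifications."""
    t = np.asarray(table, dtype=float)
    k = t.shape[0]
    row, col = t.sum(axis=1), t.sum(axis=0)
    d = (row - col)[: k - 1]                 # marginal differences (drop last category)
    s = -(t + t.T)                           # off-diagonal covariance terms
    np.fill_diagonal(s, row + col - 2 * np.diag(t))
    s = s[: k - 1, : k - 1]
    stat = float(d @ np.linalg.solve(s, d))  # chi-square with k-1 degrees of freedom
    return stat, chi2.sf(stat, k - 1)

# Hypothetical paired classification: rows = tool 1, columns = tool 2
paired = [[40, 10, 2, 0],
          [6, 25, 8, 1],
          [1, 5, 15, 6],
          [0, 1, 3, 10]]
stat, p = stuart_maxwell(paired)
print(f"Stuart-Maxwell chi2 = {stat:.3f}, p = {p:.4f}")
```

For a 2x2 table this reduces to McNemar's statistic without continuity correction, which is a handy sanity check.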
To add to the discussion and comment on some excellent answers by Hume and Peter, the Mann-Whitney U test and Kruskal-Wallis H test are very versatile non-parametric methods but expect quantitative data that follow a distribution. The 'coarser' your ordinal categories are, the more you depart from this quantitative assumption. For instance, 5-point or 3-point questions on the disagree-agree scale are already coarse representations (and, moreover, one might argue how quantitative/ordinal a granular disagree-agree question is at all). Given this, maybe a simple chi-squared (χ²) test is suitable enough, provided that you cover the prerequisites, which mostly have to do with a minimum expected frequency (I think 5) in every category for every group. There are also alternative statistics to Pearson's chi-squared that make this test more versatile (check for instance this scipy function: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.power_divergence.html#scipy.stats.power_divergence)
To comment on the response by Dimitriοs Bouziotas: if we are thinking about analyzing, say, responses to a single 5-point Likert item, ordinal regression would probably be the ideal approach. However, tests like Mann-Whitney or Kruskal-Wallis will work quite well (assuming the procedure is implemented in a way that handles ties well). I have some comparisons between ordinal regression and Mann-Whitney for simulated data in the article "How Should We Analyze Likert Item Data?"
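As a small sketch of those alternatives, `scipy.stats.chi2_contingency` accepts a `lambda_` argument that it passes on to `power_divergence`, so Pearson's statistic and, for example, the log-likelihood ratio (G) statistic can be computed on the same table. The counts are invented.

```python
import numpy as np
from scipy.stats import chi2_contingency, power_divergence

# Hypothetical counts: rows = 3 groups, columns = responses 1..5
table = np.array([
    [8, 12, 15, 10, 5],
    [5, 9, 14, 13, 9],
    [6, 7, 12, 14, 11],
])

# Pearson's chi-squared (the default)
chi2_pearson, p_pearson, dof, expected = chi2_contingency(table)

# Log-likelihood ratio (G-test) on the same table
g_stat, p_g, _, _ = chi2_contingency(table, lambda_="log-likelihood")

# The same G statistic obtained directly from power_divergence
g_direct, _ = power_divergence(
    table, expected, ddof=table.size - 1 - dof, axis=None,
    lambda_="log-likelihood",
)

print("smallest expected count:", expected.min())
print(f"Pearson chi2 = {chi2_pearson:.3f}, p = {p_pearson:.4f}")
print(f"G statistic  = {g_stat:.3f}, p = {p_g:.4f}")
```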
Using a chi-square test of association ignores the ordered nature of the ordinal variable. This may be desirable in some circumstances, but usually isn't.