Likert scale gives a fixed numerical value to a varied quality of perceptions that the participants usually tend to group under the same number/value. Is there any other scale that overcomes this shortcoming?
There are a number of fairly advanced alternatives. However, the major weakness of Likert-type data lies in treating responses as precise numerical values when they do not, and cannot, reflect such precision. One can significantly improve the validity and soundness of analyses of Likert-type data simply by not treating linguistic responses as if they corresponded to exact numbers separated by exact numerical distances. Fuzzy probability and fuzzy numbers, for example, are two of the ways in which one can use Likert-scale data without pretending that linguistic responses map onto precise numerical values. Fuzzy set theory is not the only method for diminishing the artificially contrived values and distances of typical Likert-scale data, but it is a commonly used one, and many other methods rely on similar logic to overcome the false correspondence of linguistic responses to infinitely precise numbers. Importantly, these techniques do not require scrapping a research project and redesigning it around other statistical methods: any experimental paradigm using Likert-type data can still be analysed with fuzzy set theory, fuzzy probability, and similar approaches.
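To make the fuzzy-numbers idea concrete, here is a minimal sketch of treating 1-5 Likert responses as triangular fuzzy numbers rather than exact integers. The spread of 1.0 around each response is an illustrative assumption, not a value from any published scale:

```python
# Sketch: Likert responses as triangular fuzzy numbers (left, peak, right).
# The spread is an assumed, illustrative choice.

def triangular(center, spread=1.0):
    """Return a triangular fuzzy number as a (left, peak, right) tuple."""
    return (center - spread, center, center + spread)

def defuzzify(tfn):
    """Centroid defuzzification of a triangular fuzzy number."""
    left, peak, right = tfn
    return (left + peak + right) / 3.0

# Map raw 1-5 Likert responses to fuzzy numbers.
responses = [2, 3, 3, 4, 5]
fuzzy = [triangular(r) for r in responses]

# Fuzzy mean: average each component across respondents, then defuzzify
# to obtain a single crisp summary value.
n = len(fuzzy)
mean_tfn = tuple(sum(t[i] for t in fuzzy) / n for i in range(3))
print(defuzzify(mean_tfn))
```

The point of the exercise is that each response carries an explicit band of imprecision through the analysis, instead of being anchored to a single integer from the outset.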
Thanks Andrew. I will definitely study fuzzy set theory. However, I am at a stage where I can adopt an entirely new scale for my questionnaire. It is always better to take the best possible approach rather than patch over the torn ends. I might as well, therefore, apply a completely new scale!
Andrew has made good points. One aspect that may help get (a bit) closer to reality is to use a 1-7 scale rather than a narrower one like 1-3 or 1-5. The way the questions are articulated can also affect things. It may be worthwhile to look into the "questionnaire design" literature; there is plenty out there on the web. In any case, no matter what efforts we make (and we must) towards a sound study design to get inferences closer to reality, things will always be uncertain to varying degrees; nothing can (or should be expected to) be perfect when it comes to responses from humans.
Subhash makes some good points (as do others). Devising your own scale depends on the nature of the research: what is it you wish to measure? My concern would be the validity of the measurement scale. Despite the obvious faults and concerns with 5-point Likert scales and grouping around values, many of the validated scales in use have been subjected to countless thousands of validity and reliability checks, in situations assuming both equal and unequal variances. Many have also been designed for psychometric purposes by chartered psychologists.
The process of designing and validating a new measurement scale, with the time and repeated experimentation it requires, is quite a large task.
On a related point, you may wish to read some of the literature on consumerism and market research, where there are numerous examples of respondents marking positions on scalar lines, yielding continuous data responses (extreme values of 1-5, 1-7 and 1-9 are used here): the respondent places their response on a line, and the analyst then measures the position to obtain a continuous response value.
There are many good points raised. In these times of computerisation and scanning technologies, I cannot believe that scales with continuous properties are not used more often, e.g. a line scale as Robert suggests.
It often cuts down on response time, BUT respondents may not consider their responses adequately as they speed through the items.
There has been a lot of work in educational research with bounded numerical scales, e.g. a hybrid 1-to-10 scale bounded by agreement/disagreement at the end points. These have tested comparatively well.
It depends whether you are rating, ranking, classifying (diagnosing), etc. Many people take/borrow/adapt scales that may have another underlying purpose.
AND then there are the beliefs of the Rasch/Partial Credit Model/IRT people to throw in there...
The question that occurs to me is: what degree of specificity are you willing to sacrifice in data collection? Obviously, a Likert scale's bluntness is going to sacrifice specificity. At the other extreme would be the option of conducting some qualitative or mixed-methods analysis. If participants are able to respond to questions open-endedly, you gain a high degree of specificity at the collection stage, and you can then code these responses to conduct your analyses. The trade-off, of course, is that this is a much more time-intensive process.
I think it is important to be, well, mathematically rigorous when using terms like continuity in the context of statistics and data analysis. In particular, "continuous" as used in the context of, e.g., discrete vs. continuous distributions is not the same "continuous" we find in "continuous rating scales". In my experience, many researchers using statistics have either not taken calculus courses or do not recall much from them, and in particular are wont to underestimate just how far from continuous almost all data are in any science that might use Likert-type scales. The rationals, for example, are infinitely dense (between any two rational numbers there are infinitely many more rational numbers), yet they are not continuous, because there are also infinitely many irrational numbers between any two rational numbers (without going into Dedekind cuts and a rigorous definition or derivation of the reals). Thus a scale from 1 to 100 with ticks every ten or twenty integers can certainly seem continuous (after all, the real number line is depicted this way), but respondents are unlikely to indicate irrational or transcendental values. Perhaps the mathematician in me is being needlessly and overly precise (particularly because we talk about "approximating" uncountably infinite distributions with finite data points, despite the fact that this is impossible and its impossibility irrelevant).
In addition to fuzzy sets, numbers, and probability, there are other quantitative methods designed from the ground up for data such as those gathered by Likert-type scales. Item Response Theory (IRT) was big for a while and then sort of fizzled out, but it has recently seen something of a comeback, thanks perhaps to the incorporation of modern statistical methods, as found in e.g. Multidimensional Item Response Theory (Statistics for the Social and Behavioral Sciences) or Bayesian Item Response Modeling: Theory and Applications (Statistics for the Social and Behavioral Sciences).
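For readers unfamiliar with IRT, here is a minimal sketch of the two-parameter logistic (2PL) item response function: the probability that a respondent with latent trait theta endorses an item with discrimination a and difficulty b. The parameter values are made up purely for illustration:

```python
# Sketch: 2PL item response function from Item Response Theory.
import math

def p_2pl(theta, a, b):
    """P(endorse item | latent trait theta, discrimination a, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A respondent whose trait level equals the item's difficulty endorses it
# with probability 0.5, regardless of the discrimination parameter.
print(p_2pl(theta=0.0, a=1.5, b=0.0))  # -> 0.5
```

The key contrast with raw Likert scoring is that responses are modelled as probabilistic functions of a latent trait, rather than taken as direct interval-scale measurements.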
Then there are other scaling methods, such as optimal scaling or multidimensional scaling. I tend to like MDS and related methods, mainly because I favor classification and clustering algorithms; but while these are often good for categorical data in general, they lose much of their value when one uses a unidimensional ordinal scale.
Thanks Andrew, your answer gave me a lot of clarity on the matter, and some new avenues for a better methodology while preparing my questionnaire and study design.
Dear Paul, thanks for giving a human colour to this otherwise academic discussion. I would be happy if my question can serve as a means of communication between two long-lost friends. Thanks for your academic inputs too; I learnt a lot from your articles and explanations.
Could asking the respondent to mark their extent of agreement with a statement as the 'level of water in a glass tumbler' (or something similar), instead of using a Likert scale, and then measuring the percentage of the filled area, reflect the actual responses better than a Likert scale? Would it help eliminate the shortcomings? Does it reduce the fuzziness of the data (it seems to assign even more precise values to the responses, but is that actual precision?) and reduce the variability in perception that gets grouped under a particular number?
However, if we have 10 determinants of energy levels and we ask people 10 questions, something like "how much do you feel like jumping 10 times, about one foot high, right now?", will that be a precise assessment of the 'feeling of energy level' in an individual, and in a group?
Also, could such data be analysed more precisely, e.g. by being classified into more precise fuzzy sets, and would the calculation of a systematic error in such a system not be more precise, so that in the end we get a more acceptable interpretation of the data?
I think you should consider using continuous bounded response formats (e.g. sliders on a web form, explicitly labeled with numeric boundaries, say 0% to 100% agreement).
Of course, though such a response in terms of percentage of agreement is continuous by nature, the real, observed response is artificially discretized by the screen resolution. This is not too much of a problem in practice, and a number of results and interesting phenomena appear with such responses that are very difficult to obtain with Likert scales. I think, for example, of an abrupt bifurcation from one end of the response scale to the other, for only a minor change in true attitude, resulting in a strongly bimodal response distribution.
You should have a look at the psychometric models recently developed for such responses (Beta Response Models, among others).
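As a rough illustration of why a Beta-type model suits bounded slider responses, here is a sketch that summarises responses on (0, 1) with a method-of-moments Beta fit; this is only a descriptive device in the spirit of the Beta response models mentioned above, not a full psychometric model, and the data are invented:

```python
# Sketch: method-of-moments Beta(alpha, beta) fit for bounded slider data.

def beta_moments(xs):
    """Fit Beta(alpha, beta) to data in (0, 1) by matching mean and variance."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common

# Invented slider responses, rescaled from 0-100% to the unit interval.
# The cluster at both ends mimics the bimodal pattern described above.
sliders = [0.10, 0.15, 0.80, 0.85, 0.90]
alpha, beta = beta_moments(sliders)
print(alpha, beta)
```

With both fitted parameters below 1, the implied Beta density is U-shaped, which is exactly the kind of strongly bimodal response distribution a 5-point Likert format would struggle to reveal.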
The Likert response format organises subjective responses into a categorical, or at best ordinal, system of responses. Rasch modelling, developed by the Danish mathematician Georg Rasch, permits the transformation of such responses into an actual interval scale. This has revolutionised the measurement of self-report and rating-scale data.
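To show what the Rasch model looks like in practice, here is a minimal sketch of its item response function: the probability of endorsing an item depends only on the difference between person ability theta and item difficulty b, both expressed on a shared logit scale (which is what gives the interval-scale property mentioned above). The numbers are illustrative:

```python
# Sketch: the dichotomous Rasch model.
import math

def rasch_p(theta, b):
    """P(endorse) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals difficulty, the endorsement probability is 0.5;
# an ability one logit above the difficulty gives roughly 0.73.
print(rasch_p(0.0, 0.0))  # -> 0.5
print(rasch_p(1.0, 0.0))
```

Because only the difference theta - b enters the model, equal distances in logits mean equal changes in log-odds everywhere on the scale, which is the sense in which Rasch measurement yields an interval scale.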
Another possible alternative to the Likert scale is the semantic differential scale using bipolar adjectives (e.g. Effective 1, 2, 3, 4, 5, 6, 7 Not Effective). The advantages include:
1) it does not depend on the interpretation of words; e.g. "somewhat" in some Likert scales might mean a lot or a little to different respondents.
2) respondents can score a bipolar semantic differential scale in a relative manner; e.g. if they feel more strongly about a question than the previous one, they can mark/select a higher scale score accordingly.
3) some scholars still treat Likert-scale responses as ordinal data, for which non-parametric analyses/tests need to be used. However, some researchers convert their Likert scale to a bipolar semantic differential scale and then use parametric tests, because they believe the intervals between the semantic differential scale values can be treated as equal, making it an interval scale that justifies parametric tests.