As you know, Cronbach's alpha should be above .70 to be considered acceptable. I was wondering: if a construct's alpha is .68, which is very close to .70, can we still claim this value is acceptable? If so, can anyone recommend a reference?
It depends. Cronbach's alpha (CA) is very sensitive to the number of items (I think this is shown in Cortina, 1993), meaning that a lower CA might be acceptable when you have only a few items (e.g., two or three). An additional problem can be differing item difficulties: if your scale contains some very "easy" and some very "difficult" items, CA underestimates the "true" reliability (I only have a German source for this, but you can find it in books on test theory). So check whether the item means are close together or vary extremely across items.
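To see how strongly the item count alone moves alpha, here is a minimal Python sketch using the standardized-alpha (Spearman-Brown) relation; the mean inter-item correlation of .30 is an arbitrary assumption, not a value from the question:

```python
# Standardized alpha as a function of the number of items k,
# holding the mean inter-item correlation r_bar fixed:
#   alpha = k * r_bar / (1 + (k - 1) * r_bar)

def standardized_alpha(k, r_bar):
    return k * r_bar / (1 + (k - 1) * r_bar)

for k in (2, 3, 5, 10, 20):
    print(f"k = {k:2d}  alpha = {standardized_alpha(k, 0.3):.2f}")
# k =  2  alpha = 0.46
# k =  3  alpha = 0.56
# k =  5  alpha = 0.68
# k = 10  alpha = 0.81
# k = 20  alpha = 0.90
```

Note that with the same average inter-item correlation, a 5-item scale lands at roughly .68 while a 20-item scale reaches .90, which is exactly why a lower alpha can be defensible for short scales.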
If you have many items in the scale (e.g., 20), it may be useful to exclude items with a very low item-total correlation.
Nunnally (1978, p. 245) considers a reliability coefficient of .70 sufficient. However, you have to attend to the number of items (with a small number, e.g., 3 or 4, a lower value can be acceptable) and to whether the items differ greatly in content (check the inter-item correlations: are they low or high?). Cronbach's alpha is very sensitive to the number of items. Also check whether deleting any item would lead to a meaningful drop in the internal consistency of the dimension (if not, consider eliminating the items that reduce internal consistency), and look at the mean inter-item correlation. Briggs and Cheek (1986) recommend that "The optimal level of homogeneity occurs when the mean inter-item correlation is in the .2 to .4 range" (p. 114), and Clark and Watson (1995) that "(…) the average interitem correlation fall in the range of .15-.50" (p. 316).
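For concreteness, a minimal Python sketch of these checks; `X` is an assumed (respondents × items) matrix of raw item scores, not data from the original question:

```python
import numpy as np

def cronbach_alpha(X):
    """Coefficient alpha from an (n_respondents, k_items) score matrix."""
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def mean_interitem_r(X):
    """Mean of the off-diagonal inter-item correlations (compare to .2-.4)."""
    R = np.corrcoef(X, rowvar=False)
    return R[np.triu_indices_from(R, k=1)].mean()

def alpha_if_deleted(X):
    """Alpha recomputed with each item removed in turn."""
    return [cronbach_alpha(np.delete(X, j, axis=1)) for j in range(X.shape[1])]
```

Items whose removal *raises* alpha (per `alpha_if_deleted`) are the candidates for elimination described above.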
Briggs, S. R., & Cheek, J. M. (1986). The role of factor analysis in the evaluation of personality scales. Journal of Personality, 54, 106-148.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309-319.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Good luck with your research!
There are many factors affecting the reliability of a test.
1. Reliability is a property of test score data, not of tests. I used to give my students a problem: I have two tests; one (A) has a reliability of 0.92, the other (B) 0.09. Why are they different? Some would say B is too short, or too hard, or measures more than one thing, or the sample size was too small. But the two tests measured the same thing, were the same length, had the same average difficulty, and were both given to about 300 people. The answer is that they were the same test: I split the score distribution into quartiles, formed one sample from the top and bottom quartiles and the other from the middle quartiles. It was the variation (or lack thereof) in the samples that produced the different alphas.
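A quick simulation illustrates the point; all the numbers here (sample size, item count, factor loading) are arbitrary assumptions, not the actual tests described above:

```python
import numpy as np
rng = np.random.default_rng(0)

# Hypothetical test: 10 items loading 0.7 on one factor, plus noise.
n, k = 1200, 10
theta = rng.normal(size=n)                        # true ability
X = 0.7 * theta[:, None] + rng.normal(scale=0.7, size=(n, k))

def cronbach_alpha(X):
    k = X.shape[1]
    return k / (k - 1) * (1 - X.var(0, ddof=1).sum() / X.sum(1).var(ddof=1))

total = X.sum(axis=1)
q1, q3 = np.quantile(total, [0.25, 0.75])
extremes = X[(total < q1) | (total > q3)]         # top + bottom quartiles
middle   = X[(total >= q1) & (total <= q3)]       # middle two quartiles

print(f"alpha, extreme quartiles: {cronbach_alpha(extremes):.2f}")  # high variance -> high alpha
print(f"alpha, middle quartiles:  {cronbach_alpha(middle):.2f}")    # restricted range -> low alpha
```

Same items, same people-generating process; only the score variance in the sample differs, and the two alphas diverge sharply.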
2. SPSS's default coefficient of reliability is an index called coefficient alpha (Cronbach, 1951). It is not the best index, so why is it always used? Revelle & Zinbarg (2009) suggest that
"[i]t has been known for a long time that α is a lower bound to the reliability, in many cases even a gross underestimate, and a poor estimate of internal consistency and in some cases a gross overestimate, but it continues to be used. Why is this? Perhaps inertia on the part of editors and reviewers who insist on at least some estimate of reliability and do not know what to recommend. Perhaps inertia on the part of commercial program to implement features that are not widely requested. And perhaps it is the fault of psychometricians who develop better and more powerful algorithms, but do not make them readily available." [p.153]
All indices of reliability are estimates, and they are what is known as lower-bound estimates (meaning they are pessimistic). The most accurate estimate is therefore the highest one. In 1945, Louis Guttman wrote an article outlining six different ways of estimating reliability (Guttman, 1945); his solution was to take the highest. These did not become routine at the time because some of his estimates required a computer. Cronbach's alpha, however, could be calculated by hand and became the standard. There are a number of alternatives to coefficient alpha [see, for example, Revelle & Zinbarg, 2009], including Guttman's estimates, which are always at least as large as coefficient alpha and should be used in preference; for example, Sijtsma (2009) shows that Guttman's λ2 is always equal to or larger than alpha. Revelle & Zinbarg (2009) show how to calculate some better coefficients from factor analysis output, but for simplicity's sake Guttman's coefficients are available in SPSS RELIABILITY as MODEL=GUTTMAN. Use the largest one as your estimate of reliability. If you have access to a recent version of SPSS, you can also add an R package (e.g., Revelle's psych package) that gives some of the newer forms of reliability estimation.
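For those not using SPSS, here is a minimal sketch of computing alpha and Guttman's λ2 directly from raw item scores; the input matrix `X` is an assumed placeholder:

```python
import numpy as np

def alpha_and_lambda2(X):
    """Coefficient alpha and Guttman's lambda-2 from an (n, k) item-score matrix.
    Lambda-2 is always >= alpha (Guttman, 1945; Sijtsma, 2009)."""
    C = np.cov(X, rowvar=False)           # item covariance matrix
    k = C.shape[0]
    total_var = C.sum()                   # variance of the total score
    off_sq = (C - np.diag(np.diag(C))) ** 2
    lambda1 = 1 - np.trace(C) / total_var
    alpha = k / (k - 1) * lambda1
    lambda2 = lambda1 + np.sqrt(k / (k - 1) * off_sq.sum()) / total_var
    return alpha, lambda2
```

Following the advice above, you would report the larger of the two (here, λ2) as the reliability estimate.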
I know I haven't actually answered your question - but I hope I've shown (a) why you might have the problem, and (b) what you might do about it.
Good luck.
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10, 255-282.
Revelle, W., & Zinbarg, R. E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74, 145-154.
Sijtsma, K. (2009) On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74, 107-120.
Notwithstanding Richard's comments above, it is a myth that Nunnally (1978) said reliability coefficients of .70 and above were acceptable. This is the oft-cited passage from Nunnally (1978, pp. 245-246) in full; he states that
"what a satisfactory level of reliability is depends on how a measure is being used. In the early stages of research... one saves time and energy by working with instruments that have only modest reliability, for which purpose reliabilities of .70 or higher will suffice... In contrast to the standards in basic research, in many applied settings a reliability of .80 is not nearly high enough. In basic research, the concern is with the size of correlations and with the differences in means for different experimental treatments, for which purposes a reliability of .80 for the different measures is adequate. In many applied problems, a great deal hinges on the exact score made by a person on a test... In such instances it is frightening to think that any measurement error is permitted. Even with a reliability of .90, the standard error of measurement is almost one-third as large as the standard deviation of the test scores. In those applied settings where important decisions are made with respect to specific test scores, a reliability of .90 is the minimum that should be tolerated, and a reliability of .95 should be considered the desirable standard."
Tl;dr: you could get away with a modest reliability of .70 or thereabouts if you are trying to "save time and energy" in a new area of research (e.g., if you have designed a novel scale to measure a particular construct); otherwise, you really should apply the more conservative cut-off of .80. Incidentally, Carmines and Zeller (1979) made a similar recommendation in their seminal text on reliability and validity.
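For reference, the "almost one-third" figure in the quoted passage follows directly from the standard error of measurement; a quick check, including the .68 from the original question:

```latex
% SEM implied by reliability r_xx of scores with standard deviation sigma_X:
\mathrm{SEM} = \sigma_X \sqrt{1 - r_{xx}}
% r_xx = .90:  SEM = sigma_X * sqrt(.10) ~ .316 sigma_X  ("almost one-third")
% r_xx = .68:  SEM = sigma_X * sqrt(.32) ~ .566 sigma_X  (over half the SD)
```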
References
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. Newbury Park, CA: Sage.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Dear Koroush, I strongly agree with Viren Swami. Far too many social and behavioral scientists accept an r = 0.68; this admits too much error and corresponds to less than 50% explained variance (0.68² ≈ 0.46). What is the primary purpose of the scale?
I have taught at the graduate level at schools of public health and apply a standard of r ≥ 0.80 for scales and r > 0.30 for items in all of my applied health and behavior evaluations. I would need more information about your study's sample size, representativeness, number of items, and inter-item correlation coefficients. How stable is the internal consistency (test-retest with 90%+ of the same sample)? Without these data, I cannot give you good advice about next steps for the scale or items in your evaluation research. I recommend setting high standards and being patient. Take care.
Richard
Richard Windsor, MS PhD MPH, Professor Emeritus
George Washington U. School of Public Health and Health Services