I was checking the reliability of a 22-item workplace spirituality scale. The initial value of Cronbach's alpha is .81 (with all 22 items). If one item is deleted, the value increases to .84. Should I delete this item?
I agree with Sarangapany. All the values and the overall reliability seem to be high enough. If there is no problem with face validity or construct validity, it is logical to retain the item.
The alpha threshold you need is .70 or above (Nunnally, 1994). Your initial alpha of .81 is already sufficient, and the increase to .84 makes no practical difference. So, keep all 22 items.
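For anyone who wants to reproduce the "alpha if item deleted" figures outside their statistics package, here is a minimal sketch in plain Python. It uses the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The function names and the tiny data set are made up for illustration only; they are not the poster's data.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(rows):
    """Cronbach's alpha for rows = list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                       # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of the summed scale score
    return k / (k - 1) * (1 - item_var_sum / total_var)


def alpha_if_deleted(rows):
    """Alpha recomputed k times, each time leaving one item out."""
    k = len(rows[0])
    return [cronbach_alpha([r[:j] + r[j + 1:] for r in rows]) for j in range(k)]


# Hypothetical responses: items 1 and 2 move together, item 3 does not.
rows = [[1, 1, 3], [2, 2, 1], [3, 3, 2], [4, 4, 2]]
print(f"alpha, all items: {cronbach_alpha(rows):.2f}")
for j, a in enumerate(alpha_if_deleted(rows), start=1):
    print(f"alpha without item {j}: {a:.2f}")
```

In this toy example, dropping the third item raises alpha, which is the same pattern as in the question. Whether the item should actually go is still a question of content and validity, not of the number alone.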
With 22 items, I would be skeptical of any CA that low... a feature of CA is that it tends to grow with the number of items in the measure, so although statistically sufficient, this is in my view not a sign of a strong measure... it is probably indicative of multiple dimensions in one instrument... have you tried running a factor analysis? CA is absolutely not suited to assessing unidimensionality, quite apart from its many other drawbacks; see Streiner D.L., Norman G.R. (1989), Health Measurement Scales: A Practical Guide to Their Development and Use, New York: Oxford University Press (pages 64-65). And when you have done so, you can use other reliability statistics (see Bacon et al 1995 for examples, but there are many)...
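To put a number on how alpha grows with scale length, here is a small sketch using the standardized-alpha formula, alpha = k * r / (1 + (k - 1) * r), where r is the mean inter-item correlation. This is an approximation: it assumes the items have equal variances, so it will not match a raw alpha exactly. The function names are my own.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def implied_mean_r(k, alpha):
    """Invert the formula: mean inter-item correlation implied by alpha and k."""
    return alpha / (k - alpha * (k - 1))


# Mean inter-item correlation implied by alpha = .81 on 22 items
r = implied_mean_r(22, 0.81)
print(f"implied mean inter-item correlation: {r:.2f}")

# Same mean correlation, different scale lengths
for k in (5, 10, 22, 40):
    print(f"k = {k:2d}: alpha = {standardized_alpha(k, r):.2f}")
```

Under that assumption, an alpha of .81 on 22 items corresponds to a mean inter-item correlation of only about .16, and the same items in a 5-item scale would give an alpha well below .70.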
This scale is already validated. It has four dimensions. I also checked the CA of each individual dimension, and all dimensions have a CA above .70. I was talking about the CA of all items together.
Then why would you bother with an overall scale reliability? You would have to use the separate dimensions in your analysis... but the same advice applies here: do a factor analysis... validity can only be assessed post hoc, meaning that there is no such thing as a measure validated beforehand; you always have to validate your measures, again and again... what looks like reliability may be a feature of the data collection (items not randomized, for example)... this goes for your data, but also for the data on which the scale was developed...
If you are following classical test theory, you are assuming that all 22 items are measuring the same construct. I have seen many scales that have an alpha > .70 due to having a large number of items (as mentioned above) but have less than optimal validity (meaning they do not quite measure the construct intended). In addition to examining alpha, I would suggest performing a factor analysis (exploratory or confirmatory, depending on the stage of development of the scale) and examining the factor loadings. Items with low factor loadings should be critically evaluated against the scale construct. Often you will see that they don't really fit conceptually. Removing such items may improve not only your reliability but your validity as well. Both are important.
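As a rough first look before a proper factor analysis, the loadings on the first principal component of the item correlation matrix can be computed by hand. To be clear, this is a stand-in, not an exploratory or confirmatory factor analysis: it extracts one component only, by power iteration, and the function names and data below are hypothetical. It assumes no item is constant and that the starting vector is not orthogonal to the dominant component.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long score sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def first_component_loadings(rows, iterations=200):
    """Loadings of each item on the first principal component, plus its eigenvalue."""
    items = list(zip(*rows))
    k = len(items)
    R = [[pearson(items[i], items[j]) for j in range(k)] for i in range(k)]
    v = [1.0] * k                                   # starting vector for power iteration
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(k)) for i in range(k))
    return [x * sqrt(eigenvalue) for x in v], eigenvalue


# Hypothetical responses: items 1 and 2 move together, item 3 does not.
rows = [[1, 1, 3], [2, 2, 1], [3, 3, 2], [4, 4, 2]]
loadings, eigenvalue = first_component_loadings(rows)
for j, loading in enumerate(loadings, start=1):
    print(f"item {j}: loading = {loading:+.2f}")
```

An item whose loading is much smaller in absolute value than the others, or has the opposite sign, is the kind of item worth checking against the construct definition. For a real 22-item, four-dimension scale, a dedicated factor analysis routine in your statistics package is the right tool.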
We have just published an article about scale purification, i.e. the process of eliminating items from multi-item scales. We have used the example of SCM, but our framework can be applied to any other discipline. Download: https://doi.org/10.1108/SCM-07-2016-0230 (or request via my ResearchGate page).