I am looking to test item discrimination of my newly constructed psychological well-being scale and would appreciate any references for suggested ranges of poor, good and excellent discriminatory values.
With regard to "total score being contaminated by the less useful items," the most effective way to examine this is via the option "Alpha if item deleted" (or at least that is its name in SPSS). This helps you identify items that are reducing rather than increasing the value of alpha.
Along the same lines, it is important to note that Alpha will not tell you whether there is only one scale present. In particular, there might be two (or more) scales that are strongly correlated. Determining the number of scales in a set of items is a job for factor analysis.
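The "Alpha if item deleted" diagnostic is easy to reproduce outside SPSS. Here is a minimal sketch in Python/NumPy; the simulated five-item scale (four coherent items plus one stray item) is my own invention, purely for illustration:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return float(k / (k - 1) * (1 - item_vars.sum() / total_var))

def alpha_if_item_deleted(items: np.ndarray) -> list:
    """Alpha recomputed with each item left out in turn (as SPSS reports it)."""
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]

rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
coherent = trait + rng.normal(size=(200, 4))   # four items tapping the same trait
stray = rng.normal(size=(200, 1))              # one unrelated item
scores = np.hstack([coherent, stray])

print(round(cronbach_alpha(scores), 3))
print([round(a, 3) for a in alpha_if_item_deleted(scores)])
```

An item whose "alpha if deleted" value exceeds the full-scale alpha (here, the stray fifth item) is a candidate for removal.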
I doubt you will find a citation showing that .5 is anything more than a traditional cut-off point, and the same is true for the common advice that Cronbach's alpha itself should be at least .7 or .8. In other words, these figures reflect accepted practice rather than anything based on actual statistical principles.
FYI: note that an item-to-total correlation of .5 indicates that 25% of the variance in that item is shared with the other items in the scale.
I realise that your question seems to focus on item-total correlations, but I am not sure what you mean by "discriminatory values" - which could be something else. Be that as it may, the following text from an article of mine that is currently under review might or might not be helpful:
Clark and Watson (1995) have recommended that interitem correlations should lie between .15 and .50, and Briggs and Cheek (1986) have recommended that the mean of these correlations should ideally lie between .20 and .40. Briggs and Cheek asserted that scales with mean interitem correlations greater than .50 “tend to be overly redundant” (1986, p. 115).
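For concreteness, the mean inter-item correlation that Briggs and Cheek discuss is simply the average of the off-diagonal entries of the item correlation matrix. A small illustration in Python/NumPy; the simulated six-item data set is an assumption of mine, not anything from the cited articles:

```python
import numpy as np

def mean_interitem_correlation(items: np.ndarray) -> float:
    """Average of the off-diagonal entries of the item correlation matrix."""
    r = np.corrcoef(items, rowvar=False)
    return float(r[np.triu_indices_from(r, k=1)].mean())

rng = np.random.default_rng(42)
trait = rng.normal(size=(500, 1))
items = trait + rng.normal(scale=1.3, size=(500, 6))  # six moderately related items

print(round(mean_interitem_correlation(items), 2))
```

A value inside the .15 to .50 band (and ideally .20 to .40 on average) would be consistent with the recommendations quoted above.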
Here are the references for the two sources that I cited above:
Briggs, S. R., & Cheek, J. M. (1986). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54, 106–148.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.
I hope that's helpful - but I think I've not addressed your request for information about "discriminatory values".
Inter-item correlation is different from item-total correlation. A higher item-total correlation indicates how well an item discriminates between high and low overall scorers on the test, based on the scores on that item. Common sense says that if overall low scorers score high on an item, then that item has poor discriminatory power.
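The high-versus-low-scorer logic described above is what the classical extreme-groups discrimination index captures. A rough sketch in Python/NumPy; the conventional 27% extreme-group fraction and the simulated items (five related dichotomous items plus one random item) are illustrative assumptions on my part:

```python
import numpy as np

def discrimination_index(item: np.ndarray, total: np.ndarray,
                         frac: float = 0.27) -> float:
    """Extreme-groups discrimination: mean item score among the top scorers
    minus that among the bottom scorers, scaled by the item's score range."""
    cut = max(1, int(len(total) * frac))
    order = np.argsort(total)
    low, high = item[order[:cut]], item[order[-cut:]]
    return float((high.mean() - low.mean()) / (item.max() - item.min()))

rng = np.random.default_rng(1)
ability = rng.normal(size=(300, 1))
related = (ability + rng.normal(scale=0.8, size=(300, 5)) > 0).astype(float)
unrelated = rng.integers(0, 2, size=(300, 1)).astype(float)  # random item
scores = np.hstack([related, unrelated])
total = scores.sum(axis=1)

d = [discrimination_index(scores[:, j], total) for j in range(scores.shape[1])]
print([round(x, 2) for x in d])
```

The random final item shows a markedly lower index than the five items that actually track the underlying ability, which is exactly the pattern one would use to flag poorly discriminating items.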
Thanks for getting back, Aleem. I certainly agree: interitem correlations are different from item-total correlations, and I apologise if it seemed I was equating them. That was certainly not my intention.
Now I appreciate what you meant by "discriminatory values", and what you suggest is certainly logical. I do think, however, that item-total correlations might not be as helpful as hoped. For example, if a number of items on a scale are "off beam", those items will contaminate the total scale score to some extent and will therefore limit the valid conclusions that can be drawn from item-total correlations.
I wonder whether a multi-pronged approach might be best - and it's certainly what I usually do. That includes, inter alia, factor analyses as well as inspecting individual interitem correlations, the mean interitem correlation, and Cronbach's alphas - not necessarily in that order - and with different samples if possible in order to gain some assurance about generalizability.
Of course, you might be doing, or anticipate doing, those kinds of things.
Thanks Robert Trevethan for an insightful response. You are right about total score being contaminated by the less useful items. I will look into other possibilities as you suggested.
What counts as "good" varies among researchers. There are different recommendations for item-total correlations in the literature. As a general rule, item-total correlation values higher than .30 are considered adequate (Field, 2014; Nunnally & Bernstein, 1994), among many others. However, higher is always better.

Field, A. (2014). Discovering statistics using IBM SPSS. New York: Sage.
I agree with the great comments above, and for the sake of completeness, I want to mention the "correlation with marker items (items that you highly trust) method", which I find very practical.
To decide whether a Cronbach's alpha of .6 is justified, we need to understand its interpretation and assumptions. Please refer to this fragment of my recent article (Adaptation and validation of the Polish version of the Belie...). You will find several useful citations there:
"Some Cronbach’s alpha estimates of internal consistency were below the conventional acceptable threshold [34]. However, as the BMQ-PL includes meaningful content and represents reasonable unidimensionality of the subscales [19], low internal consistency may not be a major barrier to its validity [45]. In fact, other BMQ validation studies have also replicated such questionable Cronbach’s alpha values [19, 25, 46]. Cronbach’s alpha is interpreted as the extent of equivalence of different sets of subscale items that give the same measurement outcomes [47]. Consequently, a low value suggests that different items within a subscale are not closely related to each other and may cover different facets of the same construct. It must be also noted that Cronbach’s alpha may be negatively biased if certain assumptions, such as equal factor loadings of subscale items or non-correlated errors, are violated. As this may be the case in the present study, McDonald’s omega values, which were adequate for the BMQ-PL, appear more robust estimators of internal consistency [34, 48]."
Particularly, see the citations:
45. Schmitt N. Uses and abuses of coefficient alpha. Psychol Assess. 1996;8:350–353.
47. Taber KS. The use of Cronbach's alpha when developing and reporting research instruments in science education. Res Sci Educ. 2018;48:1273–1296.
48. Trizano-Hermosilla I, Alvarado JM. Best alternatives to Cronbach's alpha reliability in realistic conditions: congeneric and asymmetrical measurements. Front Psychol. 2016;7:769. PMID: 27303333.
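The alpha-versus-omega contrast in the quoted passage can be made concrete. Under a single-factor model with standardized loadings, omega total is (Σλ)² / ((Σλ)² + Σ(1 − λ²)), and alpha can be computed from the model-implied correlation matrix. A sketch in Python/NumPy; the specific loading values are hypothetical, chosen only to illustrate the point:

```python
import numpy as np

def mcdonald_omega(loadings) -> float:
    """Omega total for a one-factor model with standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    common = lam.sum() ** 2
    error = (1 - lam ** 2).sum()
    return float(common / (common + error))

def alpha_from_loadings(loadings) -> float:
    """Coefficient alpha from the correlation matrix implied by the model."""
    lam = np.asarray(loadings, dtype=float)
    k = lam.size
    sigma = np.outer(lam, lam) + np.diag(1 - lam ** 2)  # implied correlations
    return float(k / (k - 1) * (1 - np.trace(sigma) / sigma.sum()))

unequal = [0.7, 0.6, 0.5, 0.4]   # tau-equivalence violated
equal = [0.6, 0.6, 0.6, 0.6]     # tau-equivalence holds

print(round(mcdonald_omega(unequal), 3), round(alpha_from_loadings(unequal), 3))
print(round(mcdonald_omega(equal), 3), round(alpha_from_loadings(equal), 3))
```

With unequal loadings, alpha falls below omega (the negative bias mentioned above); with equal loadings, the two coincide.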
Yes, I think you could call the correlation that should lie between .20 and .40 the "inter-item correlation mean" or, perhaps more clearly, the mean of the interitem correlations.
A reference for that is the following:
Briggs, S. R., & Cheek, J. M. (1986). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54, 106–148. https://doi.org/10.1111/j.1467-6494.1986.tb00391.x
What you refer to as Cronbach's alpha is more correctly called coefficient alpha, and it is highly dependent on the number of items on a scale. I'm quite suspicious about what coefficient alpha indicates, particularly when there are more than about 20 items, but with only four items (as you have) I suspect that an alpha of .60 is adequate. I'd be more inclined to look at your interitem correlations, which should probably lie between .15 and .50, and preferably be > .30. The corrected item-total correlations should preferably be > .30.
If you need references for what I've written in the last paragraph, feel free to ask.
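If it helps, corrected item-total correlations (each item correlated with the sum of the remaining items, removing the part-whole overlap) can be obtained outside SPSS with a few lines of Python/NumPy; the simulated four-item scale below is purely illustrative:

```python
import numpy as np

def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

rng = np.random.default_rng(7)
trait = rng.normal(size=(300, 1))
items = trait + rng.normal(size=(300, 4))  # four related items

print(np.round(corrected_item_total(items), 2))
```

Subtracting the item from the total before correlating is what makes these "corrected"; uncorrected item-total correlations are inflated because the item is part of its own criterion.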
I agree with Robert Trevethan: how big exactly is your Cronbach's alpha? Note that 3 is the minimum recommended number of items in a scale. If you only just meet the minimum advised corrected item-total correlations as well, your Cronbach's alpha will be far below the accepted minimum, because alpha depends on both the number of items and the item-total correlations. Also, are your corrected item-total correlations comparable, as they should be for internal consistency estimation with Cronbach's alpha to be justified?
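The dependence of alpha on both the number of items and their intercorrelations follows directly from the Spearman-Brown prophecy formula for standardized alpha, α = k·r̄ / (1 + (k − 1)·r̄). A small worked example in Python (the particular k and r̄ values are illustrative, not tied to any specific dataset):

```python
def standardized_alpha(k: int, rbar: float) -> float:
    """Standardized alpha from the number of items k and the mean
    inter-item correlation rbar (Spearman-Brown prophecy formula)."""
    return k * rbar / (1 + (k - 1) * rbar)

# Three items with an adequate mean inter-item correlation of .30:
# alpha stays well below the conventional .70 threshold.
print(round(standardized_alpha(3, 0.30), 3))
# The same mean correlation spread over ten items pushes alpha above .80.
print(round(standardized_alpha(10, 0.30), 3))
```

This is why a short scale with perfectly acceptable item intercorrelations can still return a "disappointing" alpha.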
On the other hand, keep Robert Trevethan's advice in mind and do not adhere to alpha too much. Maybe you could check a test-retest reliability measure?
Finally, what is your tool intended for? Research purposes with a large sample size? Preliminary reports? If so, do not be disappointed by low reliability. Keep validity high (see the "accuracy vs. precision" problem) and acknowledge the potential limitations of your research tool by discussing them properly.
Remember that we all use tools in our research. None of them ideally measures what we intend to measure. We just estimate...