Spearman's r and median values are presented for all scale and frequency variables. Scale and frequency variables are also ranked into quartiles for calculating Cohen's kappa; for categorical variables, only Cohen's kappa is calculated. The two measures together show the similarity of responses to an item on test and retest: large correlation coefficients, defined as 0.5 or greater, indicate high reliability, while the value of kappa identifies the strength of agreement. I hope this helps.
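For concreteness, here is a minimal sketch of both computations in Python, assuming scipy and scikit-learn are available; the data and variable names are made up purely for illustration:

```python
import pandas as pd
from scipy.stats import spearmanr
from sklearn.metrics import cohen_kappa_score

# Made-up test and retest scores for one scale variable (10 respondents).
t1 = pd.Series([12, 15, 9, 20, 17, 11, 14, 18, 10, 16])
t2 = pd.Series([13, 14, 10, 19, 18, 12, 13, 17, 9, 16])

# Spearman's r between the two occasions.
rho, p = spearmanr(t1, t2)
print(f"Spearman's r = {rho:.2f} (p = {p:.3f})")

# Rank each occasion into quartiles, then compute Cohen's kappa
# on the resulting four categories.
q1 = pd.qcut(t1, 4, labels=False)
q2 = pd.qcut(t2, 4, labels=False)
print(f"Cohen's kappa on quartiles = {cohen_kappa_score(q1, q2):.2f}")
```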
Thank you all for your kind replies, but unfortunately I couldn't benefit from the answers above!
I have two sets of data from the same questionnaire, collected from a small sample. I want to check the test-retest reliability of these data, so what would be the best statistical test? I can do a Pearson correlation for each question (1st trial vs. 2nd trial), but is this method accurate, or do I need a test that gives me an overall measure of test-retest reliability?
If you believe that the questionnaire items form a unidimensional scale, then estimate test-retest reliability by computing the correlation (Pearson or Spearman) between total scores from occasion 1 and those from occasion 2. As Thom Baguley indicates, that is the classic method for test-retest reliability estimation.
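If it helps, a minimal Python sketch of that classic total-score approach might look like the following; the person-by-item matrices here are made-up illustrations, not real data:

```python
import pandas as pd
from scipy.stats import pearsonr, spearmanr

# Made-up person-by-item matrices (rows = respondents, columns = items)
# for the two occasions.
occ1 = pd.DataFrame({"item1": [3, 4, 2, 5, 4, 3],
                     "item2": [2, 5, 3, 4, 4, 2],
                     "item3": [4, 4, 2, 5, 3, 3]})
occ2 = pd.DataFrame({"item1": [3, 4, 3, 5, 4, 2],
                     "item2": [2, 4, 3, 4, 5, 3],
                     "item3": [4, 5, 2, 5, 3, 3]})

# Total score per respondent on each occasion.
total1 = occ1.sum(axis=1)
total2 = occ2.sum(axis=1)

# Classic test-retest reliability: correlate the two sets of totals.
r, _ = pearsonr(total1, total2)
rho, _ = spearmanr(total1, total2)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```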
If you are not convinced that a total score makes sense, and instead want to check response consistency at the item level, then you could use Cramer's V (derivable from the chi-square statistic), Cohen's kappa (which corrects for chance-level agreement), a Spearman correlation (if an ordered response scale is used), or, most strictly, the percentage of respondents who give the very same response on both occasions (percent agreement).
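A sketch of those item-level checks, again with made-up data and assuming scipy and scikit-learn are available:

```python
import numpy as np
import pandas as pd
from scipy.stats.contingency import association
from sklearn.metrics import cohen_kappa_score

# Made-up responses to a single item on the two occasions.
item_t1 = np.array([1, 2, 2, 3, 1, 3, 2, 1, 3, 2])
item_t2 = np.array([1, 2, 3, 3, 1, 2, 2, 1, 3, 2])

# Percent agreement: the strictest check, identical response both times.
print(f"Percent agreement = {100 * np.mean(item_t1 == item_t2):.0f}%")

# Cohen's kappa: agreement corrected for chance.
print(f"Cohen's kappa = {cohen_kappa_score(item_t1, item_t2):.2f}")

# Cramer's V, derived from the test-retest contingency table.
table = pd.crosstab(item_t1, item_t2).to_numpy()
print(f"Cramer's V = {association(table, method='cramer'):.2f}")
```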
Having obtained two sets of data from the 1st and 2nd trials, use a Pearson correlation. To do this, code all the data from the first trial in a person-by-item matrix and obtain each individual's sum across all the items. Do the same for the second-trial data. You now have two sets of totals; correlate them to obtain a single reliability estimate for the entire instrument, rather than item-by-item estimates.
However, just as David Morse explained above, you can still correlate by items. This approach implies that, if there are 30 items in the questionnaire, you will obtain 30 reliability estimates, which tell you which items were consistent across the two trials and which were not. I hope this helps.
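As an illustration of that item-by-item approach, a minimal Python sketch could look like the following; the simulated Likert data stand in for the 30 questionnaire items and are not real:

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Made-up person-by-item matrices: 20 respondents x 30 items with
# Likert responses 1-5; the retest is the test plus a little noise.
trial1 = pd.DataFrame(rng.integers(1, 6, size=(20, 30)),
                      columns=[f"item{i + 1}" for i in range(30)])
trial2 = (trial1 + rng.integers(-1, 2, size=trial1.shape)).clip(1, 5)

# One reliability estimate per item: correlate the two trials item by item.
for col in trial1.columns:
    rho, _ = spearmanr(trial1[col], trial2[col])
    print(f"{col}: rho = {rho:.2f}")
```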