I am using the prefmod R package to analyze paired comparison votes.
However, I am not able to measure how reliable the result is, i.e., whether there is really a significant difference in preference between the items in each pair, and consequently how large the difference in preference is between the first and the last items of the resulting sorted list.
Given a model resulting from a pairwise comparison:
The attached R code simulates the votes of 100 voters on 75 items (2775 possible pairs). Each voter votes on 90 pairs. The items (item1, ... item75) have a natural order that follows their numbering. The simulation controls how often the voters make the "right" choice (i.e., they prefer the higher number).
In the model model.09.fit, voters choose the higher item 90% of the time. In the model model.05.fit, voters are very undecided and choose randomly between the two elements of the pair.
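To make the setup concrete, here is a rough base-R sketch of a simulation of this kind. It is a simplified illustration, not the exact attached code: the function name `simulate_votes`, its arguments, the column names, and the objects `votes.09` / `votes.05` are all made up for this example, and it stops before the prefmod model fit.

```r
# Hypothetical helper (illustrative names): simulate paired-comparison votes.
# p_right is the probability that a voter prefers the higher-numbered item.
simulate_votes <- function(n_voters = 100, n_items = 75,
                           pairs_per_voter = 90, p_right = 0.9) {
  # All unordered pairs as rows; column 1 is always the lower-numbered item.
  all_pairs <- t(combn(n_items, 2))   # 2775 rows when n_items = 75
  votes <- lapply(seq_len(n_voters), function(v) {
    # Each voter sees a random subset of distinct pairs.
    idx <- sample(nrow(all_pairs), pairs_per_voter)
    data.frame(voter      = v,
               low        = all_pairs[idx, 1],
               high       = all_pairs[idx, 2],
               # TRUE means the voter picked the higher-numbered item.
               chose_high = runif(pairs_per_voter) < p_right)
  })
  do.call(rbind, votes)
}

set.seed(1)
votes.09 <- simulate_votes(p_right = 0.9)  # decided crowd
votes.05 <- simulate_votes(p_right = 0.5)  # random choice between the two items

# Observed share of "right" choices in each simulated data set.
mean(votes.09$chose_high)
mean(votes.05$chose_high)
```

The share of "right" choices is only available here because the true order is known in the simulation; with real votes there is no such reference, which is why I am looking for a measure based on the fitted models.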
How do I detect/measure the "goodness" of the two models? How do I compare the two models to verify that model.09.fit comes from a more decided crowd?
Thanks and best regards.