I have results produced by a system (written in Java) and results identified by a human for 3 datasets. Using these results, I evaluated performance (from the true positive, true negative, false positive and false negative counts) with Precision, Recall and Accuracy for 3 different models.

My question: is the kappa statistic the correct choice for this analysis?

If not, please suggest another statistical measure, or a tool such as Weka or RapidMiner (with input format and processing steps), for this analysis. Thanks in advance.
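
For reference, here is a minimal sketch of how Cohen's kappa could be computed from the same four counts. It assumes binary labels and treats the human-identified result as the reference. The class and method names (`KappaSketch`, `kappa`) and the example counts are hypothetical, not taken from the system described above.

```java
// Minimal sketch: Cohen's kappa from binary confusion-matrix counts.
// Assumption: the system output and the human labels are treated as two raters.
final class KappaSketch {

    // tp, tn, fp, fn: counts of true/false positives/negatives.
    // Returns NaN when kappa is undefined (no samples, or expected agreement of 1).
    static double kappa(long tp, long tn, long fp, long fn) {
        double n = (double) tp + tn + fp + fn;
        if (n == 0) {
            return Double.NaN;
        }
        // Observed agreement: fraction of items where both raters agree.
        double po = (tp + tn) / n;
        // Expected chance agreement, from the marginal totals of each rater.
        double systemPos = tp + fp;
        double humanPos = tp + fn;
        double systemNeg = tn + fn;
        double humanNeg = tn + fp;
        double pe = (systemPos * humanPos + systemNeg * humanNeg) / (n * n);
        if (pe == 1.0) {
            return Double.NaN;
        }
        return (po - pe) / (1.0 - pe);
    }

    public static void main(String[] args) {
        // Hypothetical counts, for illustration only.
        System.out.println(kappa(20, 15, 5, 10));
    }
}
```

Kappa corrects the observed agreement for the agreement expected by chance, so it can differ noticeably from Accuracy when the classes are imbalanced.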
