How can I find the similarity between two sets of concepts? Say I have a set A and a set B, each containing its own concepts. How can I determine whether these two sets are similar using a mathematical formula?
If the concepts are organized hierarchically in a taxonomy, you can calculate a similarity score by comparing each concept in set A with each concept in set B, measuring the distance between them through their nearest common super-concept. A simple score is, for example, the sum of the smallest distances found for each concept. You could also adapt the method in the following paper:
J. Z. Wang, Z. Du, R. Payattakool, P. S. Yu, and C.-F. Chen, "A new method to measure the semantic similarity of GO terms," Bioinformatics,
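As a rough illustration of the simple summed-distance score (not the method from the paper), here is a minimal Python sketch. The toy taxonomy `PARENT` and the function names are made up for the example, and it assumes every concept has exactly one parent and all concepts share a single root.

```python
# Hypothetical toy taxonomy: each concept maps to its parent (the root maps to None).
PARENT = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}

def ancestors(concept):
    """Return {ancestor: steps up from concept}, including the concept itself at 0."""
    steps, found = 0, {}
    while concept is not None:
        found[concept] = steps
        concept = PARENT[concept]
        steps += 1
    return found

def path_distance(a, b):
    """Number of edges from a up to the nearest common super-concept and down to b."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    return min(up_a[c] + up_b[c] for c in common)

def set_distance(set_a, set_b):
    """Sum, in both directions, of each concept's smallest distance to the other set."""
    a_to_b = sum(min(path_distance(a, b) for b in set_b) for a in set_a)
    b_to_a = sum(min(path_distance(a, b) for a in set_a) for b in set_b)
    return a_to_b + b_to_a
```

A lower value means the sets are closer; identical sets give 0. Summing in both directions is just one choice to make the score symmetric. You could also average instead of sum so that the result does not grow with the size of the sets.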
There are several ways to perform this task. You can build an ontology, or you can compare the sets using different types of distances (average, median, mode, nearest neighbors, centroids) with simple or complex linkages.
Various statistical packages can help with this, such as Octave, MATLAB, or R.
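To give an idea of what the linkage-style comparisons look like, here is a small sketch in plain Python. It assumes each concept has already been turned into a numeric feature vector, which is a step the answer does not cover, and the function names are only illustrative.

```python
from itertools import product
from math import dist
from statistics import mean

def pairwise(set_a, set_b):
    """Euclidean distance between every concept in set_a and every concept in set_b."""
    return [dist(a, b) for a, b in product(set_a, set_b)]

def single_linkage(set_a, set_b):
    """Distance between the two closest concepts (nearest neighbors)."""
    return min(pairwise(set_a, set_b))

def complete_linkage(set_a, set_b):
    """Distance between the two farthest concepts."""
    return max(pairwise(set_a, set_b))

def average_linkage(set_a, set_b):
    """Mean of all pairwise distances."""
    return mean(pairwise(set_a, set_b))

def centroid_distance(set_a, set_b):
    """Distance between the mean vectors of the two sets."""
    centroid = lambda points: [mean(column) for column in zip(*points)]
    return dist(centroid(set_a), centroid(set_b))
```

For example, with `A = [(0.0, 0.0), (2.0, 0.0)]` and `B = [(5.0, 0.0), (7.0, 0.0)]`, the pairwise distances are 5, 7, 3 and 5, so single linkage gives 3, complete linkage gives 7, and both average linkage and the centroid distance give 5.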