What is the current state-of-the-art method for semantic similarity search between two sentences, and how does it compare with word embeddings for synonym search?
Thank you for the answer. I am interested in word embeddings (WE). Do you know of an available French medical dataset that would let me find synonyms or similar sentences for the entries of a small Q&A dataset I have, in order to enrich it?
An undirected graph can be defined by a set of its cliques that covers it. I've assembled such a graph (link below). In it, consecutive nonempty lines form one clique of words (symbols).
http://www.augos.com/ki/semnet_en.html
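A clique listing like this could be turned into an adjacency map roughly as follows. This is only a sketch: it assumes one word per line with blank lines separating cliques, which may not match the actual file layout, and the function name `graph_from_cliques` and the sample words are made up for illustration.

```python
from itertools import combinations

def graph_from_cliques(lines):
    """Build an undirected adjacency map from a clique listing.

    Assumed layout (hypothetical): one word per line, and each run of
    consecutive nonempty lines is one clique.
    """
    adjacency = {}
    clique = []
    # A trailing empty line makes sure the last clique gets processed too.
    for line in list(lines) + [""]:
        word = line.strip()
        if word:
            clique.append(word)
            continue
        members = sorted(set(clique))
        for w in members:
            adjacency.setdefault(w, set())
        # Inside a clique every word is linked to every other word.
        for a, b in combinations(members, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
        clique = []
    return adjacency

# Toy data, not taken from the real network.
sample = ["universe", "cosmos", "world", "",
          "cosmos", "space", "",
          "origin", "birth", "beginning"]
graph = graph_from_cliques(sample)
```

Words that appear in several cliques (here 'cosmos') are what connect the cliques into one network.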
To determine the semantic distance between two words, I recommend a random walk rather than Dijkstra's shortest-path algorithm. A random walk exploits the whole network better, whereas shortest paths go through the shrubbery like neutrinos through butter.
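A random walk of this kind might look like the sketch below. The details are my assumptions, not a fixed recipe: many short walks restarted from the start word, a fixed number of steps per walk, and a seeded generator so the result is repeatable. The toy graph and the name `random_walk_counts` are illustrative only.

```python
import random
from collections import Counter

def random_walk_counts(graph, start, walks=2000, steps=4, seed=0):
    """Count how often each word is visited by short walks from `start`.

    `walks`, `steps` and `seed` are arbitrary illustrative defaults.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(walks):
        node = start
        for _ in range(steps):
            # Sort so the result does not depend on set iteration order.
            neighbours = sorted(graph[node])
            if not neighbours:
                break
            node = rng.choice(neighbours)
            counts[node] += 1
    return counts

# Toy network, not the real one.
graph = {
    "universe": {"cosmos", "world"},
    "cosmos": {"universe", "world", "space"},
    "world": {"universe", "cosmos"},
    "space": {"cosmos", "void"},
    "void": {"space"},
}
counts = random_walk_counts(graph, "universe")
```

A higher visit count means a word is semantically closer to the start word. Unlike a shortest path, the count also rewards words that are reachable by many different routes.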
The random walk assigns a visit count to nearly every word of the graph, so it's possible to apply fuzzy logic. For example, if you are searching for 'universe' 'origin' (non-keywords omitted) and the sentence to be evaluated contains 'cosmos' 'birth', you will find 'cosmos' as the best match for 'universe' and 'birth' as the best match for 'origin'. Note that you make two random walks, one starting from 'universe' and one from 'origin'. You would then take the minimum of the two best visit counts as the evaluation of the match, because here an AND is required. For an OR you would take the maximum.
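The AND/OR evaluation could be sketched like this. The visit counts are invented numbers standing in for the results of two random walks, and `best_match` and `score` are hypothetical helper names.

```python
def best_match(visit_counts, sentence_words):
    """Return (word, count) for the sentence word the walk visited most."""
    pairs = [(w, visit_counts.get(w, 0)) for w in sentence_words]
    return max(pairs, key=lambda pair: pair[1])

def score(query_walks, sentence_words, mode="and"):
    """Combine the best match of each query word: min for AND, max for OR."""
    best = [best_match(counts, sentence_words)[1]
            for counts in query_walks.values()]
    if mode == "and":
        return min(best)
    if mode == "or":
        return max(best)
    raise ValueError("mode must be 'and' or 'or'")

# One walk per query word; the counts are made up for illustration.
walks = {
    "universe": {"cosmos": 950, "world": 700, "birth": 3},
    "origin": {"birth": 620, "beginning": 580, "cosmos": 5},
}
sentence = ["cosmos", "birth"]

and_score = score(walks, sentence, "and")  # limited by the weaker match
or_score = score(walks, sentence, "or")    # driven by the stronger match
```

If the walks differ in length or the words differ a lot in connectivity, it may be worth normalising the counts before comparing them, but that is a design choice beyond what is described above.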