I would like to detect whether a sentence is ambiguous by counting the number of parse trees it has. Could anyone help me get them using either NLTK or the Stanford dependency parser?
Thank you very much for your reply. I have tried the Stanford dependency parser through NLTK. Unfortunately, the library returns only the best tree as output and gives no information about how many different trees are possible for a given sentence.
I would like to know the alternative trees possible for a given sentence.
I doubt you can extract this information from the Stanford Parser, since its authors expose only what users of the parser typically need. The number of possible trees is an internal detail of the Stanford implementation. You can, however, try other options:
1. You can patch the Stanford parser and extract the information you need. The Python code (stanford.py) is only an interface to the Java engine; it starts a Java process in the _execute function. Download the Java source code of the parser and debug it.
2. If you don't want to debug, it is probably easier to compare the parse trees produced by different parsers (Stanford, spaCy, etc.), as others have suggested. Disagreement between parsers can be a hint that the sentence is ambiguous.
3. You don't have to stick to parsers. Maybe you could use taggers for your analysis, for example the Stanford tagger and the Stanford parser (both are available in NLTK as Python interfaces to Java engines). Taggers are generally faster and may be sufficient for your analysis.
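If you can live with a hand-written grammar instead of the Stanford models, counting the trees yourself is not hard. Below is a minimal sketch of a CYK-style chart that counts the parses of a sentence. Everything in it is an assumption made for illustration: the toy grammar (BINARY_RULES), the tiny LEXICON, and the function name count_parses are all hypothetical, and the grammar has to be in Chomsky normal form. It is not what the Stanford parser does internally.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("PP", "P", "NP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
]

# Hypothetical lexicon: word -> categories it can take.
LEXICON = {
    "I": ["NP"],
    "shot": ["V"],
    "an": ["Det"],
    "my": ["Det"],
    "elephant": ["N"],
    "pajamas": ["N"],
    "in": ["P"],
}


def count_parses(tokens, start="S"):
    """Return the number of distinct trees rooted in `start` that cover all tokens."""
    n = len(tokens)
    # chart[(i, j)][A] = number of trees rooted in A that cover tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))

    # Spans of width 1 come straight from the lexicon.
    for i, word in enumerate(tokens):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1)][category] += 1

    # Wider spans: try every split point k and every binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )

    return chart[(0, n)][start]


if __name__ == "__main__":
    sentence = "I shot an elephant in my pajamas".split()
    print(count_parses(sentence))
```

A count above 1 means the sentence is ambiguous with respect to this grammar, and a count of 0 means the grammar cannot parse it at all. The result depends entirely on the grammar, so treat it as a rough signal. If you prefer to stay within NLTK, nltk.ChartParser with a CFG built by nltk.CFG.fromstring enumerates all parses in the same spirit, so you can simply count what its parse method yields.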