Artificial General Intelligence receives less attention than other fields of AI, and even within AGI I am not aware of much work that focuses on Natural Language Understanding. I would be interested in any work in the field.
One way to understand this is to see the elements of deep learning and robotics as the arms and legs of AGI, or more accurately, the eyes (image processing), the ears and mouth (speech and natural language processing), and the body (robotics, especially the new haptic developments). Now, like the Scarecrow in The Wizard of Oz: if I only had a brain.
William Vorhies, one of the enthusiastic authors on the subject, offers a practical example of the problem AGI is meant to solve: “You don't have to look far for examples of the limitations of today's smart systems. Take this exchange with Siri. Me: Siri, who was the first president of the United States? Siri: George Washington. Me: Siri, how old was he? Siri: 234 years. I got this answer because Siri could not recognize from context that I was asking about George and not about the United States. It is in anticipating context that our artificial intelligence systems fall apart.”
This question has been the focus of significant conferences on the topic, and the answers are as numerous as the academic disciplines that study it. But Vorhies says we can take a shortcut and talk about three categories of reasoning: deductive (drawing guaranteed conclusions from general rules), inductive (generalizing rules from observed examples), and abductive (inferring the most plausible explanation for an observation).
A good starting point is:
Helbig, H. (2006). Knowledge representation and the semantics of natural language. Springer.
A very restricted implementation of Helbig's ontology is:
Moldovan, D. I., & Blanco, E. (2012, May). Polaris: Lymba's Semantic Parser. In LREC (pp. 66-72).
Another interesting effort is by Ovchinnikova:
Ovchinnikova, E. (2012). Integration of world knowledge for natural language understanding (Vol. 3). Springer Science & Business Media.
While these are more focused on question answering systems, they integrate multiple levels of processing and knowledge bases in a manner that can be scaled and modified to tackle general language understanding.
Related to your query, I would also suggest: https://medium.com/huggingface/the-best-and-most-current-of-modern-natural-language-processing-5055f409a1d1
Thank you very much for all your suggestions! If you know some of this research, is there a way to measure its progress? For question answering systems I can imagine that progress can be measured by the number of correct answers given. It also seems useful to start by focusing on simple language use (like that of children) and gradually proceed towards more complex language use.
The test setup depends on the system you are building. For most of these systems, though, you can measure precision and recall (as is done for Lymba's parser in the reference above). If you do text generation, you can compute cosine similarity between the generated output and the expected result to see how closely they match. You can also see [1] for how evaluation is done in text summarization, including automated evaluations.
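As a minimal sketch of both measures (plain Python with made-up example strings; a real setup would use proper tokenization and an annotated test set):

```python
from collections import Counter
import math

def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors of two texts."""
    vec_a, vec_b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def precision_recall(predicted, gold):
    """Precision and recall of a set of system answers against gold answers."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical system output vs. expected result:
generated = "George Washington was the first president of the United States"
expected = "The first US president was George Washington"
print(round(cosine_similarity(generated, expected), 2))  # 0.76

# Hypothetical sets of extracted answers vs. gold answers:
print(precision_recall({"washington", "1789"},
                       {"washington", "april", "1789", "president"}))  # (1.0, 0.5)
```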
I would not advise choosing a starting point based on simple language, but rather on the resources that are available and that you can leverage to bootstrap your work (unless you are actually merging it with something like event calculus [2], and even then ...). For example, I am using VerbNet, WordNet, and Wiktionary as resources for one of my projects.
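To give an idea of how cheap it is to leverage such resources, here is a minimal sketch of querying WordNet through NLTK (my choice of toolkit; the word "president" is just an example):

```python
# One-time setup: pip install nltk, then download the WordNet data.
import nltk
nltk.download('wordnet', quiet=True)
from nltk.corpus import wordnet as wn

# List the senses of a word and one step of its hypernym (is-a) hierarchy.
for synset in wn.synsets('president'):
    print(synset.name(), '-', synset.definition())
    for hypernym in synset.hypernyms():
        print('  is-a:', hypernym.name())
```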
[1] Mani, I., House, D., Klein, G., Hirschman, L., Firmin, T., & Sundheim, B. M. (1999, June). The TIPSTER SUMMAC text summarization evaluation. In Ninth Conference of the European Chapter of the Association for Computational Linguistics.
[2] Mueller, E. T. (2014). Commonsense reasoning: an event calculus based approach. Morgan Kaufmann.