I am planning to develop a Q&A system that chats within a limited set of topics. My data is limited, so I will develop semi-supervised and supervised systems. I would also like to know about state-of-the-art systems and approaches that are likely to succeed.
First of all, you have to build a sufficiently large Q&A dataset. After collecting the data, design a system that randomly picks questions from your database, and measure how accurately it selects the matching Q&A from the database. Note that the Q&A pairs should be grouped by topic and sampled randomly by the system.
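One possible reading of that evaluation loop, as a minimal sketch: the toy `QA_DB`, the bag-of-words cosine retriever, and the leave-one-out scoring by topic group are all assumptions made for illustration, not a prescribed method. A sampled question is held out, the most similar remaining question is retrieved, and a hit is counted when both belong to the same topic group.

```python
import math
import random
from collections import Counter

# Hypothetical toy database: (topic group, question, answer). Replace with real data.
QA_DB = [
    ("hours", "when does the store open", "We open at 9 am."),
    ("hours", "what time does the store close", "We close at 6 pm."),
    ("hours", "is the store open on sunday", "Yes, from 10 am."),
    ("shipping", "how long does shipping take", "Three to five days."),
    ("shipping", "how much does shipping cost", "Five dollars."),
    ("shipping", "do you offer free shipping", "On orders over 50 dollars."),
]


def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, db, exclude_index=None):
    """Return the index of the stored question most similar to the query."""
    q = bag_of_words(query)
    best_index, best_score = None, -1.0
    for i, (_, question, _) in enumerate(db):
        if i == exclude_index:
            continue  # hold out the sampled question itself
        score = cosine(q, bag_of_words(question))
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def sampled_accuracy(db, n_samples, seed=0):
    """Randomly sample questions and score topic-level retrieval accuracy."""
    rng = random.Random(seed)
    hits = 0
    for i in rng.sample(range(len(db)), n_samples):
        topic = db[i][0]
        j = retrieve(db[i][1], db, exclude_index=i)
        hits += db[j][0] == topic
    return hits / n_samples


if __name__ == "__main__":
    print("topic accuracy:", sampled_accuracy(QA_DB, n_samples=4))
```

With real data, the retriever would be swapped for whatever model is being evaluated; the sampling and scoring loop can stay the same, which gives a fixed baseline number to compare later models against.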
I aim to obtain a simple dataset of real-life questions (5-6 words per sentence on average) with a limited context. Because I am not a semanticist, the baseline model would be limited to the morphology/syntax interface, but I will definitely move towards pragmatics. For the baseline, I want to use a tagset to annotate the questions and train a deep learning model (depending on the size of the training data).
My first question: should I develop my own tagset or use a standardized one?
Secondly, I know Bayesian learning, but I think this project will require more knowledge of statistical QA; even n-ary systems may not be enough.
@Bahadorezza I would definitely like to read your research.