I would like to identify certain processes in my study, and what leads up to each one, by extracting this information from email messages. To automate this I am currently using regular expression matching, but I would like to explore a more robust approach.
I think a good start would be to identify sample sentences and try to match their semantics against sentences in the email messages.
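To illustrate the kind of sentence matching I mean, here is a minimal sketch in plain Java. It is only a bag-of-words cosine similarity over lowercased tokens, so it compares surface words rather than semantics. The class name `SentenceSimilaritySketch` and the sample sentences are made up for illustration, and it uses neither Stanford CoreNLP nor WordNet.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only: surface-word similarity, not semantic similarity.
public class SentenceSimilaritySketch {

    // Lowercase the sentence, split on runs of non-alphanumeric characters, and count each token.
    static Map<String, Integer> termCounts(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a term-count vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine similarity of the two term-count vectors; returns 0.0 if either sentence has no tokens.
    static double cosine(String first, String second) {
        Map<String, Integer> x = termCounts(first);
        Map<String, Integer> y = termCounts(second);
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : x.entrySet()) {
            dot += (double) entry.getValue() * y.getOrDefault(entry.getKey(), 0);
        }
        double normX = norm(x);
        double normY = norm(y);
        return (normX == 0.0 || normY == 0.0) ? 0.0 : dot / (normX * normY);
    }

    public static void main(String[] args) {
        // Made-up sample sentence and email sentences.
        String sample = "Please approve the purchase request";
        String[] emailSentences = {
            "Could you approve the purchase request today?",
            "The meeting has been moved to Friday."
        };
        for (String sentence : emailSentences) {
            System.out.println(cosine(sample, sentence) + "  " + sentence);
        }
    }
}
```

In an integrated version I would expect the tokenising step to be replaced by a proper tokenizer, and the raw tokens by lemmas or word senses, before the similarity is computed.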
I have gone through and run sample Java code for stemming, POS tagging, lemmatizing and similarity. However, I am now looking for some basic code that integrates all these approaches in one program/project. I am hoping it also includes tokenisation and word sense disambiguation. Importantly, it should include an approach for calculating sentence similarity. I am sure someone has done this.
I am particularly looking for a Java, Stanford CoreNLP and WordNet approach that integrates tokenising, stemming, POS tagging, lemmatizing, word sense disambiguation and a way of calculating sentence similarity.