I need to extract all the words that follow the pattern "/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ /\\w|[ ]|[,]/" up to the end of the sentence, but I am getting unexpected output:

In the second sentence I get "Each campus has a", where the correct output is "Each campus has a different name, address, distance to the city center and the only bus running to the campus".

In the third sentence I get "Each faculty has a", where the correct output is "Each faculty has a name, dean and building".

In the fourth sentence the pattern does not match at all; the correct output is "each problem has solution, God walling".

I would appreciate any help in solving this problem. I think my pattern has not been written correctly. My code is below:

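import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.swing.JOptionPane;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.tokensregex.TokenSequenceMatcher;
import edu.stanford.nlp.ling.tokensregex.TokenSequencePattern;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

String file = "ABC University is a large institution with several campuses. Each campus has a different name, address, distance to the city center and the only bus running to the campus.  Each faculty has a name, dean and building. this just for test each problem has soluation, God walling.";

// Build the annotation pipeline and run it over the text
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation(file);
pipeline.annotate(document);

List<CoreLabel> tokens = new ArrayList<>();
List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);

for (CoreMap sentence : sentences) {
    // Collect the tokens of the current sentence
    for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class))
        tokens.add(token);

    // This is the pattern that produces the unexpected output described above
    TokenSequencePattern pattern = TokenSequencePattern.compile("/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ /\\w|[ ]|[,]/");
    TokenSequenceMatcher matcher = pattern.getMatcher(tokens);

    // rootPane comes from the enclosing Swing frame
    while (matcher.find()) {
        JOptionPane.showMessageDialog(rootPane, matcher.group());
    }

    // Reset the token list before the next sentence
    tokens.clear();
}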
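
One possible explanation, offered as a sketch rather than a confirmed diagnosis: a /regex/ element in the pattern is tested against the text of a single token, and, assuming the regex must cover the whole token, \w|[ ]|[,] accepts only one-character tokens such as "a" or ",". That would account for the match ending at "a" in the second and third sentences and failing on the longer word in the fourth. The example below uses plain java.util.regex, not TokensRegex, to illustrate the idea; the class TrailingElementDemo and its methods tokenMatches and extract are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingElementDemo {
    // The regex used in the last element of the pattern from the question
    static final Pattern LAST_ELEMENT = Pattern.compile("\\w|[ ]|[,]");

    // Assumption: a token matches only if the regex covers its entire text
    static boolean tokenMatches(String token) {
        return LAST_ELEMENT.matcher(token).matches();
    }

    // String-level analogue of the intended extraction:
    // "each <word> has|have" followed by everything up to the next period
    static final Pattern TO_SENTENCE_END =
            Pattern.compile("[Ee]ach \\w+ (?:has|have) [^.]*");

    static List<String> extract(String text) {
        List<String> results = new ArrayList<>();
        Matcher m = TO_SENTENCE_END.matcher(text);
        while (m.find()) {
            results.add(m.group().trim());
        }
        return results;
    }

    public static void main(String[] args) {
        // Single-character tokens pass, longer words do not
        for (String token : new String[] {"a", ",", "different", "soluation"}) {
            System.out.println(token + " -> " + tokenMatches(token));
        }
        String text = "Each faculty has a name, dean and building. "
                + "this just for test each problem has soluation, God walling.";
        for (String match : extract(text)) {
            System.out.println(match);
        }
    }
}
```

Within TokensRegex itself, the analogous change may be to end the pattern with the any-token element and a quantifier, for example []*, instead of a single-token regex. That is untested here, and it would probably also capture the sentence-final period.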
