Can anyone help with building a pattern using Stanford TokensRegex?

22 July 2014

I need to match the pattern "/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ /\\w|[ ]|[,]/" and extract all the words that follow it, up to the end of the sentence, but I am getting unexpected output:

In the second sentence I get "Each campus has a", where the expected output is "Each campus has a different name, address, distance to the city center and the only bus running to the campus".

In the third sentence I get "Each faculty has a", where the expected output is "Each faculty has a name, dean and building".

In the fourth sentence the pattern does not match at all; the expected output is "each problem has solution, God walling".

I would appreciate any help in solving this problem. I suspect that my pattern is not written correctly. My code is below:

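import java.util.List;
import java.util.Properties;
import javax.swing.JOptionPane;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.ling.tokensregex.TokenSequenceMatcher;
import edu.stanford.nlp.ling.tokensregex.TokenSequencePattern;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.util.CoreMap;

// Fragment: runs inside a method of a Swing frame, which supplies rootPane.
String file = "ABC University is a large institution with several campuses. Each campus has a different name, address, distance to the city center and the only bus running to the campus. Each faculty has a name, dean and building. this just for test each problem has soluation, God walling.";

Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

Annotation document = new Annotation(file);
pipeline.annotate(document);

// Compile the pattern once, outside the loop (this is the pattern in question, unchanged).
TokenSequencePattern pattern = TokenSequencePattern.compile("/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ /\\w|[ ]|[,]/");

List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
for (CoreMap sentence : sentences) {
    // Match against the tokens of one sentence at a time.
    List<CoreLabel> tokens = sentence.get(CoreAnnotations.TokensAnnotation.class);
    TokenSequenceMatcher matcher = pattern.getMatcher(tokens);
    while (matcher.find()) {
        // Show each matched token sequence in a dialog.
        JOptionPane.showMessageDialog(rootPane, matcher.group());
    }
}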
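A possible explanation for the output described above, offered as a sketch rather than a confirmed diagnosis: in TokensRegex each /regex/ element is matched against one whole token, so the last element /\\w|[ ]|[,]/ can only match a token that is a single word character or a single comma (tokens do not contain spaces). The stdlib-only Java below does not call CoreNLP; it just applies the same string regex to a few tokens from the sentences, assuming whole-token matching. The class name WholeTokenMatchDemo and the method matchesWholeToken are made up for this illustration.

```java
import java.util.regex.Pattern;

// Illustration only: plain java.util.regex, no CoreNLP involved.
class WholeTokenMatchDemo {
    // The string regex used as the last element of the pattern in the question.
    static final Pattern LAST_ELEMENT = Pattern.compile("\\w|[ ]|[,]");

    // Assumption: a /regex/ element must match the entire token text,
    // which is what Matcher.matches() checks.
    static boolean matchesWholeToken(String token) {
        return LAST_ELEMENT.matcher(token).matches();
    }

    public static void main(String[] args) {
        // Tokens that follow "has" in sentences 2 to 4 of the test text.
        String[] tokens = {"a", "different", "name", ",", "soluation"};
        for (String token : tokens) {
            System.out.println(token + " -> " + matchesWholeToken(token));
        }
    }
}
```

Under that assumption, "a" matches while "different" and "soluation" do not, which would explain why the match stops at "Each campus has a" and why the fourth sentence is not matched at all. One option to try is replacing the last element with []* (any token, repeated), giving "/[Ee]ach/ ([tag:NN]|[tag:NNS]) /has|have/ []*". Because the matcher runs over one sentence at a time, the greedy []* should extend to the end of that sentence, probably including the final period, which would then need to be trimmed. This suggestion has not been run against the text above.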
