Hello!

I am using MADAMIRA to identify parts of speech in Arabic data. However, I noticed that, because of tokenization, adverbs are never identified as such and are instead labelled as nouns or adjectives. For example, the word بسطحية is tokenized into ب+سطحية, which causes سطحية to be labelled as a noun.
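
To make the problem concrete, here is a toy Python sketch of what I think is happening. It is not MADAMIRA's code or API: the function names, the one-item proclitic list, and the tag names are all made up for illustration. The only point is that once the prefix is split off, the stem gets a label of its own and the adverbial reading of the whole word is lost.

```python
# Toy illustration only: this is NOT MADAMIRA's algorithm or API.
# Hypothetical list of proclitics that a segmenter might split off.
PROCLITICS = ("ب",)  # the preposition bi-


def split_proclitic(word):
    """Split a leading proclitic from the word, if one is present."""
    for clitic in PROCLITICS:
        if word.startswith(clitic) and len(word) > len(clitic):
            return clitic, word[len(clitic):]
    return "", word


def toy_tag(word):
    """Tag each segment separately, the way a tokenizing tagger would."""
    clitic, stem = split_proclitic(word)
    tags = []
    if clitic:
        tags.append((clitic, "prep"))
    # Assumption for the sketch: the stem is always labelled as a noun.
    tags.append((stem, "noun"))
    return tags


print(toy_tag("بسطحية"))
```

What I would like instead is a single label for the whole word بسطحية.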

Is there any way I can turn off tokenization so that I get more accurate POS tagging?
