Hello!
I am using MADAMIRA to tag parts of speech in Arabic data. However, I noticed that, because of tokenization, adverbs are never identified and are instead tagged as nouns or adjectives. For example, the word بسطحية is tokenized into ب+سطحية, which causes سطحية to be labelled as a noun.
Is there any way I can turn off tokenization so that I get more accurate POS tagging?