Tokenization using MADAMIRA erases adverbs?

23 July 2019

Hello!

I am using MADAMIRA to identify parts of speech in Arabic data. However, I noticed that, because of tokenization, adverbs are never identified and are instead labelled as nouns or adjectives. For example, the word بسطحية is tokenized into ب+سطحية, which causes سطحية to be labelled as a noun.

Is there any way I can turn off tokenization so that I get more accurate POS tagging?

Mai Magdy M. Sleim

Hi Reem, I am not sure whether this will help, as I haven't used any of these tools. But what if you add a rule to the program so that whenever the preposition ب is followed by a noun, the pair is tagged as an adverb? I guess there would be no exceptions. In any case, manual revision is crucial, and you can re-evaluate the results.
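The rule suggested above could be applied as a post-processing step on the tagger output. The following is only a minimal sketch: the data layout (each word as a list of `(segment, tag)` pairs), the tag names `prep`, `noun`, `adj`, `adv`, and the function name `retag_bi_adverbs` are all assumptions made for illustration and are not MADAMIRA's actual output format or API. Adjectives are included by default because the question mentions that adverbs also end up labelled as adjectives; pass `stem_tags={"noun"}` to apply the rule to nouns only.

```python
from typing import List, Set, Tuple

# Hypothetical representation: one (surface form, POS tag) pair per segment.
Segment = Tuple[str, str]


def retag_bi_adverbs(
    words: List[List[Segment]],
    stem_tags: Set[str] = frozenset({"noun", "adj"}),
) -> List[List[Segment]]:
    """Merge a ب proclitic with a following stem and tag the whole word 'adv'.

    Assumes the tag names 'prep' and 'adv' and a '+' clitic marker; adapt
    these to whatever the tagger actually emits.
    """
    result = []
    for segments in words:
        is_bi_pattern = (
            len(segments) == 2
            and segments[0][0].rstrip("+") == "ب"   # proclitic is ب
            and segments[0][1] == "prep"            # tagged as a preposition
            and segments[1][1] in stem_tags         # stem is a noun (or adjective)
        )
        if is_bi_pattern:
            # Rejoin the clitic and the stem into the original surface word.
            merged = segments[0][0].rstrip("+") + segments[1][0]
            result.append([(merged, "adv")])
        else:
            # Leave every other word untouched.
            result.append(list(segments))
    return result


if __name__ == "__main__":
    tagged = [[("ب+", "prep"), ("سطحية", "noun")], [("كتاب", "noun")]]
    for word in retag_bi_adverbs(tagged):
        print(word)
```

Because a rule like this will over-generate (not every ب + noun sequence is adverbial), the retagged words are the ones worth checking first during manual revision.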
