I have already implemented the preprocessing steps for stopwords, punctuation, etc., but the model gives low validation accuracy (around 0.7), so I wonder whether also implementing stemming and lemmatization would increase it.
Of course, I would have to do it manually, since I'm working on the Tunisian dialect and there are no libraries already available for it.
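For context, here is a minimal sketch of the kind of preprocessing described above. It assumes a hand-compiled stopword list, since no standard resource covers the dialect; TUNISIAN_STOPWORDS and its entries are hypothetical placeholders:

```python
import string

# Hypothetical placeholder: no standard stopword list exists for the
# Tunisian dialect, so this set would have to be compiled manually.
TUNISIAN_STOPWORDS = {"w", "fi", "el", "ya"}  # illustrative entries only

def preprocess(text: str) -> list[str]:
    # Strip punctuation (ASCII plus common Arabic punctuation marks).
    text = text.translate(str.maketrans("", "", string.punctuation + "،؛؟"))
    # Lowercase Latin script and split on whitespace.
    tokens = text.lower().split()
    # Drop stopwords.
    return [t for t in tokens if t not in TUNISIAN_STOPWORDS]

print(preprocess("el youm, famma barcha khedma!"))
# ['youm', 'famma', 'barcha', 'khedma']
```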
I don't think it's a good idea to implement a lemmatizer or a stemmer for the Tunisian dialect, or for any other dialect, since most of the text doesn't respect linguistic rules. The Maghrebi Arabic dialects are a mixture of Arabic, French, and other languages. You could try to collect a large amount of data and train a BERT model from scratch.
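To make that concrete, here is a rough sketch of pretraining a BERT masked-language model from scratch with the Hugging Face transformers, tokenizers, and datasets libraries. The corpus path corpus.txt and all hyperparameters are illustrative assumptions, not a tested recipe:

```python
import os
from tokenizers import BertWordPieceTokenizer
from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

# 1. Train a WordPiece vocabulary directly on the raw dialect corpus
#    ("corpus.txt" is a hypothetical one-sentence-per-line text file).
os.makedirs("tn-bert", exist_ok=True)
wp = BertWordPieceTokenizer()
wp.train(files=["corpus.txt"], vocab_size=30_000)
wp.save_model("tn-bert")  # writes tn-bert/vocab.txt
tokenizer = BertTokenizerFast("tn-bert/vocab.txt")

# 2. Build a small BERT (illustrative sizes) and tokenize the corpus.
config = BertConfig(vocab_size=30_000, num_hidden_layers=6,
                    hidden_size=512, num_attention_heads=8)
model = BertForMaskedLM(config)

dataset = load_dataset("text", data_files="corpus.txt")["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

# 3. Masked-language-modeling pretraining.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)
trainer = Trainer(model=model,
                  args=TrainingArguments("tn-bert", num_train_epochs=3,
                                         per_device_train_batch_size=32),
                  data_collator=collator, train_dataset=dataset)
trainer.train()
```

In practice the corpus size and model size matter far more than these settings; with a small corpus a model this size will not learn useful representations.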
No, training a BERT model will provide you with contextualized word representations, which could lead to good results when it is fine-tuned for other tasks. However, you will need a huge amount of data to build the model.
Semeh Ben Salem They are similar, except that lemmatization keeps word-related information
such as PoS tags. It is difficult to answer the question without experimentation, as the dataset and the specific task to be solved need to be considered. I advise you to try both and then decide. If you cannot afford the computational cost of that, I advise you to use lemmatization, as in my experience it has more impact. Moreover, it is more widely used by NLP researchers. Once again, you should note that everything depends on your data and your specific task.
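To illustrate the difference (in English, since no stemmer or lemmatizer exists for the Tunisian dialect), here is a quick NLTK sketch: the stemmer chops suffixes by rule, while the lemmatizer maps each word to a dictionary form and can exploit its PoS tag:

```python
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("wordnet", quiet=True)

stem = PorterStemmer().stem
lemma = WordNetLemmatizer().lemmatize

print(stem("studies"), lemma("studies"))           # studi study
print(stem("better"),  lemma("better", pos="a"))   # better good
print(stem("running"), lemma("running", pos="v"))  # run run
```

Note how "studi" is not a real word (stemming only truncates), while the lemmatizer, given the adjective tag, can map "better" all the way to "good".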