I wonder whether word embeddings such as fastText or word2vec really improve classification results. Are there any studies that report a significant improvement (>5%) in the classification of natural-language texts?
What is still not clear to me is why these studies report results with word embeddings (e.g., CBOW) when plain BOW might already be sufficient. Why don't they use BOW as a baseline?
Yes, I am aware of the context argument. But still, why should I accept the overhead of word embeddings if they don't improve on the BOW results? To take context into account, it is also possible to use grammar rules.
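For reference, the kind of BOW baseline I have in mind is simply raw term counts fed into a linear classifier, with no embeddings at all. A minimal sketch with scikit-learn (the toy texts and labels are only placeholders for illustration):

```python
# Minimal plain-BOW baseline: raw term counts + linear classifier, no embeddings.
# The tiny toy corpus below is a placeholder for a real labelled dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "the team won the match in extra time",
    "a late goal decided the game",
    "heavy rain is expected over the weekend",
    "temperatures will drop sharply tonight",
]
labels = ["sports", "sports", "weather", "weather"]

bow_baseline = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
bow_baseline.fit(texts, labels)
print(bow_baseline.predict(["rain delayed the match"]))
```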
I recently worked on a research project to classify the "risk type" in construction documents. The paper is under review; once it is published, I can share it.
Basically, I compared the following NLP models:
1. TF-IDF + (support vector machine (SVM) trained with stochastic gradient descent (SGD), logistic regression (LR), and Bernoulli naïve Bayes (BNB))
2. Word2vec + (SVM, LR, BNB)
3. fastText + (SVM, LR, BNB)
4. Bidirectional Encoder Representations from Transformers (BERT)
The BERT model significantly outperformed the other NLP models.
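To give a rough idea of how the first two model families can be wired up (this is only a sketch, not the exact configuration from the paper; the toy corpus, the "safety"/"financial" risk labels, and the hyperparameters are placeholders):

```python
# Sketch: compare TF-IDF features vs. averaged word2vec features,
# both fed into a linear SVM trained with SGD (hinge loss).
# Scores on this toy corpus are meaningless; it only shows the wiring.
import numpy as np
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

texts = [
    "crane collapsed on the construction site",
    "worker injured by a falling scaffold",
    "scaffold inspection failed before the lift",
    "harness missing during roof work",
    "contract penalty for late material delivery",
    "budget overrun caused by supplier price increase",
    "subcontractor invoice dispute delayed payment",
    "currency fluctuation raised steel costs",
]
labels = ["safety"] * 4 + ["financial"] * 4

# 1) TF-IDF + SVM(SGD)
tfidf_svm = make_pipeline(TfidfVectorizer(), SGDClassifier(loss="hinge", random_state=0))
print("TF-IDF + SVM(SGD):", cross_val_score(tfidf_svm, texts, labels, cv=2).mean())

# 2) word2vec + SVM(SGD): train embeddings on the corpus, then represent
#    each document as the average of its word vectors.
tokenized = [t.split() for t in texts]
w2v = Word2Vec(sentences=tokenized, vector_size=50, window=5, min_count=1,
               seed=0, workers=1)

def doc_vector(tokens):
    vecs = [w2v.wv[w] for w in tokens if w in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

X_w2v = np.vstack([doc_vector(t) for t in tokenized])
w2v_svm = SGDClassifier(loss="hinge", random_state=0)
print("word2vec + SVM(SGD):", cross_val_score(w2v_svm, X_w2v, labels, cv=2).mean())
```

The same feature matrices can be reused with LR and BNB by swapping the classifier, and pretrained fastText vectors can replace the locally trained word2vec model in the averaging step.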