Hello everyone,

I'm looking for a way to classify very short pieces of text (at most 18 words) into 9 different classes. As a training set, I have a data frame in which one column holds each sentence and a second column holds its class. The data frame has 99,634 rows. I plan to use R for this task.

Do I have to create one corpus per class?

What's the correct strategy for building the term-document matrix?

This is my first experience with NLP (text categorization). Do you have any advice?

Many thanks in advance,

Best regards
