Hello everyone,
I'm looking for a way to classify very short pieces of text (18 words at most) into 9 classes. As a training set, I have a data frame with one sentence per row (first column) and its class (second column). The data frame has 99,634 rows. I plan to use R for this task.
Do I have to create one Corpus per class?
What's the right strategy for building the Term Document Matrix?
This is my first experience with NLP (text categorization). Do you have any advice?
Many thanks in advance,
Best regards