I am using GloVe for the first time, and I've discovered that some words appear in the vocabulary both on their own and with punctuation marks attached. For example, all of the following tokens are present in GloVe:
What is the best practice? Should I separate the words from the symbols and use them as two tokens, or keep them together as a single token?
Moreover, I've found that some common expressions are present too:
Same question: should I split these expressions into different tokens or use them as a single one?
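
To make the two options concrete, here is a minimal sketch of what I am comparing. The vocabulary is a made-up toy set standing in for the real GloVe vocabulary, and the function names (`split_punct`, `option_split`, `option_keep`) are hypothetical helpers of my own, not part of any GloVe tooling.

```python
import re

# Toy stand-in for the GloVe vocabulary (hypothetical entries,
# not copied from the real embedding file).
toy_vocab = {"hello", "hello!", "!", "new", "york", "new-york"}


def split_punct(token):
    """Split a token into runs of word characters and single punctuation symbols."""
    return re.findall(r"\w+|[^\w\s]", token)


def option_split(token):
    """Option 1: always separate words from symbols."""
    return split_punct(token)


def option_keep(token, vocab):
    """Option 2: keep the token whole if the vocabulary has it, otherwise split it."""
    return [token] if token in vocab else split_punct(token)


print(option_split("hello!"))            # always split
print(option_keep("hello!", toy_vocab))  # kept whole, because it is in the toy vocabulary
print(option_keep("bye!", toy_vocab))    # not in the toy vocabulary, so it is split
```

With option 2, a token is only kept intact when an embedding exists for it, so anything unknown still falls back to the split form. I don't know whether that is the better choice, which is what I am asking.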