I want to normalize the frequencies in two linguistic corpora, but I am not sure whether to normalize per 10,000 words or per 1,000,000 words. How should I choose between the two? And why are 10,000 and 1,000,000 commonly used, but not 100,000, for example?
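To make the question concrete, here is a minimal sketch of what I mean by normalizing, assuming the usual definition (raw count divided by corpus size, multiplied by the base). The function name `normalized_frequency` and all the counts and corpus sizes are hypothetical, made up only for illustration.

```python
def normalized_frequency(raw_count: int, corpus_size: int, base: int) -> float:
    """Return occurrences per `base` words (assumed definition of normalization)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    # Multiply before dividing to keep the arithmetic as exact as possible.
    return raw_count * base / corpus_size


# Hypothetical figures: one word form counted in two corpora of different sizes.
corpus_a = {"raw_count": 150, "corpus_size": 50_000}
corpus_b = {"raw_count": 900, "corpus_size": 400_000}

for base in (10_000, 100_000, 1_000_000):
    freq_a = normalized_frequency(base=base, **corpus_a)
    freq_b = normalized_frequency(base=base, **corpus_b)
    # The ratio between the two corpora does not depend on the base.
    print(f"per {base:>9,} words: A = {freq_a:g}, B = {freq_b:g}, A/B = {freq_a / freq_b:.3f}")
```

As far as I can tell, changing the base only rescales both figures by the same factor, so the comparison between the corpora stays the same; what I am unsure about is which base is appropriate to report.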

Thank you in advance.
