I want to do a discourse analysis of text from Twitter. The full dataset was too large to work with, so I narrowed it down by drawing a random sample, but the sample is still too big. Can I run a word frequency count on the sampled data to find the most frequent words, and then randomly select a smaller set of samples for detailed analysis?
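
To make the procedure I have in mind concrete, here is a minimal Python sketch. It assumes the sampled tweets are already loaded as a list of strings, and it reads "randomly select the samples" as drawing at random from the tweets that contain at least one of the top words, which is only one possible interpretation. The function names, the stopword set, the regex tokenizer, and the parameters `k`, `n`, and `seed` are all hypothetical choices for illustration.

```python
import random
import re
from collections import Counter

# Hypothetical tokenizer: lowercase alphabetic words and apostrophes only.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split a tweet into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def top_words(tweets, k=10, stopwords=frozenset()):
    """Return the k most frequent non-stopword tokens across all tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(w for w in tokenize(text) if w not in stopwords)
    return [word for word, _ in counts.most_common(k)]


def subsample_by_keywords(tweets, keywords, n, seed=0):
    """Randomly draw up to n tweets that contain at least one keyword."""
    keyset = set(keywords)
    matching = [t for t in tweets if keyset & set(tokenize(t))]
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(matching, min(n, len(matching)))
```

Usage would look like `keywords = top_words(sampled_tweets, k=10, stopwords=my_stopwords)` followed by `subset = subsample_by_keywords(sampled_tweets, keywords, n=200)`. One thing to note is that the second step is no longer a random sample of all tweets: it is a random sample of the tweets that mention the frequent words, so it should be described that way in the write-up.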