For beginners, you can try exploring your data with Voyant Tools. Just upload your document and see what happens.
https://voyant-tools.org/
For me, the right NLP approach depends on your goal. I mostly work on semantic networks and frequency analysis, so tagging is not needed. But if you want to try sentiment analysis or something similar, you may need to learn R or Python.
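If you do go the Python route, here is a minimal sketch of what frequency analysis and a very simple co-occurrence network can look like without any tagging. The function names (`tokenize`, `word_frequencies`, `cooccurrence_edges`) and the sample text are made up for illustration, and treating "appears in the same sentence" as a link is only one of several possible ways to define an edge.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs in the whole text."""
    return Counter(tokenize(text))


def cooccurrence_edges(text):
    """Count word pairs that appear together in the same sentence.

    Each pair is stored in alphabetical order, so ("cat", "the") and
    ("the", "cat") are counted as the same edge.
    """
    edges = Counter()
    for sentence in re.split(r"[.!?]+", text):
        words = sorted(set(tokenize(sentence)))
        edges.update(combinations(words, 2))
    return edges


# Hypothetical sample text, just to show the shape of the output.
sample = "The cat sat. The cat ran. A dog ran."
freqs = word_frequencies(sample)
edges = cooccurrence_edges(sample)

print(freqs.most_common(3))
print(edges.most_common(3))
```

The edge counts can be exported as a weighted edge list and loaded into whatever network tool you prefer.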
I am doing this from scratch, and a human-based linguistic corpus should be tailored to the task(s). Processing the data involves a few stages. These can include removing real-world identifiers to protect the privacy of subjects (as required by GDPR), codification, annotation, etc. A corpus (the word is singular in Latin) is generally processed with a qualitative model, while corpora (plural) may call for quantitative or mixed methods.
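As a rough illustration of the identifier-removal stage, the sketch below replaces e-mail addresses, phone-like numbers, and a list of known participant names with placeholder codes. Everything here is an assumption for the example: the `pseudonymize` function, the two regular expressions, the placeholder labels, and the sample sentence are hypothetical, and simple patterns like these will miss many identifiers, so the output still needs a manual check.

```python
import re

# Deliberately simple patterns (assumptions for this sketch, not a standard).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def pseudonymize(text, known_names):
    """Replace e-mails, phone-like numbers and known names with codes.

    known_names is a list of participant names supplied by the researcher;
    each one is mapped to a stable code such as [SPEAKER_1].
    """
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    for i, name in enumerate(known_names, start=1):
        text = re.sub(rf"\b{re.escape(name)}\b", f"[SPEAKER_{i}]", text)
    return text


# Hypothetical transcript line, invented for the example.
raw = "Anna Kowalski (anna.k@example.org, +48 600 123 456) said hello."
clean = pseudonymize(raw, ["Anna Kowalski"])
print(clean)
```

Keeping the same code for the same speaker across the whole corpus makes later codification and annotation easier, because turns can still be grouped by participant.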