21 September 2020

Hi, I am quite a newbie with Python, and I need to run some text mining analysis on 100+ literary texts in German, which I have stored as individual txt files in a folder. The files are named following the scheme author_title_date (for example "schnitzler_else_1924.txt").

I was thinking of using the Python packages NLTK and/or spaCy, and maybe the Stanford NER, as I need to analyse the sentiment of the different texts, identify specific locations, and relate the sentiment to those locations.

I am stuck on a very preliminary step though: how do I import all the text files from the folder into a single corpus/vector corpus that retains the metadata encoded in the file names? I could produce that relatively easily in R with the tm package, but I can't find a way to do it in Python. Thanks!
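To make the goal concrete, here is a minimal sketch of one way the import could work, using only the standard library. It is not a definitive answer. The function name `load_corpus`, the dictionary keys, and the UTF-8 encoding are assumptions made for illustration; the only thing taken from the setup above is the author_title_date naming scheme.

```python
from pathlib import Path


def load_corpus(folder, encoding="utf-8"):
    """Read every .txt file in `folder` into a list of dicts.

    Each dict holds the metadata parsed from the file name
    (author_title_date.txt) plus the full text of the file.
    The encoding is an assumption; adjust it if the files differ.
    """
    corpus = []
    for path in sorted(Path(folder).glob("*.txt")):
        parts = path.stem.split("_")
        if len(parts) < 3:
            # File name does not follow author_title_date; skip it.
            continue
        corpus.append(
            {
                "author": parts[0],
                # Re-join the middle parts in case the title contains underscores.
                "title": "_".join(parts[1:-1]),
                "date": parts[-1],
                "filename": path.name,
                "text": path.read_text(encoding=encoding),
            }
        )
    return corpus
```

Calling `load_corpus("texts")`, where "texts" is a hypothetical folder name, would return one record per file. Keeping the metadata next to the text in plain dictionaries means the list can later be passed to `pandas.DataFrame(...)` or looped over to feed each `text` to NLTK or spaCy.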
