I want to find out to what extent annual reports contain textual information that corresponds to a reference document, such as the Paris Agreement on carbon dioxide emissions. The background question is to what extent companies follow such a reference document in their disclosures. I thought of working with a document-feature matrix and then applying a dictionary derived from the reference document. This can be done in quanteda. The problem is that constructing such a dictionary is quite arbitrary: which terms to include, and how many, is a judgment call. Is there not a better way?
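
To make the dictionary idea concrete, here is a minimal sketch in plain Python (not quanteda). Everything in it is an assumption for illustration: the reference text and the two reports are made-up toy strings rather than real excerpts, the stopword list is a hypothetical short one, and `build_dictionary`, `dictionary_share`, and the cutoff `top_k` are names invented here. It takes the most frequent non-stopword terms of the reference text as the dictionary and reports the share of each report's tokens that match.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword list (an assumption).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on", "for",
             "is", "are", "we", "our", "by", "with", "this", "that"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_dictionary(reference_text, top_k):
    """Return the top_k most frequent non-stopword terms of the reference text.

    The cutoff top_k is an arbitrary choice, which is exactly the
    concern raised in the question.
    """
    counts = Counter(t for t in tokenize(reference_text) if t not in STOPWORDS)
    return {term for term, _ in counts.most_common(top_k)}


def dictionary_share(report_text, dictionary):
    """Return the share of a report's tokens that match a dictionary term."""
    tokens = tokenize(report_text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in dictionary)
    return hits / len(tokens)


# Toy stand-in for the reference document (not a real excerpt).
reference = (
    "Parties aim to reduce emissions. "
    "Emissions of greenhouse gases must peak soon. "
    "Parties shall report emissions and climate action. "
    "Climate finance supports parties."
)

# Toy stand-ins for annual reports (not real excerpts).
reports = {
    "report_a": "We cut emissions and manage climate risk.",
    "report_b": "Revenue grew and margins improved.",
}

dictionary = build_dictionary(reference, top_k=3)
for name, text in reports.items():
    print(name, round(dictionary_share(text, dictionary), 3))
```

Changing `top_k` or the stopword list changes which terms enter the dictionary and therefore every score, which shows where the arbitrariness comes in.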
