One way is through natural language processing (NLP). First, tokenize all the words in the document and remove stop words. Then group the remaining words by topic, using groupings learned through dictionary learning or another model-based technique, and calculate the term frequencies (TF) for each group. From there you can assign scores or weights to decide which of the predefined classes the document belongs to, or you can treat the task as a standard machine learning classification problem. A lot of related work has already been done in this field; one simple starting point is the article "A practical guide to text mining with topic extraction".
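As a rough illustration of the tokenize, filter, and score steps described above, here is a minimal sketch in plain Python. The stop-word list, the topic word sets, and the function names are all made up for this example; in practice the topic vocabulary would come from a learned model or a curated dictionary, and the stop words from a proper list.

```python
import re
from collections import Counter

# Hypothetical stop-word list and topic lexicon, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for", "with"}
TOPIC_WORDS = {
    "sports": {"match", "team", "goal", "player", "league"},
    "finance": {"market", "stock", "bank", "price", "investor"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def topic_scores(text):
    """Return, per topic, the share of non-stop-word tokens found in that topic's word set."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    if not tokens:
        return {topic: 0.0 for topic in TOPIC_WORDS}
    counts = Counter(tokens)
    return {
        topic: sum(counts[w] for w in words) / len(tokens)
        for topic, words in TOPIC_WORDS.items()
    }


def classify(text):
    """Pick the highest-scoring topic, or None if no topic word appears."""
    scores = topic_scores(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


doc = "The team scored a goal in the match and the player celebrated."
print(topic_scores(doc))
print(classify(doc))
```

This is only the scoring variant. For the classification variant, the same token counts would instead be fed as features to a trained classifier.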