I have a reference document template that defines a structure of headings, plus other information such as a header (names) and a footer (the category of the document). I need to compare other documents against this template to check whether they follow the same structure, and classify them by their category (present in the footer of the document), using machine learning or NLP in Python.
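
To make the task concrete, here is a minimal rule-based sketch of what I have in mind, not an ML solution. It assumes the headings and footer text have already been extracted from each file into plain Python values (the extraction step depends on the file format, which is not fixed here). The names `structure_similarity`, `check_document`, the dictionary keys, and the 0.8 threshold are all hypothetical choices for illustration.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def structure_similarity(template_headings, doc_headings) -> float:
    """Return a 0..1 ratio of how closely two ordered heading lists match."""
    a = [normalize(h) for h in template_headings]
    b = [normalize(h) for h in doc_headings]
    # SequenceMatcher works on any sequence of hashable items, so whole
    # headings are compared as units and their order matters.
    return SequenceMatcher(None, a, b).ratio()


def check_document(doc, template_headings, threshold=0.8):
    """Compare one document against the template and read its category.

    `doc` is assumed to be a dict with a "headings" list and a "footer"
    string; the threshold is an arbitrary starting point to tune.
    """
    score = structure_similarity(template_headings, doc["headings"])
    return {
        "follows_template": score >= threshold,
        "score": score,
        "category": normalize(doc["footer"]),
    }


template = ["Introduction", "Scope", "Method", "Results"]
good_doc = {"headings": ["introduction", "Scope ", "Method", "Results"],
            "footer": "Technical Report"}
bad_doc = {"headings": ["Introduction", "Results"],
           "footer": "Memo"}

good_result = check_document(good_doc, template)
bad_result = check_document(bad_doc, template)
```

Since the category is already written in the footer, reading it directly may be enough for the classification part; a trained text classifier would mainly be needed for documents where the footer is missing or unreliable.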