I am working on a project where I need to tag news items with the names of the companies they are relevant to (the company names are mentioned in the news items and, in many cases, in the headline). So far I have about 2000 news items that were manually tagged with company names and a relevance level (High/Low). Each record has the following fields:
story_ID, Headline, story_Text, company_name, relevance_level
I need to automate this procedure, i.e. tag each incoming news item with company names and their relevance levels.
Notes:
1. Some of the news items are not relevant to any company, so they are not tagged.
2. Some of the news items are relevant to multiple companies, so they are tagged with multiple company names and the corresponding relevance level for each.
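To make the shape of the data concrete, here is a rough sketch of how I picture it. The rows are made-up placeholders (not my real data), and the one-row-per-(story, company) layout, the use of None for untagged stories, and the helper name group_labels are only assumptions for illustration:

```python
from collections import defaultdict

# Hypothetical placeholder rows (not real data), one row per (story, company) pair.
# Fields: story_ID, Headline, story_Text, company_name, relevance_level.
# Assumption: an untagged story has company_name and relevance_level set to None.
rows = [
    (1, "Acme buys Globex", "Acme announced ...", "Acme", "High"),
    (1, "Acme buys Globex", "Acme announced ...", "Globex", "High"),
    (2, "Markets flat on Monday", "Stocks were ...", None, None),
    (3, "Initech mentioned in Acme report", "A report by ...", "Initech", "Low"),
]


def group_labels(rows):
    """Collect the {company_name: relevance_level} tags for each story_ID."""
    labels = defaultdict(dict)
    for story_id, _headline, _text, company, relevance in rows:
        tags = labels[story_id]  # untagged stories still get an (empty) entry
        if company is not None:
            tags[company] = relevance
    return dict(labels)


labels_by_story = group_labels(rows)
print(labels_by_story)
```

So a single story can end up with zero, one, or several (company, relevance) labels, which is what makes this different from the single-label problems I have handled before.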
Which machine learning algorithms could I use for this? I am very new to Natural Language Processing, so I am not sure how to approach the problem. (I have used various machine learning techniques before, but in those problems each row (observation) of the data matrix had only one label.)
Any help would be greatly appreciated.
Thank you.