I'm currently doing my MSc in AI, and for my dissertation I need to do some information extraction from a variety of news sites, pulling out the articles for sentiment analysis. Are there any good datasets on the web that are labeled with the raw markup as the input and the article text as the label?
Ideally it would be nice if there was some sort of title extraction as well, but this is not important.
If not, are there any recommendations on what I could use dataset-wise for this?
Thanks