For any natural language processing model, a good data set is crucial for comparing results. I am looking for a data set of software reviews that is already annotated for software features or quality characteristics.
Is it feasible to use Google to parse documents? If so, how can I justify how relevant these documents are to a specific quality characteristic?
Are there any suggestions for software articles that could be parsed instead?
There are many software reviews on cnet.com, but they are not annotated. Any ideas?