I am wondering whether there is a research paper that examines the proportion of unstructured text on the web and whether it is the cause of the rapid increase in web data. Which type of data is responsible for this rapid growth? Is it unstructured data (text)?
Thank you very much.