I'm looking for a topic for my master's thesis. Hoping for inspiration, I read a widely recommended book, "Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems" by Martin Kleppmann. Its last chapter is dedicated to yet-to-be-solved problems. One of them is designing an unbundled-database equivalent of the Unix shell. This is what the author has to say on the topic (p. 525):
The tools for composing data systems are getting better, but I think one major part is missing: we don’t yet have the unbundled-database equivalent of the Unix shell (i.e., a high-level language for composing storage and processing systems in a simple and declarative way). For example, I would love it if we could simply declare `mysql | elasticsearch`, by analogy to Unix pipes [22], which would be the unbundled equivalent of `CREATE INDEX`: it would take all the documents in a MySQL database and index them in an Elasticsearch cluster. It would then continually capture all the changes made to the database and automatically apply them to the search index, without us having to write custom application code. This kind of integration should be possible with almost any kind of storage or indexing system.
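
To make sure I understand the behaviour being described, here is a minimal in-memory sketch of what I take `mysql | elasticsearch` to mean: one initial snapshot, followed by continuous replay of a change log. Everything here is my own assumption for illustration. `SourceTable`, `SearchIndex`, and `Pipe` are hypothetical names, and nothing talks to a real MySQL or Elasticsearch instance.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A change event is (key, new_text); new_text of None means "row deleted".
Change = Tuple[int, Optional[str]]


@dataclass
class SourceTable:
    """Hypothetical stand-in for a database table that keeps a change log."""
    rows: Dict[int, str] = field(default_factory=dict)
    log: List[Change] = field(default_factory=list)

    def upsert(self, key: int, text: str) -> None:
        self.rows[key] = text
        self.log.append((key, text))

    def delete(self, key: int) -> None:
        self.rows.pop(key, None)
        self.log.append((key, None))


@dataclass
class SearchIndex:
    """Hypothetical stand-in for a search index (derived data)."""
    docs: Dict[int, str] = field(default_factory=dict)

    def apply(self, change: Change) -> None:
        key, text = change
        if text is None:
            self.docs.pop(key, None)
        else:
            self.docs[key] = text

    def search(self, word: str) -> List[int]:
        # Naive whole-word match over whitespace-separated tokens.
        return sorted(k for k, t in self.docs.items() if word in t.split())


class Pipe:
    """Assumed semantics of `source | sink`: snapshot first, then follow changes."""

    def __init__(self, source: SourceTable, sink: SearchIndex) -> None:
        self.source, self.sink = source, sink
        # Step 1: copy every existing row into the sink.
        for key, text in source.rows.items():
            sink.apply((key, text))
        # Step 2: remember the log position so only later changes are replayed.
        self.offset = len(source.log)

    def sync(self) -> int:
        """Apply changes logged since the last sync; return how many were applied."""
        pending = self.source.log[self.offset:]
        for change in pending:
            self.sink.apply(change)
        self.offset += len(pending)
        return len(pending)


table = SourceTable()
table.upsert(1, "unix pipes")
index = SearchIndex()
pipe = Pipe(table, index)        # the initial snapshot indexes row 1
table.upsert(2, "dataflow pipes")
table.delete(1)
pipe.sync()                      # replays the two changes made after the snapshot
print(index.search("pipes"))
```

A real system would also have to handle ordering, failures, and schema differences between the two stores, which this sketch ignores entirely.
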
The only related research I've found concerns timely dataflow and differential dataflow. I wonder whether there is more research that I'm missing, or whether there isn't any because the problem is unimportant or was solved long ago. I'd appreciate your thoughts on the idea of building such a system. Is it needed? Would it fill a niche?