We have a research project that will generate a data stream of about 1 TB per day.
I am wondering whether we can use a distributed framework to process this data. If so, what would the advantages be, apart from the economic ones?
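
For a sense of scale, here is a rough back-of-the-envelope conversion of the daily volume into an average rate. It is only a sketch: it assumes 1 TB = 10^12 bytes and that the data arrives evenly over the day, neither of which we have confirmed, and the function name is just illustrative.

```python
# Back-of-the-envelope: average throughput implied by ~1 TB/day.
# Assumptions (not measured): 1 TB = 10**12 bytes, uniform arrival over 24 h.
BYTES_PER_DAY = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mb_per_s(bytes_per_day: int = BYTES_PER_DAY) -> float:
    """Return the average rate in MB/s, with 1 MB = 10**6 bytes."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


print(f"{sustained_mb_per_s():.1f} MB/s")  # prints "11.6 MB/s"
```

This is only the average; if the stream is bursty, the peak rate the system has to absorb could be considerably higher.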