One of the promises of "Big Data" is better predictor performance. What is the right way to achieve this: online learning on the data stream? Bagging techniques on a Hadoop cluster? Something else?
This is an important question because the IT tooling is very different: on the one hand we have a Data Stream Management System, and on the other hand a Hadoop cluster. We have to choose.