This is a very nice question, and the answer depends on how you define "big data", since there is no universally agreed definition (it is more of a buzzword than a precise technical term).
If you define it as "data that cannot be processed by a single computer", then by definition learning from big data requires a distributed solution. In this context there is a lot of research on distributed training algorithms (e.g., distributed SVMs [1]) and on distributed execution frameworks (e.g., MapReduce and GraphLab [2, 3]); see the sketch below for the basic idea.
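To make the MapReduce idea concrete, here is a minimal sketch (my own illustration, not code from [2]) of data-parallel batch gradient descent for logistic regression: each "mapper" computes the gradient on its own data shard, and the "reducer" aggregates the partial gradients. The function names are illustrative, not a real MapReduce API.

```python
# Sketch of MapReduce-style learning in the spirit of [2]: the loss gradient
# decomposes over examples, so each shard contributes a partial gradient.
import numpy as np

def map_gradient(w, X_shard, y_shard):
    """Mapper: partial gradient of the logistic loss on one shard."""
    p = 1.0 / (1.0 + np.exp(-X_shard @ w))   # predicted probabilities
    return X_shard.T @ (p - y_shard)          # gradient contribution

def reduce_gradients(partials):
    """Reducer: sum the partial gradients from all shards."""
    return np.sum(partials, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X @ rng.normal(size=20) > 0).astype(float)

shards = np.array_split(np.arange(len(X)), 4)  # pretend these live on 4 machines
w = np.zeros(20)
for _ in range(100):                           # batch gradient descent
    partials = [map_gradient(w, X[idx], y[idx]) for idx in shards]
    w -= 0.01 * reduce_gradients(partials) / len(X)
```

The key point is that only the (small) partial gradients move across the network, never the raw data shards.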
However, big data can be defined in many other ways, e.g., as a dataset that cannot be loaded into memory all at once. In this case, you face several problems that have nothing to do with distribution.
For example, online learning algorithms overcome this limitation by processing a handful of examples at a time; here the research question is how to handle large amounts of data efficiently without degrading accuracy. In online kernel learning, for instance, the model grows linearly with the number of processed examples [4]. Another example is active learning, where you select only the most informative patterns from a database too large to process in full. Or you can use "randomized" approximations that simplify the optimization problem enough to make it tractable even on big data [5]; a sketch follows below.
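To illustrate how a randomized approximation sidesteps the growing-model problem, here is a minimal sketch (my own illustration, not code from the papers) combining random Fourier features [5] with a simple online SGD update. The feature map has a fixed dimension D, so the model never grows, unlike a naive online kernel method [4]. I assume an RBF kernel k(x, z) = exp(-||x - z||²/2); the toy target is purely for demonstration.

```python
# Random Fourier features [5]: phi(x).T @ phi(z) approximates the RBF kernel,
# so a fixed-size linear model in feature space mimics a kernel machine.
import numpy as np

rng = np.random.default_rng(0)
d, D = 20, 500                     # input dimension, number of random features

W = rng.normal(size=(D, d))        # frequencies sampled from the kernel's
b = rng.uniform(0, 2 * np.pi, D)   # Fourier transform (Gaussian for RBF)

def phi(x):
    """Random Fourier feature map of fixed dimension D."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

w = np.zeros(D)                    # model size is D, independent of data size
for t in range(100_000):           # examples arrive one at a time (online)
    x = rng.normal(size=d)
    y = np.sign(x[0])              # toy streaming target
    z = phi(x)
    y_hat = w @ z
    w -= 0.01 * (y_hat - y) * z    # one SGD step on the squared loss
```

Memory stays constant no matter how many examples stream past, which is exactly the property you want when the data cannot fit in memory.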
So, as you can see, "distributed ML" and "big-data ML" are related but distinct concepts.
[1] Navia-Vázquez, Angel, et al. "Distributed support vector machines." IEEE Transactions on Neural Networks 17.4 (2006): 1091-1097.
[2] Chu, Cheng-Tao, et al. "Map-Reduce for machine learning on multicore." NIPS 2006.
[3] Low, Yucheng, et al. "GraphLab: A new framework for parallel machine learning." arXiv preprint arXiv:1006.4990 (2010).
[4] Singh, Abhishek, Narendra Ahuja, and Pierre Moulin. "Online learning with kernels: Overcoming the growing sum problem." 2012 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2012.
[5] Rahimi, Ali, and Benjamin Recht. "Random features for large-scale kernel machines." NIPS 2007.