Actually, when you are not dealing with a big-data problem, there is usually no single best machine learning model. The ranking among models is not clear-cut and depends heavily on how you treat the data. In big-data problems, deep learning models (intermediate to large neural networks) tend to outperform traditional machine learning methods (SVM, ridge regression, LASSO, and others) by a wide margin. In sequential-data problems, dynamic Bayesian networks or recurrent neural networks work far better than traditional machine learning methods.
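Since the ranking really does depend on the data, here is a minimal sketch (my own illustration, assuming scikit-learn and a synthetic dataset, both placeholder choices) of how you can compare a few model families empirically with cross-validation instead of picking one in advance:

```python
# Compare a few model families on the same data with cross-validation.
# Dataset and hyperparameters are arbitrary placeholders for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    "SVM (RBF kernel)": SVC(kernel="rbf", C=1.0),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Small neural network": MLPClassifier(hidden_layer_sizes=(50,),
                                          max_iter=1000, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```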
All of this information is covered in these two lectures on YouTube, which may help you choose a modeling approach to go further with, although I recommend studying all of them:
That is a hard question you are asking. In practice, many sophisticated AI methods are used, but the most popular is the family of deep learning algorithms, which bring together many of the most advanced AI techniques in a single framework.
Are all deep learning algorithms neural-network based? Or can you suggest a different set of deep learning techniques drawn from statistics or machine learning?
I think it depends on the application; we cannot single out one machine learning method as the best. In general, there are issues such as parameter setting, multi-class classification, and imbalanced or overlapping classes that most machine learning methods struggle to overcome. For example, SVM, although a strong learner, naturally supports only binary classification, so several SVMs must be combined to handle multi-class problems, which can lead to high computational complexity and slow convergence (see the sketch below). Still, studies show that despite these limitations, SVM and its modified versions outperform many other learners. For example, we can say that SVM is better than an MLP on binary classification problems: both aim to find a separator, but the training of an MLP terminates as soon as any feasible separator is found, whereas the training of an SVM continues until the optimal (maximum-margin) separator is found. Note that in large-scale problems, memory capacity and computational complexity are two main challenges that make the traditional learners inefficient.
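As a concrete illustration (my own, hypothetical example using scikit-learn) of the workaround described above, a binary SVM can be wrapped in a one-vs-rest scheme for multi-class problems, which trains one binary SVM per class and therefore grows in cost with the number of classes:

```python
# One-vs-rest: extend a binary SVM to a 4-class problem by fitting one
# binary SVM per class. Dataset and hyperparameters are placeholders.
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=4, random_state=0)

clf = OneVsRestClassifier(SVC(kernel="rbf", C=1.0))
clf.fit(X, y)
print("number of underlying binary SVMs:", len(clf.estimators_))  # -> 4
```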
Actually, the question is not just about deep learning; it is about comparing statistical methods such as Bayesian approaches, ML, ... versus non-statistical methods such as SVM, ...
What makes "Deep" learning deep is lots of layers, so the term is restricted to neural networks. They work well, in the hands of skilled practitioners, but this is a craft not a science. There is no body of theory to guide choices of activation functions, update algorithms, etc, just a body of practice.
Classical statistics is based on a sound mathematical theory. Most methods begin by assuming that the true distribution belongs to a specific class of distributions, e.g. normal, and then make a maximum likelihood estimate of the parameters, e.g. mean and variance. Whether this is a sound approach to your problem depends on whether you have real grounds for the assumption about the distribution.
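For instance, here is a minimal sketch (my own illustration, with synthetic data) of the normal-distribution case: the maximum likelihood estimates of the mean and variance are the sample mean and the 1/n-normalised sample variance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1000)  # synthetic sample for illustration

mu_hat = x.mean()                        # MLE of the mean
sigma2_hat = ((x - mu_hat) ** 2).mean()  # MLE of the variance (divides by n, not n-1)
print(f"mu_hat = {mu_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")
```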
Machine learning as developed by Kolmogorov's school makes fewer assumptions. They derived bounds on the error between the estimated and the true distribution that do not depend on whether the true distribution lies within the set of distributions the method can find. See Vapnik's publications for details.
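For reference, one common form of the bound Vapnik proved (stated here as I recall it, for a classifier from a family of VC dimension $h$ trained on $n$ samples) says that with probability at least $1-\eta$,

$$
R(f) \;\le\; R_{\mathrm{emp}}(f) \;+\;
\sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) - \ln\frac{\eta}{4}}{n}},
$$

where $R$ is the true risk and $R_{\mathrm{emp}}$ the empirical (training) risk. The key point is that the bound holds regardless of the underlying distribution.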
Neural networks often work well in practice. But the reasons why they work so well are still mysterious. Taking the question literally, Support Vector Machines "work better in theory" because of the inequalities that Vapnik proved.