Now that artificial neural networks can do almost everything that statistical models can do, what do you think about the future of statistics?
ANNs deliver black-box solutions for prediction/classification problems. This does not help us to build understandable models of the world. Well-designed, as-simple-as-possible regression models will remain as important as ever to help us make sense of data - something that prediction/classification tools cannot provide.*
And there are still tons of problems that are far easier to tackle with good old regression models than with ANNs. And the majority of problems simply do not provide the vast amount of data needed to successfully train ANNs.
Thus, ANNs are one way of using (very complex) regression models, and they just complement the toolbox. They won't substitute anything.
---
* Note that an ANN is nothing else but a wildly parametrized (logistic) regression model. So it's nothing really different. It's just the way it is parametrized (and fitted) that makes the difference and causes the loss of interpretable coefficients. Spline models are another example of (over-parametrized) regression models used for prediction rather than understanding. "Normal" regression models can also be used purely for prediction/classification. However, at least they offer the possibility to think about which model is best and what the coefficients tell us.
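To make the footnote concrete, here is a minimal sketch in plain Python. It is not taken from the thread: the weights, the input, and the helper names (`dense_layer`, `tiny_ann`) are made up for illustration. It shows that a network with no hidden layer and one sigmoid output unit computes exactly the logistic regression formula, and that adding a hidden layer amounts to running logistic regression on features the network builds itself.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, coef, intercept):
    """Classical logistic regression: P(y=1 | x) = sigmoid(intercept + coef . x)."""
    return sigmoid(intercept + sum(c * xi for c, xi in zip(coef, x)))

def dense_layer(x, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one row of W per unit."""
    return [activation(b + sum(w * xi for w, xi in zip(row, x)))
            for row, b in zip(weights, biases)]

def tiny_ann(x, hidden_w, hidden_b, out_w, out_b):
    """One tanh hidden layer followed by a logistic regression on its outputs."""
    hidden = dense_layer(x, hidden_w, hidden_b, math.tanh)
    return logistic_regression(hidden, out_w, out_b)

# Hypothetical input and coefficients, chosen only for illustration.
x = [0.5, -1.2]
coef, intercept = [0.8, -0.3], 0.1

# No hidden layer: the "network" is literally the logistic regression model.
p_lr = logistic_regression(x, coef, intercept)
p_net = dense_layer(x, [coef], [intercept], sigmoid)[0]

# With a hidden layer the output is still a probability, but the inputs to
# the final logistic step are learned features, not the original variables.
p_ann = tiny_ann(x, [[1.0, -0.5], [0.3, 0.7]], [0.0, 0.2], [0.6, -0.4], 0.05)
```

In the first case `coef` can be read as log-odds ratios for the original variables. In the second case the final coefficients refer to the hidden units, which is where the interpretability is lost.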
I agree with Jochen: ANNs are not a threat to standard statistical approaches.
One issue with ANNs is that they struggle at standard levels of replication. With 3 to 30 replicates, an ANN isn't very good.
On a personal level, I dislike black-box solutions. Yes, I could go in and look at the decision tree used to arrive at the outcome, but that is generally less satisfying than understanding the system and modeling it appropriately. In part, the issue is that if I understand the system and have a good model, then I am more likely to arrive at a good answer under unexpected circumstances, whereas the ANN seems to have problems in that regard. ANNs have their place, but they won't take over from more traditional methods of data analysis any time soon.
No change except for perhaps a name. Google Efron, Tibshirani, and Hastie to see some of the exciting work that is being done at Stanford. Best, David Booth
ANNs (and other machine learning models) predict a result on an individual basis; statistical models explain observations on a population basis.
After all, statistics is all about inferring population characteristics from sample measurements; it is not about predicting the behaviour of such and such individual of the population ... which does not mean that such predictions are without value; these are just two different goals.
Well, working on both sides, I do see what they have in common, and I do know that there is a continuum there rather than a "Great Wall"; however, I believe it is better to describe a continuum by describing both ends and letting people navigate between them!
I think of ANN as a tree-based method. In the book Eugene linked, page 315 lists the benefits and problems of tree-based methods. What I am less clear about is exactly where ANN leaves off and machine learning takes over. I think machine learning is a general term while ANN is a specific approach. A genetic algorithm is a form of machine learning but not an ANN.
AFAIK, machine learning means using computers to fit predictive models, which can be classical (generalized) regression models, regression trees, stochastic models, or whatever. There might be heavy data processing, but if the procedure should eventually come up with a classification, the final step involved will be a (multinomial) logistic regression model (or something that is mathematically equivalent to such a model). Please correct me if I am wrong! Happy to learn (human learning, so to say :))
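A small sketch of the "final step" described above, under the assumption that the classifier ends in a softmax layer (a common choice, though not the only one). The feature values and coefficients are hypothetical. The point is only that softmax applied to a linear function of the features is the multinomial logistic regression formula, and that with two classes it collapses to the ordinary logistic (sigmoid) model.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def multinomial_logit(features, coef, intercepts):
    """Multinomial logistic regression: softmax of one linear predictor per class."""
    logits = [b + sum(w * f for w, f in zip(row, features))
              for row, b in zip(coef, intercepts)]
    return softmax(logits)

# Hypothetical features, e.g. the output of earlier (possibly heavy) processing.
features = [1.5, -0.4, 0.9]
coef = [[0.2, -0.1, 0.5],
        [-0.3, 0.8, 0.1],
        [0.0, 0.4, -0.6]]
intercepts = [0.1, -0.2, 0.05]

probs = multinomial_logit(features, coef, intercepts)   # one probability per class

# Two-class special case: softmax over two scores equals the sigmoid of their difference.
two_class = softmax([1.3, -0.7])
binary = sigmoid(1.3 - (-0.7))
```

Whether `features` are raw covariates or the last hidden layer of a deep network does not change this final formula; only the way the features were obtained differs.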
The future of statistics is in the hands of statisticians. ANNs cannot replace traditional statistics. Artificial intelligence remains artificial, never natural. Statistical approaches will always be involved in experimental design and in generating the data on which the neural network will be trained.
Statistics has its own philosophy, and it will always remain significant because of that philosophy. ANNs can never eliminate the philosophy of statistics and hence can never replace traditional statistics. Accordingly, ANNs cannot be a threat to statistics.
I do agree with Maurice Ekpenyong that artificial intelligence remains artificial, never natural, and that statistical approaches will always be involved in experimental design and in generating the data on which the neural network will be trained.
ANNs are used not only for regression but also for classification and unsupervised learning. The strength of an ANN is that it can use nonlinear functions, which allow it to deal with multidimensional dependencies in the data.
However, as Professor Jochen Wilhelm said, neural networks often use linear regression in the final layer.
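The two points above (nonlinear hidden functions, a linear final layer) can be illustrated with the classic XOR pattern. This is only a sketch: the weights are set by hand rather than trained, and `xor_net` and `linear_model` are made-up names. No purely linear model `a*x1 + b*x2 + c` can reproduce XOR, because every such model satisfies f(0,0) + f(1,1) = f(0,1) + f(1,0), while XOR would require 0 + 0 = 1 + 1. Two ReLU hidden units followed by a linear output are enough.

```python
def relu(z):
    return max(0, z)

def xor_net(x1, x2):
    """Two ReLU hidden units, then a linear output layer (hand-picked weights)."""
    h1 = relu(x1 + x2)        # counts how many inputs are on
    h2 = relu(x1 + x2 - 1)    # fires only when both inputs are on
    return h1 - 2 * h2        # final layer: a plain linear combination of h1, h2

def linear_model(x1, x2, a, b, c):
    """Any model without a nonlinearity has this form."""
    return a * x1 + b * x2 + c

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_net(x1, x2) for x1, x2 in inputs]

# The linear constraint holds for any coefficients; these are arbitrary examples.
a, b, c = 2, -3, 5
lhs = linear_model(0, 0, a, b, c) + linear_model(1, 1, a, b, c)
rhs = linear_model(0, 1, a, b, c) + linear_model(1, 0, a, b, c)
```

Here `xor_outputs` reproduces the XOR truth table, while `lhs == rhs` shows the constraint that keeps a linear model from doing the same. The output layer is still just a linear model, but on the transformed inputs `h1` and `h2`.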
Eugene, isn't this the paper I linked in my first answer? 😋
Statistics is indeed concerned with prediction and explanation, but at the level of aggregates; prediction at the individual level is somewhat different.
What are the factors influencing the unemployment rate, and will it increase for such and such segments of a population, are standard statistical questions; will that particular person, given her background and a humongous set of data, find a job