As Waldemar said, they have many more hidden layers.
The idea is that the layers correspond to different levels of abstraction (e.g. face recognition: 1st layer: single pixels; 2nd layer: edges; 3rd layer: complex forms like an eye; …; n-th layer: the whole face).
If you want to learn more about deep learning, I can recommend a video lecture by Geoffrey Hinton. He is an absolute expert in this field:
http://videolectures.net/jul09_hinton_deeplearn/
You can also have a look at his publications, e.g.
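To make that layer hierarchy concrete, here is a minimal sketch, assuming PyTorch; the layer sizes and the face/not-face output are made up for illustration, not a real face recognizer:

    # Early layers see raw pixel patterns, middle layers respond to edges and
    # parts, later layers combine them into object-level features.
    import torch
    import torch.nn as nn

    face_net = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),   # layer 1: local pixel patterns
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(8, 16, kernel_size=3, padding=1),  # layer 2: edges / simple shapes
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), # layer 3: parts, e.g. an eye
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(32, 2),                            # top layer: face / not-face
    )

    x = torch.randn(1, 1, 64, 64)   # one fake grayscale image
    print(face_net(x).shape)        # torch.Size([1, 2])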
Deep learning is most definitely not the end of signal or image processing. It is just another tool for signal and image processing that has recently been shown to be very effective when trained carefully.
In essence, nothing has changed: most deep networks are basically MLPs trained with backprop. However, what made them popular was:
- a few algorithmic improvements (dropout and regularization; see the sketch after this list)
- the explosion of computing power and parallel processing through GPUs
- the huge increase in the size of available data (e.g. ImageNet vs. Caltech 101)
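As a rough sketch of those two algorithmic improvements, assuming PyTorch (layer sizes and hyperparameters are arbitrary): dropout sits inside the network, and L2 regularization is passed to the optimizer as weight decay.

    import torch
    import torch.nn as nn

    mlp = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Dropout(p=0.5),        # randomly zero half the activations during training
        nn.Linear(256, 10),
    )

    # weight_decay adds an L2 penalty on the weights during the update
    optimizer = torch.optim.SGD(mlp.parameters(), lr=0.1, weight_decay=1e-4)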
But they really are the same concept, just exploited better. And I would tend to disagree when people say that NNs are black boxes that we do not understand. That is not true: sure, you can just apply them to a bunch of data and pray for good results, but they are just statistical tools. They fit functions by adapting coefficients in an equation to reduce an error measurement. In the end they just try to find linear separations in the data; to do so on data that is not linearly separable, the hidden layers apply transformations so that the transformed input space becomes linearly separable.
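Here is a toy illustration of that last point in plain NumPy, under arbitrary choices of architecture and learning rate: a small MLP trained with backprop on XOR, which is not linearly separable in the input space but becomes separable after the hidden-layer transformation.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

    W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer
    W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer

    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    for _ in range(10000):
        # forward pass
        h = sigmoid(X @ W1 + b1)              # hidden representation
        out = sigmoid(h @ W2 + b2)
        # backward pass: gradients of the squared error, adjusting coefficients
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= 1.0 * h.T @ d_out;  b2 -= 1.0 * d_out.sum(axis=0)
        W1 -= 1.0 * X.T @ d_h;    b1 -= 1.0 * d_h.sum(axis=0)

    print(np.round(out, 2))   # should end up close to [[0], [1], [1], [0]]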
Before deep learning was introduced around 2006, almost all neural network researchers had jumped on the new SVM bandwagon. There were two reasons for this: (1) adding more layers to an MLP did not work (e.g., to compete with SVMs) due to the "diminishing error" (vanishing gradient) problem, i.e., the error back-propagated from the output layer to the inner layers gets smaller and smaller, so in practice the MLP does not learn; (2) a three-layer MLP had been mathematically shown to be a universal approximator, so in short there was no reason to add more layers.
Deep learning addresses the first reason by introducing pre-training (greedily learning one layer at a time before moving up to the higher layers) and fine-tuning (correcting the unsupervisedly learned weights using a small amount of labeled data). It addresses the second reason by offering greater abstraction complexity and hierarchical feature learning through its many layers.
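A hedged sketch of that two-stage recipe (greedy layer-wise pre-training followed by supervised fine-tuning), assuming PyTorch; the layer sizes are arbitrary and the "unlabeled"/"labeled" batches are random placeholders standing in for real data:

    import torch
    import torch.nn as nn

    sizes = [784, 256, 64]
    encoders = [nn.Linear(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1)]

    def pretrain_layer(enc, data):
        """Train one layer as an autoencoder on the (already encoded) data."""
        dec = nn.Linear(enc.out_features, enc.in_features)
        opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
        for x in data:
            recon = dec(torch.relu(enc(x)))
            loss = nn.functional.mse_loss(recon, x)
            opt.zero_grad(); loss.backward(); opt.step()

    # greedy layer-wise pre-training (unsupervised): each layer learns to
    # reconstruct the representation produced by the layers below it
    unlabeled = [torch.randn(32, 784) for _ in range(10)]   # placeholder batches
    data = unlabeled
    for enc in encoders:
        pretrain_layer(enc, data)
        data = [torch.relu(enc(x)).detach() for x in data]

    # fine-tuning (supervised): stack the pre-trained layers, add a classifier
    # head, and correct all weights with a small amount of labeled data
    model = nn.Sequential(*sum([[e, nn.ReLU()] for e in encoders], []), nn.Linear(sizes[-1], 10))
    labeled = [(torch.randn(32, 784), torch.randint(0, 10, (32,))) for _ in range(5)]
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    for x, target in labeled:
        loss = nn.functional.cross_entropy(model(x), target)
        opt.zero_grad(); loss.backward(); opt.step()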
The big difference between ordinary machine learning techniques and deep learning is that ordinary machine learning usually uses hand-crafted features, whereas in deep learning the features are crafted by the machine itself, unsupervised, through its deep structure. These new hierarchical features are what give deep learning the current best recognition performance in many cases.
The main shortcoming, and in fact also the advantage, of deep learning is its requirement for big data to craft the features in an unsupervised way.
As with neural networks in general, I find that we still lack a comprehensive theoretical basis to validate the performance results of deep learning. Without a sound theoretical basis, the field currently feels more like magic than science. I believe we can hope to gain more insight in this respect.
I summarize and review some of these deep learning algorithms; I hope this helps.
In the last two decades we've "learned" that slight changes in network structure and learning techniques can significantly improve the performance of ANNs.
Actually, DL started much earlier, with Alexey Grigorevich Ivakhnenko in 1971. Quoting Juergen Schmidhuber in the following paper:
http://arxiv.org/abs/1404.7828
"A paper of 1971 already described a deep GMDH
network with 8 layers (Ivakhnenko, 1971). There
have been numerous applications of GMDH-style
nets, e.g. (Ikeda et al., 1976; Farlow, 1984; Madala
and Ivakhnenko, 1994; Ivakhnenko, 1995; Kondo,
1998; Kord´ık et al., 2003; Witczak et al., 2006;
Kondo and Ueno, 2008)."
Somehow, LeCun, Bengio, and Hinton overlooked or missed that earlier work in their papers, while heavily citing each other's work.
Maybe later on, after studying their open responses to the refutation, I may revise my answer again.
Sincerely,
In machine learning, users provide the machine with both examples and training data to help the system make correct decisions. This principle is called supervised learning. In other words, in classical machine learning a computer can solve a large number of tasks, but it cannot formulate such tasks without human control.
Differences between machine learning (ML) and deep learning (DL):
DL requires a lot of (often unlabeled) training data to draw reliable conclusions, while ML can work with small amounts of data provided by users.
Unlike ML, DL needs high-performance hardware.
ML requires features to be accurately identified by users, while DL creates new features by itself.
ML divides a task into small pieces and then combines the results into one conclusion, while DL solves the problem on an end-to-end basis (see the sketch after this list).
In comparison with ML, DL needs much more time to train.
Unlike DL, ML can provide enough transparency for its decisions.
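As a rough sketch of the hand-crafted-features vs. end-to-end contrast in the list above, assuming scikit-learn and PyTorch are available; the toy data and the crude intensity-profile feature function are made up for illustration:

    import numpy as np
    from sklearn.svm import SVC
    import torch
    import torch.nn as nn

    images = np.random.rand(100, 28, 28).astype(np.float32)        # toy data
    labels = np.random.randint(0, 10, size=100).astype(np.int64)

    # classical ML: a human decides which features matter (here: crude
    # row/column intensity profiles), then a separate model classifies them
    def hand_crafted_features(imgs):
        return np.concatenate([imgs.mean(axis=1), imgs.mean(axis=2)], axis=1)

    svm = SVC().fit(hand_crafted_features(images), labels)

    # deep learning: raw pixels go in, and the network learns its own features
    # and the classifier jointly, trained end-to-end with a single loss
    net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    x, y = torch.from_numpy(images), torch.from_numpy(labels)
    for _ in range(10):
        loss = nn.functional.cross_entropy(net(x), y)
        opt.zero_grad(); loss.backward(); opt.step()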
The concept of deep learning implies that the machine creates its functionality by itself, to the extent currently possible. To infer, deep learning applications use a hierarchical approach in which they determine the most important characteristics to compare.
Deep learning is a kind of traditional machine learning. Classical machine learning is the extraction of new knowledge from a large data array loaded into the machine. Users formulate the training rules and correct errors made by the machine. This approach eliminates the negative overfitting effect that frequently appears in deep learning.
Deep networks have achieved accuracies far beyond those of classical ML methods in many domains, including speech, natural language, vision, and playing games. In many tasks, classical ML can't even compete. For example, the graph below shows the image classification accuracy of different methods on the ImageNet dataset; blue indicates classical ML methods and red indicates a deep Convolutional Neural Network (CNN) method.