The biggest advantage of Deep Learning is that you do not need to manually extract features from the image. The network learns to extract features during training; you just feed it the image as raw pixel values.
What you need is to define the Neural Network architecture and to gather a labeled dataset. In your case, you need a set of images and, for each image, a label indicating whether the person is crying or not.
For the NN architecture, you can choose a standard one such as VGG, GoogLeNet or ResNet.
You should consider using a network that was pre-trained on a large dataset (e.g. ImageNet) and fine-tune it for your application. This is called Transfer Learning, and you can see some examples here: https://keras.io/applications/
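As a rough illustration of that transfer-learning setup, here is a short Keras sketch. The layer choices, the 224x224 input size and the single-sigmoid head are assumptions for a binary crying / not-crying task, not a prescription:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Load VGG16 pre-trained on ImageNet, without its 1000-class head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional features

# New head for the (assumed) binary crying / not-crying task.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)  # your labeled dataset
```

Freezing the base means only the small new head is trained, which is why this works even with a modest dataset.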
I strongly advise you to spend some time learning about Machine Learning and Deep Learning. There are some good online courses (e.g. on Coursera, or Stanford's CS231n) that will help you get started. I would also advise you to read some papers on the subject.
As the other two answers have mentioned, deep learning extracts features by itself. Try it in MATLAB with AlexNet, for example: pass the images to the network without any feature vector, and at layer F7 you will observe that the features have been extracted automatically. It's really exciting, isn't it? Try it in MATLAB.
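The same F7-style read-out can be done in Keras by building a second model that stops at a named layer. This sketch uses a small untrained stand-in network (all names and shapes here are illustrative); in practice you would load a pre-trained model such as VGG16 and pick one of its named layers instead:

```python
import numpy as np
from tensorflow.keras import layers, models

# Small stand-in CNN; in practice, load a pre-trained network instead.
inputs = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(8, 3, activation="relu")(inputs)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
x = layers.Dense(16, activation="relu", name="fc_features")(x)  # analogue of AlexNet's F7
outputs = layers.Dense(2, activation="softmax")(x)
model = models.Model(inputs, outputs)

# Second model that stops at the feature layer.
feature_extractor = models.Model(inputs, model.get_layer("fc_features").output)

image = np.random.rand(1, 32, 32, 3).astype("float32")
features = feature_extractor.predict(image, verbose=0)
print(features.shape)  # one 16-dimensional feature vector per image
```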
1. Input Image >> Feature Extraction >> Classifier >> Result [Machine Learning]
2. Input Image >> Neural Network >> Result [Deep Learning]
From 1 and 2, we can see that the neural network can do both tasks, feature extraction and classification, to generate the result. That means the Neural Network box extracts features automatically.
For a CNN:
Input Layer >> Hidden Layers >> Output
Here, the input layer takes the image and the output layer gives the desired result. That means the hidden layers are doing the feature extraction.
So if you extract the outputs of the hidden layers, you can get the different features the network is actually considering. Exactly which features are learned depends on the data.
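The idea of reading features out of a hidden layer can be sketched without any framework. Below is a toy, untrained two-layer network in plain Python (all weights are random placeholders); the hidden activation vector is exactly the kind of intermediate output you would extract as features:

```python
import random
random.seed(0)

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Tiny untrained network: 4 inputs -> 3 hidden units -> 1 output.
W_hidden = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
W_out = [[random.uniform(-1, 1) for _ in range(3)]]

def forward(x):
    hidden = relu(matvec(W_hidden, x))   # the hidden layer: the "features"
    output = matvec(W_out, hidden)[0]    # the output layer: the result
    return hidden, output

features, result = forward([0.2, 0.9, 0.1, 0.5])
print(len(features))  # 3 feature values, one per hidden unit
```

In a trained network these hidden values are no longer random: they encode whatever patterns in the input were useful for the task.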
I think everyone has explained it in detail. The major difference between ML and DL algorithms is that for ML you have to create your own feature vector, while DL algorithms extract features automatically in their convolution layers.
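That difference can be made concrete with a hand-crafted filter. In classical ML you would design a kernel like the Sobel filter below and feed its responses to a classifier; a CNN instead learns kernels of this kind in its convolution layers. A minimal pure-Python sketch:

```python
def conv2d(image, kernel):
    """Valid (no-padding) 2D convolution of a list-of-lists image."""
    h, w, k = len(image), len(image[0]), len(kernel)
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
            row.append(s)
        out.append(row)
    return out

# 5x5 image with a vertical edge: dark left half, bright right half.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
# Classic hand-crafted vertical-edge filter (Sobel).
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

feature_map = conv2d(image, sobel_x)
print(feature_map)  # [[0, 4, 4], [0, 4, 4], [0, 4, 4]] — strong response at the edge
```

Here the edge detector is fixed by hand; training a CNN amounts to adjusting the kernel values so that useful filters like this one emerge on their own.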