25 March 2018

I want to implement the Deep Retinal Convolution Neural Network for speech emotion recognition described in this paper: https://arxiv.org/ftp/arxiv/papers/1707/1707.09917.pdf. The authors report 99% accuracy on the IEMOCAP and EMO-DB databases.

What I understood from the paper is that I first have to convert the voice recordings into spectrograms using the Data Augmentation Algorithm Based on Retinal Imaging Principle (DAARIP), and then feed these spectrograms into a DCNN.
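To make those two stages concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation: the function names, the frame and hop sizes, the scale factors, and the nearest-neighbour resize are all assumptions chosen for illustration. The augmentation step is only a stand-in that rescales each spectrogram, as if the same image were seen from different distances, and then crops or pads it back to a fixed size so every sample has the shape a CNN would expect.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    return np.log1p(magnitude).T                     # (freq_bins, n_frames)

def resize_nearest(image, scale):
    """Rescale a 2-D array by `scale` using nearest-neighbour sampling."""
    rows = max(1, int(round(image.shape[0] * scale)))
    cols = max(1, int(round(image.shape[1] * scale)))
    row_idx = np.minimum((np.arange(rows) / scale).astype(int), image.shape[0] - 1)
    col_idx = np.minimum((np.arange(cols) / scale).astype(int), image.shape[1] - 1)
    return image[np.ix_(row_idx, col_idx)]

def fit_to_shape(image, shape):
    """Centre-crop or zero-pad `image` so it has exactly `shape`."""
    out = np.zeros(shape, dtype=image.dtype)
    h = min(shape[0], image.shape[0])
    w = min(shape[1], image.shape[1])
    src_r, src_c = (image.shape[0] - h) // 2, (image.shape[1] - w) // 2
    dst_r, dst_c = (shape[0] - h) // 2, (shape[1] - w) // 2
    out[dst_r:dst_r + h, dst_c:dst_c + w] = image[src_r:src_r + h, src_c:src_c + w]
    return out

def augment_by_scale(spec, scales=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Hypothetical stand-in for DAARIP: one rescaled copy per scale factor."""
    return [fit_to_shape(resize_nearest(spec, s), spec.shape) for s in scales]

# Demo on a synthetic one-second 440 Hz tone (a placeholder for a real recording).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(signal)
batch = np.stack(augment_by_scale(spec))  # would be the input batch for the CNN
print(spec.shape, batch.shape)            # (129, 124) (5, 129, 124)
```

With real data, the synthetic tone would be replaced by the samples of each IEMOCAP or EMO-DB utterance, and each array in `batch` would carry the emotion label of the utterance it came from.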

I am having a hard time breaking this approach down into easy steps.
