In 2017, the famous paper "Attention Is All You Need" proposed the Transformer, which successfully applies self-attention together with feed-forward networks (FFNs) to NLP. This work showed many researchers that RNNs may not be indispensable for NLP. More recently, there has also been work applying CNNs to NLP. In speech, RNNs, CNNs, and attention each have their own characteristics and scope of application.
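For concreteness, here is a minimal NumPy sketch of the core of one Transformer block as I understand it from the paper: single-head scaled dot-product self-attention followed by a position-wise FFN, with residual connections kept and layer normalization, masking, and multiple heads omitted. All dimensions and weight names here are made up for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # scaled dot-product self-attention (single head)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (seq_len, seq_len) pairwise scores
    return softmax(scores) @ V        # weighted sum of value vectors

def ffn(X, W1, b1, W2, b2):
    # position-wise feed-forward network with ReLU
    return np.maximum(0, X @ W1 + b1) @ W2 + b2

# toy example: seq_len=4 tokens, d_model=8, d_ff=16 (hypothetical sizes)
rng = np.random.default_rng(0)
d_model, d_ff, seq_len = 8, 16, 4
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
W1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)

H = X + self_attention(X, Wq, Wk, Wv)  # residual connection (layer norm omitted)
out = H + ffn(H, W1, b1, W2, b2)
print(out.shape)  # (4, 8): same shape as the input, so blocks can be stacked
```

The point relevant to my question is that, unlike an RNN, nothing here is recurrent: every position attends to every other position in one step.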
My question is: what is the future of a general deep learning model for sequence learning?
What are the differences and connections between the different subdomains, such as speech, NLP, finance, transportation, and so on?