I can't find many real-world applications of reinforcement learning. I also wonder whether there are clear criteria under which reinforcement learning outperforms optimal control in terms of performance, stability, cost, etc.
See the following articles to explore the applications of reinforcement learning.
1) Kiumarsi, B., Vamvoudakis, K. G., Modares, H., & Lewis, F. L. (2017). Optimal and autonomous control using reinforcement learning: A survey. IEEE Transactions on Neural Networks and Learning Systems, 29(6), 2042-2062.
2) Polydoros, A. S., & Nalpantidis, L. (2017). Survey of model-based reinforcement learning: Applications on robotics. Journal of Intelligent & Robotic Systems, 86(2), 153-173.
3) Mahmud, M., Kaiser, M. S., Hussain, A., & Vassanelli, S. (2018). Applications of deep learning and reinforcement learning to biological data. IEEE Transactions on Neural Networks and Learning Systems, 29(6), 2063-2079.
4) Nian, R., Liu, J., & Huang, B. (2020). A review on reinforcement learning: Introduction and applications in industrial process control. Computers & Chemical Engineering, 106886.
5) Lei, L., Tan, Y., Zheng, K., Liu, S., Zhang, K., & Shen, X. (2020). Deep reinforcement learning for autonomous internet of things: Model, applications and challenges. IEEE Communications Surveys & Tutorials, 22(3), 1722-1760.
6) ...
Also read the book below for a better comparison.
Lewis, F. L., Vrabie, D., & Syrmos, V. L. (2012). Optimal control. John Wiley & Sons.
For tasks that are complex and difficult to formulate with the classic tools of optimal control, we have to turn to automated agents in a very generic programming framework; a simpler way to solve these problems is reinforcement learning (RL). We can conclude that this technique is a complement to optimal control.
Classical methods for the control of dynamical systems require complete and exact knowledge of the system dynamics. However, most real-world dynamical systems are uncertain, and exact models of them are not available. Adaptive control theory provides tools for designing stabilizing controllers that can adapt online to modeling uncertainty, and it has been applied for years in process control, industry, aerospace systems, vehicle systems, and elsewhere. However, classical adaptive control methods are generally far from optimal. Optimal control theory, on the other hand, is a branch of mathematics developed to find the optimal way to control a dynamical system. Reinforcement learning is closely tied, theoretically, to both adaptive control and optimal control: one can see RL methods as a direct approach to the adaptive optimal control of dynamic systems. See the following paper for more details:
R. S. Sutton, A. G. Barto and R. J. Williams, "Reinforcement learning is direct adaptive optimal control," IEEE Control Systems Magazine, vol. 12, no. 2, pp. 19-22, 1992.
This one provides an overview of the reinforcement learning and optimal adaptive control literature and its application to robotics:
Khan, S. G., Herrmann, G., Lewis, F. L., Pipe, T., & Melhuish, C. (2012). Reinforcement learning and optimal adaptive control: An overview and implementation examples. Annual Reviews in Control, 36(1), 42–59.
Well said by the researchers above. The most important point to keep in mind is whether the system dynamics, i.e. an exact model of the system, are unknown. Some examples of RL in this setting can be seen in:
Article: Online optimal and adaptive integral tracking control for va...
Reinforcement learning and optimal control theory rest on the same principles. In many respects they are the same thing, although with some differences.
In both you compute an optimal control (called a policy in the RL literature) for a dynamic system, based on a given objective function (called a reward in the RL literature). In this respect they are the same and share many tools and techniques.
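To make this correspondence concrete, here is a minimal sketch in generic notation (the symbols below are standard textbook notation, not taken from any of the references above):

```latex
% Optimal control: minimize a cost along a trajectory of known dynamics
\min_{u_0, u_1, \dots} \; \sum_{t=0}^{\infty} c(x_t, u_t)
\quad \text{s.t.} \quad x_{t+1} = f(x_t, u_t)

% Reinforcement learning: maximize expected discounted reward,
% with transitions only observed by sampling (no known f)
\max_{\pi} \; \mathbb{E}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right],
\quad a_t \sim \pi(\cdot \mid s_t)
```

With r = -c and deterministic dynamics, the two problems coincide; what differs is whether f is available in closed form or only through sampled interaction.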
The main difference between them is that optimal control theory is used when you have a mathematical model of your dynamic system, whereas RL is mostly used when you do not.
To give an example, for controlling rigid-body robots we have practically accurate dynamics models. So we usually use (approximate) nonlinear optimal control theory to derive optimal controls for them; a sketch of this model-based workflow follows.
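As a minimal illustration of the model-based side (a linear-quadratic toy rather than the nonlinear robot case, and the double-integrator matrices are purely illustrative): when A, B, Q, R are known, the optimal feedback gain follows directly from the discrete algebraic Riccati equation.

```python
# Model-based optimal control: discrete-time LQR with a known model.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])   # known dynamics: x' = A x + B u (double integrator)
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                # state cost
R = np.eye(1)                # control cost

# Solve the discrete algebraic Riccati equation for the value matrix P,
# then form the optimal state-feedback gain K (control law u = -K x).
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print("optimal gain K:", K)
```

No data or interaction is needed here: the whole solution is computed offline from the model.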
But for playing a game like chess or controlling a soft robot, we do not have a set of mathematical relations that accurately describes the system's evolution, or the models are too complex. In such cases we use model-free optimal control, i.e. RL, to compute the optimal policy that maximizes the reward; a minimal sketch of that workflow follows.
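Here is the model-free counterpart: tabular Q-learning on a toy chain environment whose dynamics the agent can only sample, never write down. The environment, reward, and hyperparameters are all illustrative choices, not from any reference above.

```python
# Model-free RL: tabular Q-learning on a toy 5-state chain.
import numpy as np

n_states, n_actions = 5, 2   # actions: 0 = move left, 1 = move right

def step(s, a):
    """Sampled transition: we only observe (next state, reward), not a model."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1  # step size, discount, exploration rate

for episode in range(500):
    s = 0
    for t in range(50):
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next, r = step(s, a)
        # Q-learning update: bootstrap from the greedy value of the next state
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print("greedy policy:", np.argmax(Q, axis=1))  # should prefer moving right
```

The agent recovers the optimal policy purely from sampled transitions; contrast this with the LQR sketch above, where the gain was computed in one shot from the known model.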
The takeaway is:
If you have a (simple enough) mathematical model of the dynamic system you are dealing with, you might use optimal control theory.
But if you do not have an exact dynamic model, or if it is too complex, then you might use RL.
That being said, RL can be used in either case, but it offers no advantage when you do have an accurate dynamic model.
Farshid Asadi Thanks for your answer. I just want to add that model-based RL also exists (e.g., world models), which makes it harder to differentiate RL from optimal control, and also to decide which one to use.