In reinforcement learning, the dynamics, which are represented by the transition function of an MDP, model the probability of reaching (or deviating from) the desired state. From what I understand, this randomness is caused by the environment or by a malfunction within the agent (?).
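To make concrete what I mean by a transition probability, here is a minimal Python sketch. Everything in it is made up for illustration: the 5-state corridor, the names `N_STATES`, `SLIP_PROB`, `transition_probs` and `step`, and the 0.2 slip probability are assumptions, not values taken from any real system.

```python
import random

N_STATES = 5      # hypothetical corridor with states 0..4
SLIP_PROB = 0.2   # made-up probability that the intended move fails


def transition_probs(state, action):
    """Return {next_state: probability} for an action in {-1, +1}.

    With probability 1 - SLIP_PROB the agent moves as intended (clipped
    to the corridor); with probability SLIP_PROB it stays where it is.
    """
    intended = min(max(state + action, 0), N_STATES - 1)
    probs = {}
    probs[intended] = probs.get(intended, 0.0) + (1.0 - SLIP_PROB)
    probs[state] = probs.get(state, 0.0) + SLIP_PROB
    return probs


def step(state, action, rng=random):
    """Sample one next state from the transition distribution."""
    probs = transition_probs(state, action)
    next_states = list(probs)
    weights = [probs[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(transition_probs(2, +1))  # distribution over next states
    print(step(2, +1))              # one sampled next state
```

Here the 0.2 is just a number I picked. My question is about cases where such a number comes from a real system.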
I would like to know whether any real-life problem has been modeled as an RL problem with a real-life transition probability. I would be very grateful if you could point me to research papers in this direction.
Thanks in advance.