The dynamics in reinforcement learning, represented by the transition function of an MDP, are meant to model the probability of reaching (or deviating from) the desired state. From what I understand, this randomness is caused by the environment or by a malfunction within the agent(?).
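
To make concrete what I mean by a transition probability, here is a minimal sketch. Everything in it is hypothetical and only for illustration: a 1-D corridor with five states, an assumed slip probability of 0.2, and the made-up names `transition_probs` and `step`. It is not taken from any real system or library.

```python
import random

# Hypothetical toy MDP: a corridor with states 0..4 (illustrative only).
N_STATES = 5
SLIP_PROB = 0.2  # assumed chance that the move fails and the agent stays put


def transition_probs(state, action):
    """Return {next_state: probability} for an action in {-1, +1}."""
    # The state the agent intends to reach, clipped at the corridor walls.
    intended = min(max(state + action, 0), N_STATES - 1)
    probs = {}
    # With probability 1 - SLIP_PROB the move succeeds.
    probs[intended] = probs.get(intended, 0.0) + (1.0 - SLIP_PROB)
    # With probability SLIP_PROB the agent stays where it is.
    probs[state] = probs.get(state, 0.0) + SLIP_PROB
    return probs


def step(state, action, rng=random):
    """Sample one next state from the transition distribution."""
    probs = transition_probs(state, action)
    next_states = list(probs)
    weights = [probs[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]
```

Here the 0.2 is invented. My question is about cases where such a number is measured or estimated from a real system.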

I would like to know whether any real-life problem has been modeled as an RL problem with transition probabilities taken from the real system. I would be very grateful if you could point me to research papers in this direction.

Thanks in advance.
