MDP is a discrete-time stochastic control process, providing a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision-maker [Source: Wiki]. They are used in the areas of optimal response such as Reinforcement Learning.

More Abhijeet Sahu's questions See All
Similar questions and discussions