Thanks, guys. Gaming is already covered under control; my concern is about interaction in RL. Can we imagine applying RL, after adaptation, to other fields? Is there any research or a paper that discusses this?
I have taken a deeper look into the foundations of reinforcement learning over the last few years. The important point is that there are actually two frameworks of reinforcement learning. The most widely used and adapted one is the discounted framework. However, it is only applicable if the underlying Markov decision process (MDP) does not accumulate rewards too often, i.e. it works for games where you are rewarded with +1/-1 at the end of the episode, but not for many practical applications that yield profit (in terms of rewards) in every period. The reason is that the average reward accumulates and enters the state(-action) values as a term of $\mathsf{Avg Reward} / (1-\gamma)$. Thus, if the average reward is $>0$, it very quickly i) inflates the state-action values tremendously and ii) dilutes the estimated state-action values, so that it becomes very hard to distinguish between good and bad actions, because the average reward is (in recurrent MDPs) the same for all states.
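To make the $\mathsf{Avg Reward} / (1-\gamma)$ effect concrete, here is a minimal sketch of my own (not taken from any of the papers mentioned in this thread). It assumes a made-up two-state recurrent chain under a fixed policy with a reward in every period; the names `discounted_values` and `average_reward` and all numbers are hypothetical and only serve the illustration.

```python
import numpy as np

def discounted_values(P, r, gamma):
    """Exact discounted state values of a fixed policy: V = r + gamma * P V."""
    n = len(r)
    return np.linalg.solve(np.eye(n) - gamma * P, r)

def average_reward(P, r):
    """Average reward of a recurrent chain: stationary distribution times rewards."""
    n = len(r)
    # Solve mu^T P = mu^T together with sum(mu) = 1 as a least-squares system.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    mu = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(mu @ r)

# Hypothetical 2-state chain (assumption): a reward is paid in every period.
P = np.array([[0.5, 0.5],
              [0.5, 0.5]])
r = np.array([10.0, 11.0])

rho = average_reward(P, r)
for gamma in (0.9, 0.99, 0.999):
    V = discounted_values(P, r, gamma)
    offset = rho / (1.0 - gamma)   # the shared Avg Reward / (1 - gamma) term
    gap = V.max() - V.min()        # what actually separates the two states
    print(f"gamma={gamma}: V={V}, offset={offset:.1f}, gap={gap:.3f}")
```

In this toy chain the gap between the two states stays the same for every $\gamma$, while the shared offset grows without bound as $\gamma \to 1$, so a function approximator would have to resolve a small difference on top of a very large common value.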
If you want to use reinforcement learning for such systems, you have to turn to average reward reinforcement learning. I am currently working on a paper on this, but you can look into the early works of Mahadevan, possibly starting with this one:
Mahadevan, S. (1996). Average reward reinforcement learning: Foundations, algorithms, and empirical results. Machine learning, 22(1-3), 159-195.
Actually, if we talk about optimality in the sense of reinforcement learning as an optimization problem, then average reward reinforcement learning can theoretically find better policies than discounted reinforcement learning. However, from a practical point of view this still has to be shown, and that is what I am working on right now.
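A toy illustration of how the two criteria can disagree, as a sketch under my own assumptions (it is not the algorithm from any of the papers in this thread): from a start state, a hypothetical "greedy" action pays 5 once and then 1 per step forever, while a hypothetical "patient" action pays 0 once and then 2 per step forever. All names and numbers are made up.

```python
# Each action: (one-off first reward, reward per step in the loop afterwards).
# These values are assumptions chosen only for illustration.
ACTIONS = {"greedy": (5.0, 1.0), "patient": (0.0, 2.0)}

def discounted_return(first_reward, loop_reward, gamma):
    """first_reward + gamma * loop_reward + gamma^2 * loop_reward + ..."""
    return first_reward + gamma * loop_reward / (1.0 - gamma)

def best_discounted(gamma):
    """Action preferred by the discounted criterion for a given gamma."""
    return max(ACTIONS, key=lambda a: discounted_return(*ACTIONS[a], gamma))

def best_average():
    """Action preferred by the average reward criterion (long-run reward per step)."""
    return max(ACTIONS, key=lambda a: ACTIONS[a][1])

for gamma in (0.8, 0.9):
    print(gamma, best_discounted(gamma), best_average())
```

Here the discounted criterion only switches to the "patient" action once $\gamma > 5/6$, whereas the average reward criterion prefers it regardless. This only shows that the two criteria can rank policies differently for a fixed $\gamma$ in a constructed example; it says nothing about practical performance.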
Besides robotics ... games, multi-agent systems, recommender systems, building systems control (e.g. HVAC), or anything that you can roughly formalize as an MDP. I am applying it to architectural design in my own research.
In general, reinforcement learning is a framework one can apply to all stochastic, unknown, sequential decision-making tasks. "Unknown", in this case, means that the actual model, i.e. the Markov decision process, and hence the dynamics of the environment, are not known.
In our research, we focus on spatio-temporal applications of reinforcement learning. We claim that, due to the availability of more and more data in smart cities, e.g. from parking sensors, existing routing algorithms can be improved. However, even if real-time information about the environment is available, future development is often stochastic and unknown. Our vision paper provides information about this general idea:
Article Vision Paper: Reinforcement Learning in Smart Spatio-Tempora...
A concrete example where existing work already applies reinforcement learning to spatio-temporal tasks is ride-sharing, or taxicab dispatching.
Simultaneous Electric Powertrain Hardware and Energy Management Optimization of a Hybrid Electric Vehicle Using Deep Reinforcement Learning and Bayesian Optimization
Conference Paper Simultaneous Electric Powertrain Hardware and Energy Managem...
I have finally put together the paper showing where RL is applicable, or rather when discounted RL is not applicable, and I also established an algorithm that is applicable in such cases. Maybe this paper gives you more insight into the method:
Preprint Average Reward Adjusted Discounted Reinforcement Learning: N...