I am studying the experimental application of reinforcement learning (RL) based control methods. To that end, I modeled a nonlinear system without delay and trained an RL agent on it using the deep deterministic policy gradient (DDPG) method. The trained agent worked well on the non-delayed system.
As you might guess, control delay is an inseparable part of experimental systems, and it degrades the performance of the control method. This is also true of my DDPG agent on the experimental setup: when I compare the simulation results with the experimental ones, the difference is considerable.
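To make the kind of delay concrete, here is a minimal sketch of how a control delay could be modeled in simulation, assuming the delay is constant and equal to a whole number of sample periods. The names `DelayedActionBuffer` and `simulate`, the toy plant, the proportional controller standing in for the agent, and all numeric values are illustrative assumptions, not the actual setup.

```python
from collections import deque


class DelayedActionBuffer:
    """Hypothetical helper: delays each control action by `delay_steps` samples."""

    def __init__(self, delay_steps, initial_action=0.0):
        self.delay_steps = delay_steps
        # Pre-fill with a neutral action so the first outputs are defined.
        self.queue = deque([initial_action] * delay_steps)

    def push(self, action):
        """Store the newest action and return the one the plant receives now."""
        self.queue.append(action)
        return self.queue.popleft()


def simulate(delay_steps, n_steps=200, dt=0.05, gain=4.0, target=1.0):
    """Roll out a toy nonlinear plant x' = -x**3 + u with a delayed input.

    A proportional controller stands in for the trained agent (assumption).
    Returns the accumulated squared tracking error.
    """
    buffer = DelayedActionBuffer(delay_steps)
    x, cost = 0.0, 0.0
    for _ in range(n_steps):
        u_cmd = gain * (target - x)        # action computed from current state
        u_applied = buffer.push(u_cmd)     # action that actually reaches the plant
        x += dt * (-x ** 3 + u_applied)    # explicit Euler step
        cost += dt * (target - x) ** 2
    return cost
```

Calling `simulate(0)` and `simulate(5)` gives a rough way to compare the non-delayed and delayed cases under these assumptions; a real actuator or communication delay may be time-varying, which this sketch does not capture.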
What kinds of RL-based control methods are recommended for experimental systems with delay?