Dear all,
I am currently developing a reinforcement learning agent in Python for the optimal control of energy systems. How can I plot the agent's cumulative reward vs. episode number, and its penalty values vs. episode number, either during or after training?
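To make the question concrete, here is a rough sketch of the kind of plot I have in mind. It assumes I log one cumulative reward and one cumulative penalty per episode. `run_episode` is a hypothetical placeholder that draws random numbers instead of stepping my real agent and environment, and the smoothing window of 20 episodes is an arbitrary choice. I am not sure this is the usual way to do it.

```python
import random


def run_episode(rng, n_steps=24):
    """Hypothetical placeholder for one episode of the real agent/environment loop.

    Returns (cumulative_reward, cumulative_penalty) for the episode.
    """
    total_reward, total_penalty = 0.0, 0.0
    for _ in range(n_steps):
        reward = rng.uniform(-1.0, 1.0)   # stand-in for the reward of one env step
        penalty = max(0.0, -reward)       # stand-in for a constraint-violation penalty
        total_reward += reward
        total_penalty += penalty
    return total_reward, total_penalty


def train(n_episodes=200, seed=0):
    """Run all episodes and keep one reward and one penalty value per episode."""
    rng = random.Random(seed)
    rewards, penalties = [], []
    for _ in range(n_episodes):
        episode_reward, episode_penalty = run_episode(rng)
        rewards.append(episode_reward)
        penalties.append(episode_penalty)
    return rewards, penalties


def moving_average(values, window=20):
    """Trailing mean over at most `window` recent values; same length as the input."""
    smoothed, running = [], 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def plot_training(rewards, penalties, window=20):
    """Plot reward and penalty against episode number on two stacked axes."""
    import matplotlib.pyplot as plt  # imported here so the logging code runs without matplotlib

    episodes = range(1, len(rewards) + 1)
    fig, (ax_reward, ax_penalty) = plt.subplots(2, 1, sharex=True, figsize=(8, 6))
    ax_reward.plot(episodes, rewards, alpha=0.3, label="per episode")
    ax_reward.plot(episodes, moving_average(rewards, window), label=f"{window}-episode mean")
    ax_reward.set_ylabel("Cumulative reward")
    ax_reward.legend()
    ax_penalty.plot(episodes, penalties, alpha=0.3)
    ax_penalty.plot(episodes, moving_average(penalties, window))
    ax_penalty.set_ylabel("Penalty")
    ax_penalty.set_xlabel("Episode")
    fig.tight_layout()
    plt.show()
    return fig
```

After training, I would call `rewards, penalties = train()` followed by `plot_training(rewards, penalties)`.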
Also, how can I train several different agents on the same environment and plot their cumulative reward vs. episode number curves on a single figure?
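For the comparison, this is roughly what I imagine: one list of per-episode rewards per agent, all drawn on a single set of axes. Everything here is an assumption for illustration. `train_agent` is a hypothetical stand-in that generates a noisy synthetic learning curve rather than training anything, and the agent names are only legend labels.

```python
import random


def train_agent(learning_speed, n_episodes=200, seed=0):
    """Hypothetical stand-in for training one agent.

    Returns a synthetic list with one cumulative reward per episode.
    """
    rng = random.Random(seed)
    return [
        100.0 * (1.0 - (1.0 - learning_speed) ** episode) + rng.gauss(0.0, 5.0)
        for episode in range(n_episodes)
    ]


def collect_curves(agent_configs, n_episodes=200, seed=0):
    """Train every agent with the same episode count and seed so the curves are comparable."""
    return {
        name: train_agent(speed, n_episodes, seed)
        for name, speed in agent_configs.items()
    }


def plot_comparison(curves):
    """Draw one reward curve per agent on the same axes."""
    import matplotlib.pyplot as plt  # imported here so data collection runs without matplotlib

    fig, ax = plt.subplots(figsize=(8, 4))
    for name, rewards in curves.items():
        ax.plot(range(1, len(rewards) + 1), rewards, label=name)
    ax.set_xlabel("Episode")
    ax.set_ylabel("Cumulative reward")
    ax.legend()
    fig.tight_layout()
    plt.show()
    return fig
```

I would then call something like `plot_comparison(collect_curves({"Agent A": 0.02, "Agent B": 0.05}))`, with the real training function swapped in for `train_agent`.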
I have seen plots like these in many research articles on deep reinforcement learning (DRL).
Any recommendations, links, or code samples (preferably in Python) for the implementation would be appreciated.
Best Regards,
Michael.