Compared to policy-based deep RL algorithms, does this category of algorithms have lower exploration efficiency?

I found this information in the following article:

"Optimal energy management strategies for energy internet via deep reinforcement learning approach" (Hua et al., 2019), but it didn't cite a source. Is this common knowledge in the field?

Thank you!
