I'm trying to reproduce the paper on Deep Reinforcement Learning-Based Spectrum Allocation in Integrated Access and Backhaul Networks. I need help with the implementation of the actor-critic algorithm for resource allocation in the IAB network. I need someone to enlighten me. Pls :(((