How can I analyze the complexity of the double deep Q-learning (DDQN) algorithm? Does it make sense to compare it with that of an ordinary deep neural network?
Is the time required for convergence, with a state space of size |S| and exploration rate ε, bounded by O(|S| log |S| · log(1/ε) / ε²)?
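
To make the growth of that expression concrete, here is a minimal sketch that simply evaluates it for given values of |S| and ε. It assumes natural logarithms and drops the constant hidden by the big-O. The function name `conjectured_bound` is hypothetical, and the code only computes the formula from the question; it says nothing about whether the bound actually holds for DDQN.

```python
import math


def conjectured_bound(num_states: int, epsilon: float) -> float:
    """Evaluate |S| * log|S| * log(1/eps) / eps^2 with constants dropped.

    This is the expression asked about in the question, not an
    established convergence result. Natural logs are an assumption.
    """
    if num_states < 2:
        raise ValueError("num_states must be at least 2 so that log|S| > 0")
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return (
        num_states
        * math.log(num_states)
        * math.log(1.0 / epsilon)
        / epsilon ** 2
    )


if __name__ == "__main__":
    # Show how the expression scales as |S| grows and epsilon shrinks.
    for n in (10, 100, 1000):
        for eps in (0.5, 0.1, 0.01):
            print(f"|S|={n:5d}  eps={eps:<5}  value={conjectured_bound(n, eps):.3e}")
```

Under this expression the dependence on ε dominates quickly: halving ε multiplies the value by more than four, while doubling |S| only slightly more than doubles it.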