After a lot of effort, I still cannot understand the PAC guarantee or the fitted Q-learning algorithm. I need a paper, book, or article that explains these topics thoroughly.
