Can anyone please suggest a paper, book, or article on PAC analysis of LSPI and of fitted Q-learning?

Mostafa Eidiani

Please see these links:

http://arxiv.org/pdf/1004.2027

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.77.5239&rep=rep1&type=pdf

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.309.8387&rep=rep1&type=pdf

http://rl-projects.googlecode.com/hg-history/a8d9bee184fd9a1248630b06561558e6073c5de1/RL-papers/Li09Unifying.pdf

Amir-massoud Farahmand

Hi,

There are many sources to learn about PAC results (or more generally, Statistical Learning Theory), especially if the focus is on the supervised learning setting.

I have compiled an incomplete list of resources that might be helpful at the end of these slides:

http://www.cs.mcgill.ca/~dprecup/courses/ML/Lectures/ml-lecture13and14.pdf

The analysis of RL algorithms is usually more difficult than the analysis of supervised learning algorithms.

For the theoretical analysis of Fitted Q-Iteration (an Approximate Value Iteration algorithm), take a look at the following paper:

Remi Munos and Csaba Szepesvari, "Finite Time Bounds for Fitted Value Iteration," JMLR, 2008.
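
To make the structure of Fitted Q-Iteration concrete, here is a minimal sketch, assuming a tabular (one-hot) function class, in which the least-squares regression step reduces to averaging the targets seen for each (state, action) pair. This is only an illustration: the paper above deals with general function spaces, and the function name, the toy transitions, and the parameter defaults are all hypothetical.

```python
from collections import defaultdict


def fitted_q_iteration(transitions, actions, gamma=0.9, n_iters=50):
    """Batch Fitted Q-Iteration with a tabular (one-hot) function class.

    transitions: list of (state, action, reward, next_state, done) tuples.
    Each iteration builds regression targets from the current Q estimate
    and then fits a new Q to them. With one-hot features the least-squares
    fit is the mean target per (state, action) pair.
    """
    q = defaultdict(float)  # Q_0 = 0 everywhere
    for _ in range(n_iters):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next, done in transitions:
            # Target: r + gamma * max_a' Q_k(s', a'), with no bootstrap at terminal steps
            bootstrap = 0.0 if done else max(q[(s_next, b)] for b in actions)
            sums[(s, a)] += r + gamma * bootstrap
            counts[(s, a)] += 1
        # Fit step: per-pair average is the least-squares solution for one-hot features
        new_q = defaultdict(float)
        for key, total in sums.items():
            new_q[key] = total / counts[key]
        q = new_q
    return q


# Hypothetical two-state chain: "go" moves 0 -> 1 -> terminal (reward 1 at the end),
# "stay" keeps the agent where it is with reward 0.
toy_transitions = [
    (0, "go", 0.0, 1, False),
    (0, "stay", 0.0, 0, False),
    (1, "go", 1.0, 2, True),
    (1, "stay", 0.0, 1, False),
]
q = fitted_q_iteration(toy_transitions, actions=["go", "stay"], gamma=0.9)
```

With a richer function class the fit step would be replaced by a real regression (linear least squares, trees, neural nets, and so on), and the finite-sample analysis is about how the regression error at each iteration propagates to the final policy.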

For LSPI-like algorithms, take a look at the following papers:

* Andras Antos, Csaba Szepesvari, and Remi Munos, "Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path," Machine Learning, 2008.

* Alessandro Lazaric, Mohammad Ghavamzadeh, and Remi Munos. "Finite-Sample Analysis of Least-Squares Policy Iteration," JMLR, 2012.

For the regularized variants of these algorithms (i.e., Regularized Fitted Q-Iteration and Regularized LSPI), take a look at my PhD thesis:

Amir-massoud Farahmand, Regularization in Reinforcement Learning, 2011.

(available at http://hdl.handle.net/10048/2387)

Hope it helps.

Amir-massoud
