I am implementing a SARSA(lambda) model in C++ to overcome some of the limitations of DP models, namely the sheer amount of time and space they require. Hopefully this will reduce the computation time (similar research currently takes quite a few hours), and the lower memory use will allow adding more complexity to the model.
The thing is, we do have explicit transition probabilities, and they do make a difference. So how should we incorporate them in a SARSA model? Should we simply sample the next state according to the probabilities themselves? Apparently SARSA models don't exactly expect you to use probabilities, or perhaps I've been reading the wrong books.
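To make the question concrete, here is a minimal sketch of what I mean by selecting the next state according to the probabilities. It assumes a tabular Q, accumulating eligibility traces, and a known row of transition probabilities for the current state-action pair. The names `sampleNextState`, `sarsaLambdaUpdate` and `Table` are hypothetical, made up for illustration, and not taken from any library or from our actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical tabular layout: table[state][action].
using Table = std::vector<std::vector<double>>;

// Draw the next state s' from the known distribution P(. | s, a).
// `probs` holds one non-negative weight per possible next state.
int sampleNextState(const std::vector<double>& probs, std::mt19937& rng) {
    assert(!probs.empty());
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return dist(rng);
}

// One tabular SARSA(lambda) update with accumulating traces.
// (s, a) is the current pair, r the observed reward, (s2, a2) the
// sampled next state and the action chosen there by the policy.
void sarsaLambdaUpdate(Table& Q, Table& E,
                       int s, int a, double r, int s2, int a2,
                       bool terminal,
                       double alpha, double gamma, double lambda) {
    // TD error; a terminal next state contributes no bootstrap value.
    const double target = r + (terminal ? 0.0 : gamma * Q[s2][a2]);
    const double delta = target - Q[s][a];

    // Accumulating trace for the visited pair.
    E[s][a] += 1.0;

    // Move every entry toward the target in proportion to its trace,
    // then decay the traces.
    for (std::size_t i = 0; i < Q.size(); ++i) {
        for (std::size_t j = 0; j < Q[i].size(); ++j) {
            Q[i][j] += alpha * delta * E[i][j];
            E[i][j] *= gamma * lambda;
        }
    }
}
```

In this sketch the probabilities are used only to simulate the environment: each step draws `s2` with `sampleNextState`, picks `a2` with the behaviour policy, and then calls `sarsaLambdaUpdate`. The update itself never reads the probabilities, which is the part I am unsure about.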
PS: Is there a way of knowing whether the algorithm is properly implemented? This is my first time working with SARSA.
Any help would be appreciated!