Questions from Gavin Rens
Recently, some work has been done on planning and learning in Non-Markovian Decision Processes, that is, decision-making with temporally extended rewards. In these settings, a particular reward is...
04 April 2019 8,660 2 View
Which online (approximate) POMDP planning algorithm is generally the most effective these days?
08 August 2016 1,150 0 View
When the reward function is defined as R(a,s), the value function is defined as max_{a in A} (rho(a,b) + ...), where A is the set of actions, b is the current belief state and rho(a,b) is the...
05 May 2016 2,975 3 View