Say you want to optimize the buy/sell controls for a portfolio with transaction costs over 180 periods.
In theory, you would use spectral decomposition, search for viscosity solutions, discretize, etc., and then solve backwards, but you either run into the curse of dimensionality or end up with solutions that are unstable and too difficult to interpret anyway. Some adaptive online sampling methods seem to work sufficiently well (Q-learning, TD learning, NDP, SMC, etc.). Has anyone used them? Thanks
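
To make the question concrete, here is a minimal sketch of tabular Q-learning on a toy version of the problem. Everything in it is an assumption for illustration only: a single asset, holdings that are either flat or long one unit, an observable two-state regime driving returns of plus or minus 1%, a proportional cost `cost` per unit traded, and the helper names `step_market`, `train` and `count_trades`. It is not meant as a realistic market model, just the shape of the method.

```python
import random

T = 180              # number of trading periods, as in the question
POSITIONS = (0, 1)   # assumed holdings: flat or long one unit
REGIMES = (0, 1)     # assumed observable market state: 0 = down, 1 = up
STEP = 0.01          # assumed size of a one-period return


def step_market(regime, rng):
    """Toy dynamics: draw this period's return, then maybe switch regime."""
    p_up = 0.6 if regime == 1 else 0.4
    ret = STEP if rng.random() < p_up else -STEP
    next_regime = 1 - regime if rng.random() < 0.1 else regime
    return ret, next_regime


def greedy(q):
    """Index of the best action; ties go to the lowest index."""
    return max(range(len(q)), key=q.__getitem__)


def train(cost, episodes=2000, alpha=0.1, eps=0.1, seed=0):
    """Finite-horizon tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    n_act = len(POSITIONS)
    # Q[t][position index][regime][action]; slice T is the terminal value (zeros).
    Q = [[[[0.0] * n_act for _ in REGIMES] for _ in POSITIONS]
         for _ in range(T + 1)]
    for _ in range(episodes):
        pos, regime = 0, rng.choice(REGIMES)
        for t in range(T):
            q = Q[t][pos][regime]
            a = rng.randrange(n_act) if rng.random() < eps else greedy(q)
            ret, next_regime = step_market(regime, rng)
            # Reward: P&L on the new holding minus proportional trading cost.
            traded = abs(POSITIONS[a] - POSITIONS[pos])
            reward = POSITIONS[a] * ret - cost * traded
            # Undiscounted one-step target, since the horizon is finite.
            target = reward + max(Q[t + 1][a][next_regime])
            q[a] += alpha * (target - q[a])
            pos, regime = a, next_regime
    return Q


def count_trades(Q, seed=1):
    """Roll out the greedy policy once and count position changes."""
    rng = random.Random(seed)
    pos, regime, trades = 0, 1, 0
    for t in range(T):
        a = greedy(Q[t][pos][regime])
        trades += int(a != pos)
        _, regime = step_market(regime, rng)
        pos = a
    return trades


if __name__ == "__main__":
    for c in (0.0, 0.002, 10.0):
        print(c, count_trades(train(cost=c)))
```

With a prohibitive cost such as `cost=10.0` the greedy policy in this sketch never trades, since no run of 1% returns over 180 periods can pay for a single trade. The interesting question is how stable the learned policy is for realistic costs and a larger state space, where the table would have to be replaced by some function approximation.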