We consider discounted Markov decision processes (MDPs) with countably-infinite state spaces, finite action spaces, and unbounded rewards. Typical examples of such MDPs are inventory management and ...
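For concreteness, a standard way to write the discounted objective for this class of MDPs is the following (a generic textbook formulation, not taken from the truncated text above; the symbols $\mathcal{S}$, $\mathcal{A}$, $r$, and $\gamma$ are introduced here for illustration):

\[
V^{\pi}(s) \;=\; \mathbb{E}^{\pi}\!\left[\,\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \;\middle|\; s_0 = s\right], \qquad \gamma \in (0,1),
\]

where the state space $\mathcal{S}$ is countably infinite, the action set $\mathcal{A}(s)$ is finite for every $s$, and the reward $r$ may be unbounded; the goal is to find a policy $\pi$ maximizing $V^{\pi}(s)$ for all states $s$.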
These criteria are useful when you want to divide a time-consuming optimization problem into a series of smaller problems. Since the Nelder-Mead simplex algorithm does not use derivatives, no ...
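As a minimal sketch of a derivative-free Nelder-Mead call, the snippet below uses SciPy's scipy.optimize.minimize; the objective function, starting point, and tolerances are illustrative assumptions, not taken from the original text.

```python
# Minimal sketch: derivative-free minimization with the Nelder-Mead simplex method.
# The objective (Rosenbrock) and the tolerance/iteration settings are illustrative
# choices, not from the source text.
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Rosenbrock function: a common smooth test problem. No gradient is supplied
    # or required, since Nelder-Mead only evaluates the objective itself.
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

x0 = np.array([-1.2, 1.0])  # starting point used to build the initial simplex

result = minimize(
    objective,
    x0,
    method="Nelder-Mead",
    options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 2000},
)

print(result.x, result.fun)  # approximate minimizer and its objective value
```

Because the method only ever evaluates the objective, a call like this works even when derivatives are unavailable or unreliable, which is the property the sentence above refers to.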