Shows how to compute policies that maximize a certain reasonable criterion (i.e., policies that are (g,w)-optimal) for the undiscounted infinite-horizon versions of two Markov decision problems: one a discrete-time model and the other a continuous-time model. The approach is to decompose the overall problem into at most three smaller problems that can be solved in sequence by linear programming or policy iteration. 33 pp. Ref
This report is part of the RAND Corporation research memorandum series. The Research Memorandum was a product of the RAND Corporation from 1948 to 1973 that represented working papers meant to report current results of RAND research to appropriate audiences.