Recasts a class of infinite-state, infinite-action Markov renewal programs with unknown parameters as one-state programs whose actions correspond to stationary policies in the original program. Under suitable conditions, an adaptive (nonstationary) policy is found that is optimal in the sense of maximizing the long-run expected reward per unit time. 26 pp. Ref.
This report is part of the RAND Corporation Report series. The report was a product of the RAND Corporation from 1948 to 1993 that represented the principal publication documenting and transmitting RAND's major research findings and final research.
The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.