Abstract
In this paper we construct an algorithm of successive approximation type that computes a locally optimal periodic policy for a Markov decision chain with partial state information. The algorithm is applied to a queueing network with server control.
Similar content being viewed by others
References
Altman E, Nain Ph (1992) Closed-loop control with delayed information. Performance Evaluation Review 20:193–204
Bertsekas DP (1987) Dynamic programming, deterministic and stochastic models. Prentice-Hall, Englewood Cliffs, New Jersey
Chang CS, Chao X, Pinedo M (1990) A note on queues with Bernoulli routing. Proceedings of the 29th Conference on Decision and Control
Hastings NAJ, Sadjadi D (1979) Markov programming with policy constraints. European Journal of Operations Research 3:253–255
Hordijk A, Koole G (1993) On the optimality of LEPT andμc-rules for parallel processors and dependent arrival processes. Advances in Applied Probability 25:979–996
Hordijk A, Koole G, Loeve JA (1993) Analysis of a customer assignment model with no state information. Technical Report TW-93-15, Leiden University. To appear in Probability in the Engineering and Informational Sciences
Hsu K, Marcus IM (1982) Decentralized control of finite state Markov processes. IEEE Transactions on Automatic Control AC-27, no. 2:426–431
Koole G (1992) Stochastic scheduling and dynamic programming. Ph.D. Thesis, Leiden University
Kulkarni VG, Serin Y (1990) Optimal implementable policies: Discounted cost case. Technical Report no. UNC/OR/TR-90/2
Kulkarni VG, Serin Y (1990) Optimal implementable policies: Average cost and minimax criteria. Technical Report no. UNC/OR TR-90/9
Lovejoy WS (1991) A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research 28:47–65
Monahan GE (1982) A survey of partially observable Markov decision processes: Theory, models and algorithms. Management Science 28:1–16
Schneeberger S (1992) Markov-Entscheidungs-Prozesse mit abhängigen Aktionen für optimale Reparaturmaßnahmen bei unvollständiger Information. OR Spektrum 14:71–78
Schoute FC (1978) Decentralized control in packet switched satellite communication. IEEE Transactions on Automatic Control AC-23, no. 2:362–371
Smallwood R, Sondik E (1973) The optimal control of partially observable Markov processes over a finite horizon. Operations Research 21:1071–1088
Smith JL (1971) Markov decisions on a partitioned state space. IEEE Transactions on Systems, Man and Cybernetics SMC-1, no. 1:55–60
Sondik E (1978) The optimal control of partially observable Markov processes over the infinite horizon: discounted costs. Operations Research 26:282–304
Tijms HC (1988) Stochastic modelling and analysis: A computational approach. John Wiley & Sons Ltd., Chichester
White CC (1991) A survey of solution techniques for the partially observable Markov decision processes. Annals of Operations Research 32:215–230
Author information
Authors and Affiliations
Additional information
The research of this author has been supported by the Netherlands Organization for Scientific Research (N.W.O.).
Rights and permissions
About this article
Cite this article
Hordijk, A., Loeve, J.A. Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy. ZOR - Methods and Models of Operations Research 40, 163–181 (1994). https://doi.org/10.1007/BF01432808
Received:
Revised:
Issue Date:
DOI: https://doi.org/10.1007/BF01432808