Skip to main content
Log in

Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy

  • Articles
  • Published:
Zeitschrift für Operations Research Aims and scope Submit manuscript

    We’re sorry, something doesn't seem to be working properly.

    Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.

Abstract

In this paper we construct an algorithm of successive approximation type that computes a locally optimal periodic policy for a Markov decision chain with partial state information. The algorithm is applied to a queueing network with server control.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Altman E, Nain Ph (1992) Closed-loop control with delayed information. Performance Evaluation Review 20:193–204

    Google Scholar 

  2. Bertsekas DP (1987) Dynamic programming, deterministic and stochastic models. Prentice-Hall, Englewood Cliffs, New Jersey

    Google Scholar 

  3. Chang CS, Chao X, Pinedo M (1990) A note on queues with Bernoulli routing. Proceedings of the 29th Conference on Decision and Control

  4. Hastings NAJ, Sadjadi D (1979) Markov programming with policy constraints. European Journal of Operations Research 3:253–255

    Google Scholar 

  5. Hordijk A, Koole G (1993) On the optimality of LEPT andμc-rules for parallel processors and dependent arrival processes. Advances in Applied Probability 25:979–996

    Google Scholar 

  6. Hordijk A, Koole G, Loeve JA (1993) Analysis of a customer assignment model with no state information. Technical Report TW-93-15, Leiden University. To appear in Probability in the Engineering and Informational Sciences

  7. Hsu K, Marcus IM (1982) Decentralized control of finite state Markov processes. IEEE Transactions on Automatic Control AC-27, no. 2:426–431

    Google Scholar 

  8. Koole G (1992) Stochastic scheduling and dynamic programming. Ph.D. Thesis, Leiden University

  9. Kulkarni VG, Serin Y (1990) Optimal implementable policies: Discounted cost case. Technical Report no. UNC/OR/TR-90/2

  10. Kulkarni VG, Serin Y (1990) Optimal implementable policies: Average cost and minimax criteria. Technical Report no. UNC/OR TR-90/9

  11. Lovejoy WS (1991) A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research 28:47–65

    Google Scholar 

  12. Monahan GE (1982) A survey of partially observable Markov decision processes: Theory, models and algorithms. Management Science 28:1–16

    Google Scholar 

  13. Schneeberger S (1992) Markov-Entscheidungs-Prozesse mit abhängigen Aktionen für optimale Reparaturmaßnahmen bei unvollständiger Information. OR Spektrum 14:71–78

    Google Scholar 

  14. Schoute FC (1978) Decentralized control in packet switched satellite communication. IEEE Transactions on Automatic Control AC-23, no. 2:362–371

    Google Scholar 

  15. Smallwood R, Sondik E (1973) The optimal control of partially observable Markov processes over a finite horizon. Operations Research 21:1071–1088

    Google Scholar 

  16. Smith JL (1971) Markov decisions on a partitioned state space. IEEE Transactions on Systems, Man and Cybernetics SMC-1, no. 1:55–60

    Google Scholar 

  17. Sondik E (1978) The optimal control of partially observable Markov processes over the infinite horizon: discounted costs. Operations Research 26:282–304

    Google Scholar 

  18. Tijms HC (1988) Stochastic modelling and analysis: A computational approach. John Wiley & Sons Ltd., Chichester

    Google Scholar 

  19. White CC (1991) A survey of solution techniques for the partially observable Markov decision processes. Annals of Operations Research 32:215–230

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Additional information

The research of this author has been supported by the Netherlands Organization for Scientific Research (N.W.O.).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hordijk, A., Loeve, J.A. Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy. ZOR - Methods and Models of Operations Research 40, 163–181 (1994). https://doi.org/10.1007/BF01432808

Download citation

  • Received:

  • Revised:

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF01432808

Key words

Navigation