Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy

Hordijk, A.; Loeve, J. A.

doi:10.1007/BF01432808

Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy

Articles
Published: June 1994

Volume 40, pages 163–181, (1994)
Cite this article

Zeitschrift für Operations Research Aims and scope Submit manuscript

A. Hordijk¹ &
J. A. Loeve¹

We’re sorry, something doesn't seem to be working properly.

Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.

Abstract

In this paper we construct an algorithm of successive approximation type that computes a locally optimal periodic policy for a Markov decision chain with partial state information. The algorithm is applied to a queueing network with server control.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Markov Decision Processes with Discounted Rewards: Improved Successive Over-Relaxation Method

Markov Decision Processes with Discounted Costs: Improved Successive Over-Relaxation Method

Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition

References

Altman E, Nain Ph (1992) Closed-loop control with delayed information. Performance Evaluation Review 20:193–204
Google Scholar
Bertsekas DP (1987) Dynamic programming, deterministic and stochastic models. Prentice-Hall, Englewood Cliffs, New Jersey
Google Scholar
Chang CS, Chao X, Pinedo M (1990) A note on queues with Bernoulli routing. Proceedings of the 29th Conference on Decision and Control
Hastings NAJ, Sadjadi D (1979) Markov programming with policy constraints. European Journal of Operations Research 3:253–255
Google Scholar
Hordijk A, Koole G (1993) On the optimality of LEPT andμc-rules for parallel processors and dependent arrival processes. Advances in Applied Probability 25:979–996
Google Scholar
Hordijk A, Koole G, Loeve JA (1993) Analysis of a customer assignment model with no state information. Technical Report TW-93-15, Leiden University. To appear in Probability in the Engineering and Informational Sciences
Hsu K, Marcus IM (1982) Decentralized control of finite state Markov processes. IEEE Transactions on Automatic Control AC-27, no. 2:426–431
Google Scholar
Koole G (1992) Stochastic scheduling and dynamic programming. Ph.D. Thesis, Leiden University
Kulkarni VG, Serin Y (1990) Optimal implementable policies: Discounted cost case. Technical Report no. UNC/OR/TR-90/2
Kulkarni VG, Serin Y (1990) Optimal implementable policies: Average cost and minimax criteria. Technical Report no. UNC/OR TR-90/9
Lovejoy WS (1991) A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research 28:47–65
Google Scholar
Monahan GE (1982) A survey of partially observable Markov decision processes: Theory, models and algorithms. Management Science 28:1–16
Google Scholar
Schneeberger S (1992) Markov-Entscheidungs-Prozesse mit abhängigen Aktionen für optimale Reparaturmaßnahmen bei unvollständiger Information. OR Spektrum 14:71–78
Google Scholar
Schoute FC (1978) Decentralized control in packet switched satellite communication. IEEE Transactions on Automatic Control AC-23, no. 2:362–371
Google Scholar
Smallwood R, Sondik E (1973) The optimal control of partially observable Markov processes over a finite horizon. Operations Research 21:1071–1088
Google Scholar
Smith JL (1971) Markov decisions on a partitioned state space. IEEE Transactions on Systems, Man and Cybernetics SMC-1, no. 1:55–60
Google Scholar
Sondik E (1978) The optimal control of partially observable Markov processes over the infinite horizon: discounted costs. Operations Research 26:282–304
Google Scholar
Tijms HC (1988) Stochastic modelling and analysis: A computational approach. John Wiley & Sons Ltd., Chichester
Google Scholar
White CC (1991) A survey of solution techniques for the partially observable Markov decision processes. Annals of Operations Research 32:215–230
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Leiden, Niels Bohrweg 1, 2333, CA Leiden, The Netherlands
A. Hordijk & J. A. Loeve

Authors

A. Hordijk
View author publications
You can also search for this author in PubMed Google Scholar
J. A. Loeve
View author publications
You can also search for this author in PubMed Google Scholar

Additional information

The research of this author has been supported by the Netherlands Organization for Scientific Research (N.W.O.).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hordijk, A., Loeve, J.A. Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy. ZOR - Methods and Models of Operations Research 40, 163–181 (1994). https://doi.org/10.1007/BF01432808

Download citation

Received: 15 September 1993
Revised: 15 January 1994
Issue Date: June 1994
DOI: https://doi.org/10.1007/BF01432808

Key words

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy

Abstract

Access this article

Similar content being viewed by others

Markov Decision Processes with Discounted Rewards: Improved Successive Over-Relaxation Method

Markov Decision Processes with Discounted Costs: Improved Successive Over-Relaxation Method

Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition

References

Author information

Authors and Affiliations

Additional information

Rights and permissions

About this article

Cite this article

Key words

Navigation

Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy

Abstract

Access this article

Similar content being viewed by others

Markov Decision Processes with Discounted Rewards: Improved Successive Over-Relaxation Method

Markov Decision Processes with Discounted Costs: Improved Successive Over-Relaxation Method

Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition

References

Author information

Authors and Affiliations

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Key words

Search

Navigation