Uncertainty in Artificial Intelligence
First Name   Last Name   Password   Forgot Password   Log in!
    Proceedings   Proceeding details   Article details         Authors         Search    
On the Complexity of Policy Iteration
Yishay Mansour, Satinder Singh
Abstract:
Decision-making problems in uncertain or stochastic domains are often formulated as Markov decision processes (MDPs). Policy iteration (PI) is a popular algorithm for searching over policy-space, the size of which is exponential in the number of states. We are interested in bounds on the complexity of PI that do not depend on the value of the discount factor. In this paper we prove the first such non-trivial, worst-case, upper bounds on the number of iterations required by PI to converge to the optimal policy. Our analysis also sheds new light on the manner in which PI progresses through the space of policies.
Keywords: MDPs, Policy Iteration, Complexity
Pages: 401-408
PS Link: http://www.research.att.com/~baveja/Papers/MansourSinghUAI99.ps.gz
PDF Link: /papers/99/p401-mansour.pdf
BibTex:
@INPROCEEDINGS{Mansour99,
AUTHOR = "Yishay Mansour and Satinder Singh",
TITLE = "On the Complexity of Policy Iteration",
BOOKTITLE = "Proceedings of the Fifteenth Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI-99)",
PUBLISHER = "Morgan Kaufmann",
ADDRESS = "San Francisco, CA",
YEAR = "1999",
PAGES = "401--408"
}


hosted by DSL   •   site info   •   help