Uncertainty in Artificial Intelligence
Learning Finite-State Controllers for Partially Observable Environments
Nicolas Meuleau, Leonid Peshkin, Kee-Eung Kim, Leslie Kaelbling
Abstract:
Reactive (memoryless) policies are sufficient in completely observable Markov decision processes (MDPs), but some kind of memory is usually necessary for optimal control of a partially observable MDP. Policies with finite memory can be represented as finite-state automata. In this paper, we extend Baird and Moore's VAPS algorithm to the problem of learning general finite-state automata. Because it performs stochastic gradient descent, this algorithm can be shown to converge to a locally optimal finite-state controller. We provide the details of the algorithm and then consider under what conditions stochastic gradient descent will outperform exact gradient descent. We conclude with empirical results comparing the performance of stochastic and exact gradient descent, and showing our algorithm's ability to extract the useful information contained in the sequence of past observations to compensate for the lack of observability at each time step.
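The controller the abstract describes can be sketched in a few lines of Python. What follows is a minimal illustration, not the authors' implementation: a stochastic finite-state controller whose per-node action distribution and per-(node, observation) transition distribution are trained by REINFORCE-style stochastic gradient ascent on the episodic return. The AlternationTask environment and all parameter names are hypothetical, invented for this example; VAPS itself is more general (it optimizes a family of per-step error measures), but the core idea of stochastic gradient ascent on the parameters of a finite-state controller is the same.

import numpy as np

rng = np.random.default_rng(0)

N_NODES, N_OBS, N_ACTIONS = 2, 1, 2  # controller memory size, observation/action spaces

# FSC parameters: action logits per memory node, and next-node logits per
# (node, observation) pair. Softmax maps logits to probability distributions.
theta_act = np.zeros((N_NODES, N_ACTIONS))
theta_node = np.zeros((N_NODES, N_OBS, N_NODES))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class AlternationTask:
    """Toy partially observable task: the observation is always 0 and the
    reward is +1 whenever the action differs from the previous action.
    A reactive (memoryless) policy earns at most 0.5 per step in
    expectation; a two-node FSC can alternate perfectly."""
    def reset(self):
        self.prev = None
        return 0
    def step(self, a):
        r = 1.0 if (self.prev is not None and a != self.prev) else 0.0
        self.prev = a
        return 0, r

def run_episode(env, horizon=20):
    """Roll out the FSC, accumulating score-function (REINFORCE) gradients."""
    obs, node, ret = env.reset(), 0, 0.0
    g_act, g_node = np.zeros_like(theta_act), np.zeros_like(theta_node)
    for _ in range(horizon):
        p_a = softmax(theta_act[node])
        a = rng.choice(N_ACTIONS, p=p_a)
        g_act[node] += np.eye(N_ACTIONS)[a] - p_a        # d log pi(a|node) / d logits
        obs, r = env.step(a)
        p_n = softmax(theta_node[node, obs])
        nxt = rng.choice(N_NODES, p=p_n)
        g_node[node, obs] += np.eye(N_NODES)[nxt] - p_n  # d log pi(nxt|node,obs) / d logits
        node, ret = nxt, ret + r
    return ret, g_act, g_node

env, alpha, baseline = AlternationTask(), 0.05, 0.0
for episode in range(3000):
    ret, g_act, g_node = run_episode(env)
    adv = ret - baseline                  # baseline subtraction reduces variance
    baseline += 0.05 * (ret - baseline)
    theta_act += alpha * adv * g_act      # stochastic gradient ascent on return
    theta_node += alpha * adv * g_node

With two memory nodes the controller can latch the previous action into its internal state, which is exactly the kind of information carried by past observations and actions that a memoryless policy cannot exploit.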
Keywords: POMDP, Finite-state Controller, Reinforcement Learning
Pages: 427-436
PS Link: http://www.cs.brown.edu/people/nm/PS_files/uai99_2.ps
PDF Link: /papers/99/p427-meuleau.pdf
BibTex:
@INPROCEEDINGS{Meuleau99,
AUTHOR = "Nicolas Meuleau and Leonid Peshkin and Kee-Eung Kim and Leslie Kaelbling",
TITLE = "Learning Finite-State Controllers for Partially Observable Environments",
BOOKTITLE = "Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99)",
PUBLISHER = "Morgan Kaufmann",
ADDRESS = "San Francisco, CA",
YEAR = "1999",
PAGES = "427--436"
}

