Uncertainty in Artificial Intelligence
Improving Gradient Estimation by Incorporating Sensor Data
Gregory Lawrence, Stuart Russell
An efficient policy search algorithm should estimate the local gradient of the objective function, with respect to the policy parameters, from as few trials as possible. Whereas most policy search methods estimate this gradient by observing the rewards obtained during policy trials, we show, both theoretically and empirically, that taking into account the sensor data as well gives better gradient estimates and hence faster learning. The reason is that rewards obtained during policy execution vary from trial to trial due to noise in the environment; sensor data, which correlates with the noise, can be used to partially correct for this variation, resulting in an estimator with lower variance.
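The variance-reduction argument in the abstract can be illustrated with a small simulation. The Python sketch below is not the authors' algorithm: the one-dimensional quadratic objective, the Gaussian noise model, the sensor that reads the environment noise with a small error, and every function name are assumptions chosen for illustration. It estimates the gradient by least-squares regression of rewards on random policy perturbations, once using rewards alone and once with the sensor reading added as an extra regressor, and compares the variance of the two estimates over repeated experiments.

```python
import random
import statistics


def run_trials(theta, n_trials, rng, perturb_std=0.1, noise_std=1.0, sensor_std=0.1):
    """Simulate policy trials for a hypothetical 1-D problem.

    Returns the parameter perturbations, noisy rewards, and sensor readings.
    """
    deltas, rewards, sensors = [], [], []
    for _ in range(n_trials):
        delta = rng.gauss(0.0, perturb_std)      # random policy perturbation
        w = rng.gauss(0.0, noise_std)            # environment noise (unobserved)
        reward = -(theta + delta - 2.0) ** 2 + w  # assumed objective plus noise
        sensor = w + rng.gauss(0.0, sensor_std)  # sensor correlates with the noise
        deltas.append(delta)
        rewards.append(reward)
        sensors.append(sensor)
    return deltas, rewards, sensors


def center(xs):
    """Subtract the sample mean from every element."""
    m = statistics.fmean(xs)
    return [x - m for x in xs]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gradient_reward_only(deltas, rewards):
    """Slope of rewards regressed on perturbations alone."""
    d, r = center(deltas), center(rewards)
    return dot(d, r) / dot(d, d)


def gradient_with_sensor(deltas, rewards, sensors):
    """Coefficient on the perturbation when the sensor is an extra regressor.

    Solves the 2x2 normal equations of the centered regression
    reward ~ a * delta + b * sensor and returns a.
    """
    d, r, s = center(deltas), center(rewards), center(sensors)
    dd, ss, ds = dot(d, d), dot(s, s), dot(d, s)
    dr, sr = dot(d, r), dot(s, r)
    det = dd * ss - ds * ds
    return (ss * dr - ds * sr) / det


def compare(n_experiments=200, n_trials=20, seed=0):
    """Return (variance of reward-only estimates, variance of sensor-aided estimates)."""
    rng = random.Random(seed)
    plain, aided = [], []
    for _ in range(n_experiments):
        d, r, s = run_trials(0.0, n_trials, rng)
        plain.append(gradient_reward_only(d, r))
        aided.append(gradient_with_sensor(d, r, s))
    return statistics.variance(plain), statistics.variance(aided)


if __name__ == "__main__":
    var_plain, var_aided = compare()
    print(f"reward-only variance:  {var_plain:.4f}")
    print(f"sensor-aided variance: {var_aided:.4f}")
```

In this toy setting the sensor explains most of the trial-to-trial reward variation, so the regression that includes it leaves much less residual noise to contaminate the gradient coefficient. How large the gain is in practice depends on how strongly the sensor data correlates with the noise.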
Pages: 375-382
PDF Link: /papers/08/p375-lawrence.pdf
AUTHOR = "Gregory Lawrence and Stuart Russell",
TITLE = "Improving Gradient Estimation by Incorporating Sensor Data",
BOOKTITLE = "Proceedings of the Twenty-Fourth Annual Conference on Uncertainty in Artificial Intelligence (UAI-08)",
ADDRESS = "Corvallis, Oregon",
YEAR = "2008",
PAGES = "375--382"
