Gaussian Processes for Data-Efficient Learning in Robotics and Control.

Accepted version
Peer-reviewed

Type

Article

Authors

Deisenroth, Marc Peter 
Fox, Dieter 
Rasmussen, Carl Edward 

Abstract

Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time-consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
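
To make the idea of a probabilistic Gaussian process transition model concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. The toy dynamics, the choice to regress on state differences, the fixed kernel hyperparameters, and the helper names se_kernel and gp_predict are all assumptions made for illustration. The sketch only shows that a GP returns a predictive variance alongside its mean, and that this variance grows away from the training data, which is the kind of model uncertainty the abstract refers to.

```python
import numpy as np

def se_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def gp_predict(X_train, y_train, X_test, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the rows of X_test."""
    K = se_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_star = se_kernel(X_test, X_train)
    L = np.linalg.cholesky(K)                      # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = np.diag(se_kernel(X_test, X_test)) - np.sum(v**2, axis=0)
    return mean, var

# Hypothetical toy system: x_next = x + 0.1 * (sin(x) + u), observed with noise.
rng = np.random.default_rng(0)
states = rng.uniform(-2.0, 2.0, size=30)
actions = rng.uniform(-1.0, 1.0, size=30)
X_train = np.column_stack([states, actions])       # inputs: (state, action) pairs
deltas = 0.1 * (np.sin(states) + actions) + 0.01 * rng.standard_normal(30)

# One query inside the training region, one far outside it.
X_test = np.array([[0.5, 0.0], [10.0, 0.0]])
mean, var = gp_predict(X_train, deltas, X_test)
print("predicted state change:", mean)
print("predictive variance:   ", var)
```

Far from the data, the predictive variance returns to the prior variance and the mean returns to zero, so a planner that takes the variance into account can treat such predictions as unreliable instead of trusting a single point estimate.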

Keywords

Policy search, robotics, control, Gaussian processes, Bayesian inference, reinforcement learning

Journal Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Journal ISSN

0162-8828
1939-3539

Volume

37

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Sponsorship

Engineering and Physical Sciences Research Council (EP/J012300/1)
The research leading to these results has received funding from the EC’s Seventh Framework Programme (FP7/2007-2013) under grant agreement #270327, ONR MURI grant N00014-09-1-1052, Intel Labs, and the Department of Computing, Imperial College London.