by Alonso Marco, Felix Berkenkamp, Philipp Hennig, Angela P. Schoellig, Andreas Krause, Stefan Schaal, and Sebastian Trimpe
Abstract:
In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.
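The core idea of the abstract, trading off cheap but biased simulation queries against expensive physical experiments inside a Bayesian optimization loop, can be illustrated with a small sketch. This is not the paper's algorithm: instead of Entropy Search it uses a simple UCB acquisition with a confidence-based rule for switching sources, it pools both sources into one Gaussian process rather than using a joint multi-source kernel, and the objective, bias, and costs are all made-up stand-ins.

```python
# Minimal sketch (NOT the paper's method): cost-aware Bayesian optimization
# over two information sources -- a cheap, biased "simulation" and an
# expensive "physical" objective. Objective, bias, and costs are illustrative
# assumptions; a simple UCB rule replaces Entropy Search.
import numpy as np

def objective(x, source):
    # source 0: cheap simulation (systematically biased); source 1: real system.
    true = -(x - 0.6) ** 2
    return true + (0.15 * np.sin(5 * x) if source == 0 else 0.0)

COSTS = {0: 1.0, 1: 10.0}  # assumed relative cost per query

def gp_posterior(X, y, Xs, length=0.2, noise=1e-4):
    # Standard GP regression with an RBF kernel (pooled over both sources;
    # the paper instead models the sources jointly with a dedicated kernel).
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = k(Xs, X)
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - np.sum(v ** 2, axis=0)
    return mu, np.maximum(var, 1e-12)

grid = np.linspace(0.0, 1.0, 101)
X = np.array([0.1, 0.9])
y = np.array([objective(x, 0) for x in X])  # start from two simulation points
spent = 0.0

for _ in range(20):
    mu, var = gp_posterior(X, y, grid)
    ucb = mu + 2.0 * np.sqrt(var)
    i = np.argmax(ucb)
    # Crude stand-in for the information-vs-cost trade-off: query the
    # expensive real system only once the model is already confident there,
    # otherwise take the cheap simulation.
    source = 1 if var[i] < 0.05 else 0
    spent += COSTS[source]
    X = np.append(X, grid[i])
    y = np.append(y, objective(grid[i], source))

best = grid[np.argmax(gp_posterior(X, y, grid)[0])]
```

The sketch keeps the structure described in the abstract, a loop that decides per iteration which source to query based on model uncertainty and cost, while the paper's contribution is the principled version of that decision via information gain per unit cost.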
Reference:
A. Marco, F. Berkenkamp, P. Hennig, A. P. Schoellig, A. Krause, S. Schaal, and S. Trimpe, "Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization," in Proc. of the International Conference on Robotics and Automation (ICRA), 2017, pp. 1557-1563.
Bibtex Entry:
@inproceedings{marco17virtualvsreal,
	author = {Alonso Marco and Felix Berkenkamp and Philipp Hennig and Angela P. Schoellig and Andreas Krause and Stefan Schaal and Sebastian Trimpe},
	booktitle = {Proc. of the International Conference on Robotics and Automation (ICRA)},
	month = {May},
	pages = {1557--1563},
	title = {Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization},
	year = {2017}}