Sample efficient reward learning for reinforcement learning

David Lindner | Active Learning, Preference Learning, Reinforcement Learning


Figure 1: The robot needs to learn the user’s food preferences to decide what to collect. We propose a method that can significantly reduce the number of queries necessary by focusing on queries that are informative about which policy is optimal.

Recently, reinforcement learning (RL) has shown impressive performance on tasks with a well-specified reward function, such as Atari games. Unfortunately, a reward function is often not available in the real world. Say you want to train an RL agent to drive a car. What is a good reward function for driving? Often researchers hand-craft complicated reward functions for such tasks, but this is cumbersome and prone to error. More generally, misspecified rewards can lead to unintended and unsafe behavior due to specification gaming.

A promising alternative is to learn a model of the reward from human feedback. For example, by asking humans to compare trajectories and judge which one solves a task better, we can learn a reward function for tasks that are difficult for humans to specify manually.
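To make the idea of learning a reward from trajectory comparisons concrete, here is a minimal sketch. It is not the method proposed in this post. It rests on several assumptions: a reward that is linear in hand-picked trajectory features, and a Bradley-Terry (logistic) model of how a simulated human answers a comparison. The names `fit_reward_weights`, `true_w`, and the synthetic data are hypothetical and only serve the illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_reward_weights(feature_diffs, prefs, lr=0.5, steps=500):
    """Fit linear reward weights from pairwise comparisons.

    feature_diffs[i] is features(trajectory A_i) - features(trajectory B_i);
    prefs[i] is 1.0 if the human preferred A_i and 0.0 otherwise.
    Assumes P(A preferred) = sigmoid(w . (f_A - f_B)) and maximizes the
    log-likelihood with plain gradient ascent.
    """
    d = len(feature_diffs[0])
    n = len(prefs)
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for diff, y in zip(feature_diffs, prefs):
            err = y - sigmoid(dot(w, diff))
            for j in range(d):
                grad[j] += err * diff[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic stand-in for human feedback: a hidden "true" reward answers
# each comparison noisily, following the same assumed logistic model.
rng = random.Random(0)
true_w = [1.0, -2.0, 0.5]
feature_diffs = [[rng.gauss(0, 1) - rng.gauss(0, 1) for _ in true_w]
                 for _ in range(300)]
prefs = [1.0 if rng.random() < sigmoid(dot(true_w, diff)) else 0.0
         for diff in feature_diffs]

learned_w = fit_reward_weights(feature_diffs, prefs)
```

In this sketch every comparison is drawn at random, so many queries are spent on pairs that reveal little. Choosing which comparisons to ask is where sample efficiency comes in.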