Justin Fu*, Avi Singh*, Dibya Ghosh, Larry Yang, Sergey Levine
In this paper, we present VICE, a method that generalizes inverse optimal control to learn reward functions from goal examples. Because it requires only examples of successful outcomes, rather than full demonstration trajectories, VICE is well suited to real-world robotics tasks, where reward specification is often difficult.
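One common way to turn goal examples into a reward, in the spirit of classifier-based reward learning, is to train a binary classifier that separates goal states from states visited by the current policy and use its log-odds as the reward. The sketch below illustrates that general idea only; the function names, the plain logistic-regression model, and the log-odds reward are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_goal_classifier(goal_states, policy_states, lr=0.1, steps=500):
    """Fit a logistic classifier: goal examples are positives,
    states visited by the current policy are negatives."""
    X = np.vstack([goal_states, policy_states])
    y = np.concatenate([np.ones(len(goal_states)),
                        np.zeros(len(policy_states))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        # Gradient of the binary cross-entropy loss.
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def learned_reward(state, w, b):
    """Log-odds of the classifier: large where a state resembles the
    goal examples, small where it resembles ordinary policy states."""
    p = sigmoid(state @ w + b)
    return np.log(p + 1e-8) - np.log(1.0 - p + 1e-8)
```

In an RL loop, the classifier would be retrained as the policy's state distribution shifts, so the negatives track the policy's current behavior.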
Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine
In this paper, we present a method for scaling model-free deep reinforcement learning to tasks with high stochasticity in their initial state and task distributions. We demonstrate our approach on a suite of challenging sparse-reward manipulation tasks that prior methods were unable to solve.