DayDreamer: world models for physical robot learning, BAIR, Berkeley, California, USA


DayDreamer: world models for physical robot learning

Jun 25, 2022

Robot learning typically requires too much trial and error to learn directly in the real world. We apply the Dreamer algorithm for sample-efficient learning on four physical robots, trained from scratch without simulators. In contrast to typical deep reinforcement learning algorithms, Dreamer learns a world model of its environment. The world model allows Dreamer to play through scenarios in imagination, reducing the amount of trial and error needed in the environment. Using Dreamer, the A1 quadruped robot learns to roll off its back, stand up, and start walking in only 1 hour of training from scratch. When we then push the robot, it adapts within 10 minutes to become robust to such perturbations and quickly rolls back onto its feet. The UR5 and XArm robot arms learn to pick and place objects directly from camera images and sparse rewards. These results show that Dreamer enables fast learning and adaptation for future robotics applications. We release our software infrastructure with support for multiple sensory modalities and action spaces.
 