Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning
Severin Bochem, Eduardo Gonzalez-Sanchez, Yves Bicker, Gabriele Fadini
- 发表年份
- 2024
- 访问权限
- 开放获取
摘要
Reinforcement learning often requires extensive training data. Simulation-to-real transfer offers a promising approach to address this challenge in robotics. While differentiable simulators offer improved sample efficiency through exact gradients, they can be unstable in contact-rich environments and may lead to poor generalization. This paper introduces a novel approach integrating sharpness-aware optimization into gradient-based reinforcement learning algorithms. Our simulation results demonstrate that our method, tested on contact-rich environments, significantly enhances policy robustness to environmental variations and action perturbations while maintaining the sample efficiency of first-order methods. Specifically, our approach improves action noise tolerance compared to standard first-order methods and achieves generalization comparable to zeroth-order methods. This improvement stems from finding flatter minima in the loss landscape, associated with better generalization. Our work offers a promising solution to balance efficient learning and robust sim-to-real transfer in robotics, potentially bridging the gap between simulation and real-world performance.
关键词
相关论文
Trust Region Policy Optimization
John Schulman, Sergey Levine, Philipp Moritz 等 5 位作者
2015
Legged Robots That Balance
Marc H. Raibert, Ernest R. Tello
1986
Being there: putting brain, body, and world together again
1997
Small-scale soft-bodied robot with multimodal locomotion
Wenqi Hu, Guo Zhan Lum, Massimo Mastrangeli 等 4 位作者
2018