Terrain Consistent Reference-Guided RL for Humanoid Navigation Autonomy
William D. Compton, Zachary Olkin, Aaron D. Ames
- Publication year: 2026
- Access: Open access
Abstract
We present a method for training reference-guided, perceptive reinforcement learning locomotion policies for humanoid robots in which reference trajectories are modulated in training to be consistent with terrain geometry. Aiming to deploy our method with standard navigation autonomy infrastructure, we synthesize SE(2)-controllable reference trajectories inside the RL training loop, projecting desired footsteps onto valid footholds and adjusting swing-foot and center-of-mass trajectories to match the terrain. The resulting policy exposes a clean SE(2) velocity interface compatible with standard navigation planners. In simulation, environmentally conditioned references significantly improve reference tracking performance compared to environment-agnostic references. On hardware, we integrate the policy with an MPC + control barrier function planner and demonstrate long-horizon (>70 m) closed-loop autonomous navigation on the Unitree G1 through outdoor environments containing rough terrain and consecutive flights of stairs, with all sensing and computation onboard.
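The abstract's footstep-projection step can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the terrain is available as a 2D boolean grid of steppable cells, and the names `nominal_footsteps`, `project_to_valid_foothold`, `valid_mask`, `origin_xy`, `resolution`, and `half_width` are all hypothetical. It integrates a constant SE(2) velocity command into nominal footstep locations and snaps a target to the nearest valid cell; the swing-foot and center-of-mass adjustments mentioned in the abstract are omitted.

```python
import numpy as np


def nominal_footsteps(pose, v_cmd, step_time, n_steps, half_width):
    """Integrate a constant SE(2) velocity (vx, vy, wz), given in the body
    frame, and place alternating left/right footsteps beside the body.
    Assumption: simple Euler integration, left foot steps first."""
    x, y, yaw = pose
    vx, vy, wz = v_cmd
    steps = []
    for k in range(n_steps):
        # Rotate the body-frame velocity into the world frame and integrate.
        x += (vx * np.cos(yaw) - vy * np.sin(yaw)) * step_time
        y += (vx * np.sin(yaw) + vy * np.cos(yaw)) * step_time
        yaw += wz * step_time
        side = 1.0 if k % 2 == 0 else -1.0  # +1 = left, -1 = right
        # Offset the foot laterally (along the body's y axis).
        steps.append((x - side * half_width * np.sin(yaw),
                      y + side * half_width * np.cos(yaw)))
    return np.array(steps)


def project_to_valid_foothold(target_xy, valid_mask, origin_xy, resolution):
    """Return the center of the valid grid cell closest to target_xy.
    Assumption: cell (i, j) is centered at origin_xy + resolution * (i, j),
    and valid_mask[i, j] is True where the terrain is steppable."""
    idx = np.argwhere(valid_mask)  # (K, 2) indices of valid cells
    if idx.size == 0:
        raise ValueError("no valid foothold in the map")
    centers = np.asarray(origin_xy, dtype=float) + resolution * idx
    dists = np.linalg.norm(centers - np.asarray(target_xy, dtype=float), axis=1)
    return centers[np.argmin(dists)]
```

In a training loop of the kind the abstract describes, each nominal footstep would be passed through the projection before the reference trajectory is built, so the reference never asks the policy to step on an invalid cell. A brute-force nearest-cell search is used here only for clarity.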