Ego-Vision World Model for Humanoid Contact Planning
Hang Liu, Yuman Gao, Sangli Teng, Yufeng Chi, Yakun Sophia Shao, Zhongyu Li, Maani Ghaffari, Koushil Sreenath
- Year
- 2025
- Access
- Open access
Abstract
Enabling humanoid robots to exploit physical contact, rather than simply avoid collisions, is crucial for autonomy in unstructured environments. Traditional optimization-based planners struggle with contact complexity, while on-policy reinforcement learning (RL) is sample-inefficient and has limited multi-task ability. We propose a framework combining a learned world model with sampling-based Model Predictive Control (MPC), trained on a demonstration-free offline dataset to predict future outcomes in a compressed latent space. To address sparse contact rewards and sensor noise, the MPC uses a learned surrogate value function for dense, robust planning. Our single, scalable model supports contact-aware tasks, including wall support after perturbation, blocking incoming objects, and traversing height-limited arches, with improved sample efficiency and multi-task capability over on-policy RL. Deployed on a physical humanoid, our system achieves robust, real-time contact planning from proprioception and ego-centric depth images. Code and dataset are available at our website: https://ego-vcp.github.io/
Keywords
Related papers
The Organization of Behavior
D. O. Hebb
2005
Fractional Brownian Motions, Fractional Noises and Applications
Benoît B. Mandelbrot, John W. Van Ness
1968
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
Laith Alzubaidi, Jinglan Zhang, Amjad J. Humaidi +7 more
2021
A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar +7 more
2018