AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Afford Correspondence
Jiawei Zhang, Kaizhe Hu, Yingqian Huang, Yuanchen Ju, Zhengrong Xue, Huazhe Xu
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Despite the recent success of modern imitation learning methods in robot manipulation, their performance is often constrained by geometric variations due to limited data diversity. Leveraging powerful 3D generative models and vision foundation models (VFMs), the proposed AffordGen framework overcomes this limitation by utilizing the semantic correspondence of meaningful keypoints across large-scale 3D meshes to generate new robot manipulation trajectories. This large-scale, affordance-aware dataset is then used to train a robust, closed-loop visuomotor policy, combining the semantic generalizability of affordances with the reactive robustness of end-to-end learning. Experiments in simulation and the real world show that policies trained with AffordGen achieve high success rates and enable zero-shot generalization to truly unseen objects, significantly improving data efficiency in robot learning.
关键词
相关论文
Real-Time Obstacle Avoidance for Manipulators and Mobile Robots
Oussama Khatib
1986
A Mathematical Introduction to Robotic Manipulation
Richard M. Murray, Zexiang Li, Shankar Sastry
2017
Robot dynamics and control
Mark W. Spong
1989
A tutorial on visual servo control
Seth Hutchinson, Gregory D. Hager, Peter Corke
1996