PERCEPTION
Embodied 3D Benchmark: Evaluating the Low-Level Embodied Spatial Intelligence of Vision-Language Models
Jiyao Zhang, Mingxu Zhang, Yitong Peng, Haoxuan Liu, Chenshuo Wang, Yuxing Long, Haoyang Huang, Dongjiang Li, Nan Duan, Hui Shen, Hao Dong
- Publication year: 2026
- Access: Open access
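Abstract
The paper introduces Embodied3DBench, a benchmark for the low-level spatial intelligence of vision-language models in embodied 3D environments, comprising 6 task categories and more than 21k question-answer pairs. The evaluation shows that current models perform well on high-level spatial reasoning but remain brittle on interaction-oriented perception, indicating a lack of robust 3D-aware interaction priors.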
Keywords
embodied AI, spatial intelligence, vision language models, benchmark, 3D perception