
HRI

Guide, Think, Act: Interactive Embodied Reasoning in Vision-Language-Action Models

Yiran Ling, Qing Lian, Jinghang Li, Qing Jiang, Tianming Zhang, Xiaoke Jiang, Chuanxiu Liu, Jie Liu, Lei Zhang

Year of publication: 2026
Citations: 0
Access: Open access

Abstract

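This paper proposes GTA-VLA, a framework that enables spatially controllable embodied reasoning by letting users steer the robot policy with explicit visual cues. The framework unifies external spatial guidance and internal task planning into a spatial-visual chain of thought, addressing the limitations of existing models under out-of-domain shift and in error correction.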

Keywords

vision-language-action, embodied reasoning, human-robot interaction, spatial guidance, chain-of-thought

Related papers

HRI

📊 3,196 citations

The Uncanny Valley [From the Field]

Masahiro Mori, Karl F. MacDorman, Norri Kageki

2012

HRI

Open access, 📊 3,034 citations

Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots

Christoph Bartneck, Dana Kulić, Elizabeth A. Croft, et al. (4 authors in total)

2008


HRI

📊 1,925 citations

The development of Honda humanoid robot

Kazuo Hirai, Masato Hirose, Y. Haikawa, et al. (4 authors in total)

2002

HRI

📊 1,914 citations

A Meta-Analysis of Factors Affecting Trust in Human-Robot Interaction

Peter A. Hancock, Deborah R. Billings, Kristin E. Schaefer, et al. (6 authors in total)

2011
