Hierarchical Prompting with Dual LLM Modules for Robotic Task and Motion Planning
Karolina Źróbek, Tessa Pulli, Paweł Gajewski, Antonio Galiza Cerdeira Gonzalez, Bipin Indurkhya
- Year: 2026
- Access: Open access
Abstract
We present a hierarchical language-driven framework for robotic task and motion planning to improve natural, intuitive human-robot interaction in service and assistance scenarios. The proposed system employs two large language model (LLM) modules: a high-level planning agent and a low-level spatial reasoning sub-module. The primary agent processes natural language commands and generates action sequences using a ReAct-style prompt, interacting with tools for object perception and manipulation (e.g., pick, place, release). For precise spatial placement, such as interpreting "place the mug next to the plate", a separate sub-prompting module handles 3D reasoning based on object geometry and scene layout. The system integrates YOLOX-GDRNet for object detection and pose estimation, along with a motion execution stub. We evaluated the system in 24 test scenarios, ranging from simple spatial commands to high-level instructions and infeasible requests. The system achieved an overall task success rate of 86%.
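The two-module structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: every name here (`SceneObject`, `resolve_placement`, `run_agent`, the `next_to` relation, the 2 cm margin) is hypothetical. Both LLM calls are replaced by stand-ins, a scripted ReAct-style trace for the high-level agent and a simple geometric rule for the spatial sub-module, and perception and motion execution are stubbed out.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    position: tuple[float, float, float]  # object centre in metres (assumed frame)
    size: tuple[float, float, float]      # axis-aligned extents in metres


def resolve_placement(relation, moved, reference, margin=0.02):
    """Stand-in for the low-level spatial sub-module (an LLM prompt in the paper)."""
    if relation != "next_to":
        raise ValueError(f"unsupported relation: {relation}")
    x, y, z = reference.position
    # Offset along +x so the two footprints do not overlap.
    dx = reference.size[0] / 2 + moved.size[0] / 2 + margin
    # Rest the moved object on the same support surface as the reference.
    support_z = z - reference.size[2] / 2
    return (x + dx, y, support_z + moved.size[2] / 2)


def run_agent(plan, scene):
    """Run a scripted (thought, action, args) trace; a real agent would query an LLM per step."""
    log = []
    held = None
    for thought, action, args in plan:
        if action == "detect":
            # Perception stub: report the names of known objects.
            observation = sorted(scene)
        elif action == "pick":
            held = args["object"]
            observation = f"holding {held}"
        elif action == "place":
            # Delegate the precise target pose to the spatial sub-module.
            target = resolve_placement(
                args["relation"], scene[held], scene[args["reference"]]
            )
            scene[held].position = target
            observation = f"placed {held} at {target}"
        elif action == "release":
            observation = f"released {held}"
            held = None
        else:
            observation = f"unknown action {action}"
        log.append((thought, action, observation))
    return log


if __name__ == "__main__":
    scene = {
        "plate": SceneObject("plate", (0.5, 0.0, 0.01), (0.2, 0.2, 0.02)),
        "mug": SceneObject("mug", (0.2, 0.3, 0.05), (0.08, 0.08, 0.1)),
    }
    plan = [
        ("I need to know what is on the table.", "detect", {}),
        ("The mug must be grasped first.", "pick", {"object": "mug"}),
        ("Ask the spatial module where 'next to the plate' is.", "place",
         {"relation": "next_to", "reference": "plate"}),
        ("The mug is in position.", "release", {}),
    ]
    for step in run_agent(plan, scene):
        print(step)
```

The point of the split is that the high-level trace only names a relation and a reference object, while the numeric target comes from a separate component that sees object geometry.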