HRI

LLM-based Agentic Workflow on Verbal and Non-verbal Audiovisual Perceptions and Actions for Proactive Situated Human-Robot Interactions

Virgile Sucal, Ahmed Njifenjou, Fabrice Lefèvre

Year: 2025
Citations: 2

Abstract

Recent research on conversational robots has typically addressed dialogue management and situated action planning as distinct and independent tasks. This separation stems from their historical development, which relied on different computational models and implementation paradigms. However, the advent of instruction-tuned large language models (LLMs) has enabled the unification of these functionalities within agentic workflows. In this work, we propose a unified LLM-based workflow that integrates proactive action planning, spoken dialogue, and non-verbal communication into a single agentic policy. This approach leverages the zero-shot capabilities of instruction-tuned LLMs to perform a wide range of social behaviors through a single prompt-driven policy. To assess the effectiveness of the proposed method, we compare its performance with a rule-based system designed to support the same functionalities using multiple specialized policies. The evaluation aims to determine whether the LLM-based agentic workflow constitutes a viable alternative to traditional approaches that rely on separate policies to manage these social tasks.

Keywords

Situated, Workflow, Action (physics), Perception, Unification, Spoken language, Robot, Human–robot interaction
