Multimodal Interaction for Human-Robot Collaboration in Assembly: An LLM-Enhanced Approach
Khansa Rekik, Grimaldo Silva, Attique Bashir, Rainer Müller
- Year
- 2025
- Citations
- 2
Abstract
As Robot as a Service (RaaS) models gain interest in industrial automation, the need for intuitive and adaptive human-robot interaction (HRI) increases. This paper introduces a multimodal interaction framework for human-robot collaboration in assembly tasks, enhanced by Large Language Models (LLMs). The system combines explicit user inputs, such as speech commands, gestures, and graphical interfaces, with implicit intent recognition to generate and prioritize tasks in real time. Leveraging LLMs for natural language understanding and task planning, the approach enables flexible and adaptive task execution, allowing the robot to respond to both direct requests and contextual cues. Through a pilot user study, the performance and user satisfaction of each modality are evaluated, revealing trade-offs between ease of use, response speed, and accuracy. The results demonstrate the promise of the approach in industrial applications, while also identifying opportunities for improvement toward broader use.