HRI

Context-aware data augmentation for enhanced speech command recognition in industrial environments

Giuseppe De Simone, Antonio Greco, Francesco Giuseppe De Rosa, Alessia Saggese, Mario Vento

Year
2025
Citations
4
Access
Open access

Abstract

In Human-Robot Interaction, speech is one of the most intuitive and effective communication channels. In Industry 4.0, speech-based communication can significantly enhance productivity and efficiency on production lines. However, deploying a Speech Command Recognition Module in real-world industrial settings poses challenges, as the system must balance two conflicting objectives: accurately recognizing commands while rejecting noise and irrelevant speech. To address this, we propose a modular framework designed to optimize recognition accuracy and rejection robustness while minimizing the need for extensive industrial dataset collection. The framework features an efficient Command Recognition module trained on laboratory-collected data augmented with synthetic samples. Advanced context-aware data augmentation techniques and dynamic noise injection further enhance the model's robustness. To improve reliability in noisy environments, a Keyword Spotting module is introduced, activating the recognition system only when a predefined keyword is detected. The proposed system was evaluated using real-world samples collected in a noisy industrial setting. The results demonstrated a high recall rate for both command recognition and noise rejection, confirming the system's effectiveness in meeting the demands of industrial applications.
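
The abstract does not say how dynamic noise injection is carried out. As an illustration only, the sketch below shows one common form of the idea: a clean command recording is mixed with a noise clip at a signal-to-noise ratio drawn afresh each time a sample is used. The function names (`rms`, `mix_at_snr`, `augment`) and the 0 to 20 dB range are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a non-empty list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        # Silent noise clip: nothing to inject.
        return list(speech)
    # SNR_dB = 20 * log10(rms_speech / rms_scaled_noise), solved for the noise level.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


def augment(speech, noise, rng, snr_range=(0.0, 20.0)):
    """Draw a new SNR on every call (hypothetical range) and mix accordingly."""
    snr_db = rng.uniform(*snr_range)
    return mix_at_snr(speech, noise, snr_db), snr_db
```

Calling `augment` inside the training loop, rather than once before training, means the same laboratory recording is heard under a different noise level on every pass, which is the sense of "dynamic" assumed here.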

Keywords

Computer science, Context (archaeology), Speech recognition, Biology
