Enhancing robotic skill acquisition with multimodal sensory data: A novel dataset for kitchen tasks
Ruochen Ren, Zhipeng Wang, Chaoyun Yang, Jiahang Liu, Rong Jiang, Yanmin Zhou, Shuo Jiang, Bin He
- Publication year
- 2025
- Citations
- 5
- Access
- Open access
Abstract
The advent of large language models has transformed human-robot interaction by enabling robots to execute tasks from natural language commands. However, these models depend primarily on unimodal data, which limits their ability to integrate diverse and essential environmental, physiological, and physical data. To address the limitations of current unimodal datasets, this paper investigates novel, comprehensive multimodal data collection methodologies that capture the complexity of human interaction in real-world kitchen environments. Data on the use of 17 different kitchen tools by 20 adults in dynamic scenarios were collected, including human tactile information, EMG signals, audio data, whole-body movement, and eye-tracking data. The dataset comprises 680 segments (~11 hours) spanning seven modalities and includes 56,000 detailed annotations. This paper bridges the gap between real-world multimodal data and embodied AI, paving the way for a new benchmark in utility and repeatability for robotic skill learning.