PERCEPTION

Real-time multimodal fusion and semantic mapping for robotic tower crane perception

Yifan Lu, Xiuzhi DENG, Peter E.D. Love, Zhou We, Weili Fang

Publication year
2026
Citations
3

Abstract

Robotic tower crane operation requires real-time perception of complex and rapidly changing construction environments. Conventional Simultaneous Localization and Mapping (SLAM) methods assume smooth sensor motion and emphasize geometry over semantics, which limits their suitability for crane-mounted sensing affected by vibration, rotation, and intermittent movement. This research proposes a multimodal perception framework that integrates Light Detection and Ranging (LiDAR), camera, and Inertial Measurement Unit (IMU) data within a tightly coupled fusion and semantic reconstruction pipeline. A Mahony-filter-based attitude optimization module stabilizes attitude estimates against high-frequency vibration, while a LiDAR–visual–inertial fusion strategy inspired by FAST-LIVO2 (Fast, Direct LiDAR-Inertial-Visual Odometry) achieves centimeter-level three-dimensional (3D) mapping. To enhance scene understanding, an improved Random Sampled and Lightweight Aggregated Network (RandLA-Net) uses color-aware spatial encoding to jointly exploit geometric and visual cues for point-level semantic segmentation. Field deployment on an operational tower crane demonstrates superior performance, yielding the lowest global reconstruction errors and the highest semantic accuracy. The framework provides a robust perception foundation for autonomous planning, safety monitoring, and intelligent lifting assistance.
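
The abstract names a Mahony filter as the basis of the attitude module but does not describe its implementation. As a rough illustration only, the following is a minimal sketch of the textbook Mahony complementary filter (gyroscope plus accelerometer, no magnetometer), not the authors' module. The class name `MahonyFilter`, the gains `kp` and `ki`, and the sample values are all assumptions made for this example.

```python
import math


class MahonyFilter:
    """Textbook Mahony attitude filter (gyro + accelerometer), illustrative only."""

    def __init__(self, kp=1.0, ki=0.0):
        self.kp = kp                      # proportional gain (assumed value)
        self.ki = ki                      # integral gain (assumed value)
        self.q = (1.0, 0.0, 0.0, 0.0)     # unit quaternion (w, x, y, z), body to world
        self.e_int = [0.0, 0.0, 0.0]      # integral of the attitude error

    def gravity_body(self):
        """World up axis expressed in the body frame, as implied by self.q."""
        w, x, y, z = self.q
        return (2.0 * (x * z - w * y),
                2.0 * (w * x + y * z),
                w * w - x * x - y * y + z * z)

    def update(self, gyro, accel, dt):
        """Fuse one gyro sample (rad/s) and one accel sample (any unit) over dt seconds."""
        gx, gy, gz = gyro
        ax, ay, az = accel
        norm = math.sqrt(ax * ax + ay * ay + az * az)
        if norm > 1e-9:  # skip the correction when the accel reading is unusable
            ax, ay, az = ax / norm, ay / norm, az / norm
            vx, vy, vz = self.gravity_body()
            # Error = cross product of measured and predicted gravity directions.
            e = (ay * vz - az * vy, az * vx - ax * vz, ax * vy - ay * vx)
            for i in range(3):
                self.e_int[i] += self.ki * e[i] * dt
            # Proportional-integral feedback applied to the gyro rates.
            gx += self.kp * e[0] + self.e_int[0]
            gy += self.kp * e[1] + self.e_int[1]
            gz += self.kp * e[2] + self.e_int[2]
        # First-order integration of q_dot = 0.5 * q * (0, gyro).
        w, x, y, z = self.q
        h = 0.5 * dt
        w2 = w + (-x * gx - y * gy - z * gz) * h
        x2 = x + (w * gx + y * gz - z * gy) * h
        y2 = y + (w * gy - x * gz + z * gx) * h
        z2 = z + (w * gz + x * gy - y * gx) * h
        n = math.sqrt(w2 * w2 + x2 * x2 + y2 * y2 + z2 * z2)
        self.q = (w2 / n, x2 / n, y2 / n, z2 / n)
        return self.q


# Usage: start from a 0.3 rad roll error and let the accelerometer pull it back.
filt = MahonyFilter(kp=2.0)
filt.q = (math.cos(0.15), math.sin(0.15), 0.0, 0.0)
for _ in range(1000):
    filt.update(gyro=(0.0, 0.0, 0.0), accel=(0.0, 0.0, 9.81), dt=0.01)
```

In this form the gyroscope carries the fast dynamics while the accelerometer supplies a slow gravity reference, so a lower `kp` trusts the gyroscope more and damps the effect of vibration-corrupted accelerometer samples. How the paper tunes or extends this for crane-mounted sensing is not stated in the abstract.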

Keywords

Odometry, Semantic mapping, Sensor fusion, Ranging, Perception, Robot, Visual odometry, Software deployment, Active perception, Tower crane
