SWARM

Energy-aware multi-robot exploration and coverage in fragmented unknown environments using collaborative reinforcement learning

Yao Xue, Chee Keong Tan, Wai Peng Wong

Publication year
2026
Citations
2

Abstract

• Research highlight 1: We propose CERES-Q (Collaborative Energy-aware Reinforcement Exploration System with Q-learning), a modular closed-loop framework for fragmented and unknown environments such as post-disaster ruins. It systematically addresses the full-coverage multi-robot exploration problem in these challenging environments and integrates separate strategies into a coherent, deployable solution.
• Research highlight 2: This paper proposes novel CP-AFC and DTT modules, which address the local topology connectivity problem and the uneven load problem of multiple robots in fragmented environments.
• Research highlight 3: CERES-Q adopts safety Double Q-learning (SDQ) and a safe return switch (SRS) to combine coverage maximization with robot energy safety, achieving a true "exploration-to-return" closed loop rather than relying on threshold-based heuristics.
• Research highlight 4: Extensive experimental comparisons show that CERES-Q achieves superior coverage and success rate per unit energy compared to Breadth-First Search (BFS), Boustrophedon, and Co-Explore. Under conditions of extreme fragmentation and severe power imbalance, CERES-Q is the only method capable of achieving high coverage and completely safe return.

Navigating disaster-stricken environments presents significant challenges for autonomous multi-robot exploration, primarily due to extreme topological fragmentation characterized by an abundance of obstacles, narrow passages, and numerous dead ends. Traditional multi-robot exploration strategies based on classical frontiers or static region partitioning prove inadequate in these settings: they suffer from critical deficiencies in collaborative stability and energy safety, and overall exploration efficiency drops drastically as robots frequently become stalled or disoriented.
Therefore, this paper proposes CERES-Q (Collaborative Energy-aware Reinforcement Exploration System with Q-learning), a novel closed-loop multi-robot exploration system for fragmented post-disaster scenarios that integrates a comprehensive "perception-decision-planning-execution-return" framework. The system combines a phase-aware frontier clustering and allocation (CP-AFC) module, a dynamic task transfer (DTT) module, and a safe return switch (SRS) with a safety Double Q-learning (SDQ) strategy to ensure reliable and efficient exploration. Comparative experiments on maps with varying degrees of fragmentation show that, compared with traditional multi-robot exploration methods, CERES-Q improves coverage efficiency by about 20% and reduces energy consumption by about 25%; compared with the recent Co-Explore method, it achieves about 10% better performance in highly fragmented environments. The advantages are particularly pronounced in scenarios with narrow passages and multiple dead ends. The results show that CERES-Q is well suited to coping with unknown and fragmented post-disaster terrain and to achieving reliable exploration and high-quality mapping; in addition, the modular implementation of the framework facilitates extension to larger teams and real-world deployment.
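The abstract names Double Q-learning as the basis of the SDQ strategy but does not describe the safety extension itself. As background only, the following is a minimal sketch of the standard tabular Double Q-learning update, not the authors' SDQ. The function name `double_q_update`, the state and action encodings, and the values of `alpha` and `gamma` are all illustrative assumptions.

```python
import random
from collections import defaultdict


def double_q_update(qa, qb, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.95, rng=random):
    """One standard tabular Double Q-learning step (not the paper's SDQ)."""
    # Randomly choose which table to update; the other one evaluates the target.
    if rng.random() < 0.5:
        update, evaluate = qa, qb
    else:
        update, evaluate = qb, qa
    # Select the greedy next action with the table being updated...
    best = max(actions, key=lambda a: update[(next_state, a)])
    # ...but score it with the other table, which reduces overestimation bias.
    target = reward + gamma * evaluate[(next_state, best)]
    update[(state, action)] += alpha * (target - update[(state, action)])


# Hypothetical usage: two empty Q-tables and a single transition.
qa, qb = defaultdict(float), defaultdict(float)
double_q_update(qa, qb, state=0, action="north", reward=1.0,
                next_state=1, actions=["north", "south"])
```

Keeping action selection and action evaluation in separate tables is what distinguishes this from plain Q-learning. How CERES-Q adds energy safety on top of it is not specified in the abstract.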

Keywords

Reinforcement learning; Robot; Modular design; Maximization; Heuristic; Fragmentation (computing); Efficient energy use
