SWARM

Learning Behaviours for Decentralised Multi-Robot Collision Avoidance in Constrained Pathways Using Curriculum Reinforcement Learning

Md Mostafizur Rahman Komol, Brendan Tidd, Will N. Browne, Frédéric Maire, Jason Williams

Year
2025
Citations
4

Abstract

Mobile robot teams often require decentralised autonomous navigation through narrow gaps in limited-communication environments (e.g., underground search-and-rescue operations). Existing navigation approaches perform suboptimally at avoiding multi-robot collisions in such bottlenecks because they cannot account for the dynamic nature of the robots. Initial work utilising reinforcement learning has demonstrated success in navigating a single robot through narrow gaps. However, when training agents to produce give-way behaviour for navigating through constrained gaps, end-to-end reinforcement learning using simple rewards suffers from slow convergence due to the increased search space of viable policies. This paper introduces a novel curriculum reinforcement learning framework, incorporating a multi-robot bootstrap curriculum with preprogrammed behaviour to guide initial policy formation, subsequently refined by a gap curriculum that progressively reduces training complexity towards an optimal policy. This framework learns multi-robot interaction behaviours, which are impractical to program manually. Our model achieves a 99% success rate in give-way behaviour generation without inter-agent communication in high-fidelity simulations. The success rate dropped to 73% in simulations incorporating noisy sensors and to 60% in field-robot tests, substantiating our model's practical viability despite sensor noise and real-world uncertainties. The simple benchmark methods, with only basic interaction behaviours, lack efficiency.
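
The two-phase structure described in the abstract (a bootstrap curriculum followed by a gap curriculum) could be laid out as a staged training schedule. The sketch below is illustrative only and is not the authors' implementation: the names `CurriculumStage`, `guide_weight`, `gap_param`, `build_curriculum`, and `next_stage`, the stage counts, the linear schedules, and the success-rate threshold are all assumptions. In particular, the abstract does not say which environment quantity the gap curriculum varies or in which direction, so `gap_param` is a placeholder scalar.

```python
from dataclasses import dataclass


@dataclass
class CurriculumStage:
    """One hypothetical training stage; every field is an illustrative assumption."""
    guide_weight: float  # influence of the preprogrammed behaviour (1.0 = fully guided)
    gap_param: float     # placeholder scalar for whatever the gap curriculum varies


def build_curriculum(n_bootstrap=3, n_gap=4, gap_start=3.0, gap_end=1.0):
    """Return bootstrap stages followed by gap stages (assumed linear schedules)."""
    stages = []
    # Bootstrap phase: fade the preprogrammed guidance while the gap setting stays fixed.
    for i in range(n_bootstrap):
        stages.append(CurriculumStage(guide_weight=1.0 - i / n_bootstrap,
                                      gap_param=gap_start))
    # Gap phase: no guidance; move the gap setting linearly from gap_start to gap_end.
    for i in range(n_gap):
        frac = i / (n_gap - 1) if n_gap > 1 else 1.0
        stages.append(CurriculumStage(guide_weight=0.0,
                                      gap_param=gap_start + frac * (gap_end - gap_start)))
    return stages


def next_stage(index, success_rate, n_stages, threshold=0.9):
    """Advance one stage once the recent success rate reaches the (assumed) threshold."""
    if success_rate >= threshold and index < n_stages - 1:
        return index + 1
    return index
```

A training loop would evaluate the policy periodically, call `next_stage` with the measured success rate, and configure the simulator from the returned stage. Gating progression on success rate is one common choice for curriculum learning; the abstract does not state the criterion the paper uses.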

Keywords

Reinforcement learning, Collision avoidance, Computer science, Robot, Reinforcement, Collision, Curriculum, Artificial intelligence, Human–computer interaction, Psychology
