

Policy Gradient with Self-Attention for Model-Free Distributed Nonlinear Multi-Agent Games

Eduardo Sebastián, Maitrayee Keskar, Eeman Iqbal, Eduardo Montijano, Carlos Sagüés, Nikolay Atanasov

Year of publication: 2025
Access: Open access

Abstract

Multi-agent games in dynamic nonlinear settings are challenging due to the time-varying interactions among the agents and the non-stationarity of the (potential) Nash equilibria. In this paper we consider model-free games, where agent transitions and costs are observed without knowledge of the transition and cost functions that generate them. We propose a policy gradient approach to learn distributed policies that follow the communication structure in multi-team games, with multiple agents per team. Our formulation is inspired by the structure of distributed policies in linear quadratic games, which take the form of time-varying linear feedback gains. In the nonlinear case, we model the policies as nonlinear feedback gains, parameterized by self-attention layers to account for the time-varying multi-agent communication topology. We demonstrate that our distributed policy gradient approach achieves strong performance in several settings, including distributed linear and nonlinear regulation, and simulated and real multi-robot pursuit-and-evasion games.
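As an illustration of the idea of state-dependent feedback gains produced by self-attention over a communication graph, the following is a minimal NumPy sketch. It is not the authors' architecture: the function names (`masked_softmax`, `attention_feedback`), the weight matrices `Wq`, `Wk`, `Wv`, `Wo`, the single-head one-hop design, and the line-graph topology are all assumptions made for this example, and the weights are random rather than trained by policy gradient.

```python
import numpy as np


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to entries where mask is False.

    Assumes every row of `mask` has at least one True entry (self-loops).
    """
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def attention_feedback(X, adjacency, params, m):
    """Hypothetical distributed policy: u_i = -K_i(x_neighbors) x_i.

    X         : (N, n) array, one state row per agent.
    adjacency : (N, N) boolean communication mask, including self-loops.
    params    : dict of assumed weights Wq, Wk, Wv (n, d) and Wo (d, m * n).
    m         : control dimension per agent.
    """
    N, n = X.shape
    d = params["Wq"].shape[1]
    Q = X @ params["Wq"]  # queries, one per agent
    K = X @ params["Wk"]  # keys
    V = X @ params["Wv"]  # values
    # Attention restricted to the current communication neighbors.
    weights = masked_softmax(Q @ K.T / np.sqrt(d), adjacency)
    H = weights @ V  # neighbor-aggregated features, shape (N, d)
    # Map features to a state-dependent gain matrix per agent, shape (N, m, n).
    gains = (H @ params["Wo"]).reshape(N, m, n)
    # Feedback form borrowed from the linear case: u_i = -K_i x_i.
    U = -np.einsum("imn,in->im", gains, X)
    return U, weights


rng = np.random.default_rng(0)
N, n, m, d = 4, 3, 2, 8  # agents, state dim, control dim, attention dim
params = {
    "Wq": rng.normal(size=(n, d)),
    "Wk": rng.normal(size=(n, d)),
    "Wv": rng.normal(size=(n, d)),
    "Wo": rng.normal(size=(d, m * n)),
}
# Assumed line topology 0-1-2-3 with self-loops; it may change at every step.
adjacency = (
    np.eye(N, dtype=bool) | np.eye(N, k=1, dtype=bool) | np.eye(N, k=-1, dtype=bool)
)
X = rng.normal(size=(N, n))
U, weights = attention_feedback(X, adjacency, params, m)
```

Because the mask is an input rather than part of the parameters, the same weights can be reused when the communication topology changes between time steps, which is the property the abstract attributes to the self-attention parameterization. In this sketch each agent's control depends only on its own state and the states of its one-hop neighbors.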

Keywords

eess.SY, cs.MA, cs.RO

