Integrating LTL Constraints into PPO for Safe Reinforcement Learning

Maifang Zhang, Hang Yu, Qian Zuo, Cheng Wang, Vaishak Belle, Fengxiang He

Year: 2026
Access: Open access

Abstract

This paper proposes Proximal Policy Optimization with Linear Temporal Logic Constraints (PPO-LTL), a framework that integrates safety constraints written in LTL into PPO for safe reinforcement learning. LTL gives a rigorous representation of complex safety requirements, such as the regulations common in robotics, and so enables systematic monitoring of those requirements. Violations of the LTL constraints are monitored by limit-deterministic Büchi automata and translated by a logic-to-cost mechanism into penalty signals, which then guide policy optimization through a Lagrangian scheme. Extensive experiments on the Zones and CARLA environments show that PPO-LTL consistently reduces safety violations while maintaining competitive performance compared with state-of-the-art methods. The code is available at https://github.com/EVIEHub/PPO-LTL.
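
The pipeline described above (monitor, logic-to-cost, Lagrangian update) can be illustrated with a minimal sketch. It is not the authors' implementation: the class `SafetyMonitor`, the functions `update_multiplier` and `penalized_reward`, the label `"unsafe"`, and all numeric values are hypothetical. For brevity, a two-state deterministic monitor for the safety formula G(!unsafe) stands in for a limit-deterministic Büchi automaton, and a plain dual-ascent step stands in for whatever multiplier update the paper uses.

```python
from dataclasses import dataclass


@dataclass
class SafetyMonitor:
    """Hypothetical two-state monitor for the LTL formula G(!unsafe).

    A stand-in for a limit-deterministic Buchi automaton: it only tracks
    whether the trace has already entered the violating sink state.
    """
    violated: bool = False

    def step(self, labels: set) -> float:
        """Consume the atomic propositions true at this step; return a cost."""
        if not self.violated and "unsafe" in labels:
            self.violated = True
            return 1.0  # logic-to-cost: charge once, on entering the sink state
        return 0.0


def update_multiplier(lam: float, episode_cost: float, budget: float, lr: float) -> float:
    """Dual ascent: raise lambda when cost exceeds the budget, clipped at 0."""
    return max(0.0, lam + lr * (episode_cost - budget))


def penalized_reward(reward: float, cost: float, lam: float) -> float:
    """Lagrangian-shaped reward that a PPO update would then maximize."""
    return reward - lam * cost


# Usage on a hand-written trace of label sets (assumed, not from the paper).
monitor = SafetyMonitor()
trace = [set(), {"unsafe"}, {"unsafe"}]
costs = [monitor.step(labels) for labels in trace]       # [0.0, 1.0, 0.0]
lam = update_multiplier(lam=0.0, episode_cost=sum(costs), budget=0.0, lr=0.5)  # 0.5
shaped = [penalized_reward(1.0, c, lam) for c in costs]  # [1.0, 0.5, 1.0]
```

The multiplier grows only while the accumulated cost exceeds the budget, so the penalty weight adapts during training rather than being fixed in advance.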

Keywords

cs.LG, cs.AI, cs.LO, cs.RO
