Safety-Critical Reinforcement Learning with Viability-Based Action Shielding for Hypersonic Longitudinal Flight
Hossein Rastgoftar
- Year
- 2026
- Access
- Open access
Abstract
This paper presents a safety-critical reinforcement learning framework for nonlinear dynamical systems with continuous state and input spaces operating under explicit physical constraints. Hard safety constraints are enforced independently of the reward through action shielding and reachability-based admissible action sets, ensuring that unsafe behaviors are never intentionally selected during learning or execution. To capture nominal operation and recovery behavior within a single control architecture, the state space is partitioned into safe and unsafe regions based on membership in a safety box, and a mode-dependent reward is used to promote accurate tracking inside the safe region and recovery toward it when operating outside. To enable online tabular learning on continuous dynamics, a finite-state abstraction is constructed via state aggregation, and action selection and value updates are consistently restricted to admissible actions. The framework is demonstrated on a longitudinal point-mass hypersonic vehicle model with aerodynamic and propulsion couplings, using angle of attack and throttle as control inputs.
Keywords
Related papers
The Organization of Behavior
D. O. Hebb
2005
Fractional Brownian Motions, Fractional Noises and Applications
Benoît B. Mandelbrot, John W. Van Ness
1968
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
Laith Alzubaidi, Jinglan Zhang, Amjad J. Humaidi +7 more
2021
A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar +7 more
2018