Model-free LQG Control with Chance Constraints
Arunava Naha, Subhrakanti Dey
- Publication year
- 2026
- Access
- Open access
Abstract
This paper studies model-free optimal control design, and its convergence properties, for linear time-invariant systems subject to probabilistic risk (chance) constraints. Specifically, we study a two-timescale natural policy gradient (NPG)-based actor-critic (AC) algorithm that uses a Lagrangian primal-dual framework to enforce the constraint. The risk is defined as the probability that a function of the one-step-ahead state exceeds a user-specified threshold. To our knowledge, this is the first work to analyse the convergence of NPG-based AC in a chance-constrained linear-quadratic Gaussian (LQG) regulator setting without model knowledge. For the actor, we establish the coercivity and gradient dominance properties of the Lagrangian function, which ensure linear convergence and closed-loop stability during training. For the critic, we analyse the convergence of temporal difference (TD(0)) learning using stochastic approximation theory. We also show that the constrained optimisation problem has no duality gap. Finally, we numerically evaluate the convergence and accuracy of the proposed method, comparing it with model-based chance-constrained LQR and scenario-based MPC. The results show that our approach effectively limits risk while maintaining near-optimal performance, without requiring full model knowledge or real-time optimisation.
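To make the risk definition concrete, the following is a minimal sketch of a one-step-ahead chance constraint and a projected dual update, not the paper's algorithm. It assumes, purely for illustration, that the constrained function of the state is linear (c^T x), that the process noise is i.i.d. Gaussian with standard deviation `sigma` per state, and that the policy is a fixed linear feedback u = -Kx. The matrices `A`, `B`, the gain `K`, the vector `c`, the threshold `d`, and all function names are hypothetical and do not come from the paper. Under these assumptions the risk has a closed form, which the sketch checks against a Monte Carlo estimate.

```python
import math
import random

# Hypothetical 2-state, 1-input system (illustrative values, not from the paper).
A = [[1.0, 0.1], [0.0, 1.0]]
B = [0.0, 0.1]
K = [1.0, 1.5]      # assumed linear feedback gain, u = -K x
c = [1.0, 0.0]      # assumed linear constraint function c^T x
d = 1.0             # user-specified threshold
sigma = 0.2         # assumed noise std dev per state (covariance sigma^2 I)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def next_state_mean(x):
    """Mean of x_{t+1} = A x + B u + w under u = -K x."""
    u = -dot(K, x)
    return [dot(row, x) + b * u for row, b in zip(A, B)]


def risk_closed_form(x):
    """P(c^T x_{t+1} > d | x_t = x) for Gaussian noise."""
    mean = dot(c, next_state_mean(x))
    std = sigma * math.sqrt(dot(c, c))
    return 1.0 - normal_cdf((d - mean) / std)


def risk_monte_carlo(x, n=100_000, seed=0):
    """Sample-based estimate of the same probability (no noise model needed
    beyond the ability to draw next states)."""
    rng = random.Random(seed)
    mean = next_state_mean(x)
    hits = 0
    for _ in range(n):
        nxt = [m + rng.gauss(0.0, sigma) for m in mean]
        if dot(c, nxt) > d:
            hits += 1
    return hits / n


def dual_step(lam, risk, delta, step):
    """Generic projected dual ascent: raise the multiplier when the risk
    exceeds the tolerated level delta, and keep it non-negative."""
    return max(0.0, lam + step * (risk - delta))


if __name__ == "__main__":
    x0 = [0.9, 0.5]
    print("closed form:", risk_closed_form(x0))
    print("monte carlo:", risk_monte_carlo(x0))
    print("updated multiplier:", dual_step(0.0, risk_closed_form(x0), 0.1, 0.5))
```

In a model-free setting only the sampling route is available, since `A` and `B` are unknown; the closed form is included here only as a sanity check on the estimate.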