Direct Data-Driven Linear Quadratic Tracking via Policy Optimization
Shubo Kang, Keyou You
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Direct data-driven optimal control provides an elegant end-to-end paradigm, yet its real-time applicability is often hindered by the growing dimensionality of online decision variables. Recent breakthroughs, notably Data-EnablEd Policy Optimization (DeePO), overcome this bottleneck for the Linear Quadratic Regulator (LQR) through sample-covariance parameterization; however, extending this paradigm to Linear Quadratic Tracking (LQT) poses a fundamental challenge. The core difficulty stems from the intricate coupling between time-varying references and the feedback-feedforward policy structure, which prevents a direct application of constant-dimension parameterization. We first introduce a reference-decoupled reformulation of LQT that naturally accommodates the covariance parameterization, guaranteeing a fixed dimension of decision variables independent of data horizon. This formulation is proven to be exactly equivalent to the indirect certainty-equivalence LQT solution. Leveraging this characterization, we develop offline and online DeePO algorithms. Theoretically, we prove global linear convergence for the offline algorithm using local gradient dominance and smoothness, and show that in the online setting the optimality gap decays linearly up to a bias term that scales inversely with the signal-to-noise ratio (SNR). Numerical simulations varify the theoretical results and illustrate the superior tracking performance of the proposed method.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Fractional Differential Equations
Igor Podlubný
2025
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
Genetic Programming: On the Programming of Computers by Means of Natural Selection
John R. Koza
1992