Synthesis of model predictive control and reinforcement learning: Survey and classification
Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt, Florian Messerer, Katrin Baumgärtner, Shambhuraj Sawant, Joschka Bödecker, Moritz Diehl, Sébastien Gros
- Publication year
- 2026
- Citation count
- 5
Abstract
Model predictive control (MPC) and reinforcement learning (RL) are two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and different requirements. Various technical discrepancies, particularly the role of an environment model as part of the algorithm, lead to methodologies with nearly complementary advantages. Due to their orthogonal benefits, research interest in combination methods has recently increased significantly, resulting in a large and growing set of complex ideas that leverage MPC and RL. This work illuminates the differences, similarities, and fundamentals that enable various combination algorithms and categorizes existing work accordingly. In particular, we focus on the versatile actor–critic RL approach as a basis for our categorization and examine how the online optimization approach of MPC can be used to improve the overall closed-loop performance of a policy.