Synthesis of model predictive control and reinforcement learning: Survey and classification

Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt, Florian Messerer, Katrin Baumgärtner, Shambhuraj Sawant, Joschka Bödecker, Moritz Diehl, Sébastien Gros

Year: 2026
Citations: 5

Abstract

Model predictive control (MPC) and reinforcement learning (RL) are two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and different requirements. Various technical discrepancies, particularly the role of an environment model as part of the algorithm, lead to methodologies with nearly complementary advantages. Due to their orthogonal benefits, research interest in combination methods has recently increased significantly, resulting in a large and growing set of complex ideas that leverage MPC and RL. This work illuminates the differences, similarities, and fundamentals that enable various combination algorithms and categorizes existing work accordingly. In particular, we focus on the versatile actor–critic RL approach as a basis for our categorization and examine how the online optimization approach of MPC can be used to improve the overall closed-loop performance of a policy.

Keywords

Reinforcement learning; Model predictive control; Leverage (statistics); Markov decision process; Categorization; Process (computing); Set (abstract data type)
