Convergence and stability of Q-learning in Hierarchical Reinforcement Learning
Massimiliano Manenti, Andrea Iannelli
- 发表年份
- 2025
- 访问权限
- 开放获取
摘要
Hierarchical Reinforcement Learning promises, among other benefits, to efficiently capture and utilize the temporal structure of a decision-making problem and to enhance continual learning capabilities, but theoretical guarantees lag behind practice. In this paper, we propose a Feudal Q-learning scheme and investigate under which conditions its coupled updates converge and are stable. By leveraging the theory of Stochastic Approximation and the ODE method, we present a theorem stating the convergence and stability properties of Feudal Q-learning. This provides a principled convergence and stability analysis tailored to Feudal RL. Moreover, we show that the updates converge to a point that can be interpreted as an equilibrium of a suitably defined game, opening the door to game-theoretic approaches to Hierarchical RL. Lastly, experiments based on the Feudal Q-learning algorithm support the outcomes anticipated by theory.
关键词
相关论文
The Organization of Behavior
D. O. Hebb
2005
Fractional Brownian Motions, Fractional Noises and Applications
Benoît B. Mandelbrot, John W. Van Ness
1968
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
Laith Alzubaidi, Jinglan Zhang, Amjad J. Humaidi 等 10 位作者
2021
A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar 等 10 位作者
2018