

Q-PSP Learning: An Exploitation-Oriented Q-Learning Algorithm and Its Applications

Tadashi Horiuchi, Akinori Fujino, Osamu Katai, Tetsuo Sawaragi

Publication year: 1999
Citations: 18
Access: Open access

Abstract

Reinforcement learning algorithms can be classified into two approaches. One is the “exploitation-oriented” approach, which attempts to acquire action rules mainly by reinforcing and relying on good experiences; the other is the “exploration-oriented” approach, which pursues optimal actions that receive the highest rewards by exploring the environment. In this paper, we propose the Q-PSP Learning method, which incorporates the idea of PSP (Profit Sharing Plan), used in Classifier Systems as “exploitation-oriented” reinforcement learning, into Q-Learning, an “exploration-oriented” reinforcement learning method, in order to combine the merits of the two approaches. By applying Q-PSP Learning to several control problems and a robot navigation problem, we show that it can be expected both to speed up learning and to remain effective on complex problems, and that it can attain an appropriate balance between exploration and exploitation.
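The abstract does not state the Q-PSP update rule, so the following is only an illustrative sketch of the general idea it describes: a one-step Q-Learning update combined with a profit-sharing style credit assignment along the episode. It is not the authors' algorithm. The function names, the geometric credit decay, and the parameters `alpha`, `gamma`, and `decay` are all assumptions made for illustration.

```python
from collections import defaultdict


def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-Learning update."""
    best_next = max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def profit_sharing_update(q, trace, reward, alpha=0.1, decay=0.5):
    """Spread an episode's final reward back over the visited (state, action)
    pairs, with geometrically decreasing credit (an assumed credit function)."""
    credit = reward
    for s, a in reversed(trace):
        q[(s, a)] += alpha * (credit - q[(s, a)])
        credit *= decay


def hybrid_episode_update(q, transitions, actions, use_profit_sharing=True):
    """Apply Q-Learning to every transition, then (hypothetically) reinforce
    the whole trace if the episode ended with a positive reward."""
    for s, a, r, s_next in transitions:
        q_learning_update(q, s, a, r, s_next, actions)
    final_reward = transitions[-1][2]
    if use_profit_sharing and final_reward > 0:
        trace = [(s, a) for s, a, _, _ in transitions]
        profit_sharing_update(q, trace, final_reward)


# Toy two-step chain: 0 -> 1 -> 2, with reward only on the last step.
ACTIONS = ["left", "right"]
EPISODE = [(0, "right", 0.0, 1), (1, "right", 1.0, 2)]

q_plain = defaultdict(float)
hybrid_episode_update(q_plain, EPISODE, ACTIONS, use_profit_sharing=False)

q_hybrid = defaultdict(float)
hybrid_episode_update(q_hybrid, EPISODE, ACTIONS, use_profit_sharing=True)
```

In this toy run, the first action of the episode receives no value from the one-step Q-Learning update alone, while the profit-sharing pass reinforces it after a single successful episode. This is one way to picture the speed-up the abstract attributes to relying on good experiences.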

Keywords

Reinforcement learning, Learning classifier system, Computer science, Artificial intelligence, Q-learning, Machine learning, Robot learning, Profit sharing, Error-driven learning, Unsupervised learning

