LEARNING

Approximate Discounted Dynamic Programming Is Unreliable

Matthew A. McDonald, Philip Hingston

Publication year
1994
Citations
6

Abstract

Popular reinforcement learning methods that employ generalising function approximators perform poorly in many domains. We analyse effects of approximation error in domains with sparse rewards, revealing the extent of scaling difficulties. Empirical evidence is presented that suggests when problems are likely to occur and explains some of the widely differing results reported in the literature.

Keywords: Reinforcement learning, dynamic programming, function approximation, induction, problem solving

CR categories: I.2.6, I.2.8

* The Robotics and Vision Research Group acknowledges the support received from Digital through their External Research Programme.

Department of Computer Science

1 Introduction

Most domains studied in AI are too large to be searched exhaustively. It is widely believed that reinforcement learning methods must be combined with generalising function approximators in order to sc...
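As a rough illustration of why approximation error can matter so much when rewards are sparse and discounted, consider a toy setting that is not taken from the paper: a one-dimensional chain with a single reward of 1 at the goal and discount factor `gamma`. The optimal value of a state `d` steps from the goal is then `gamma ** d`, so the value gap between neighbouring states shrinks geometrically with distance. The sketch below assumes every value estimate may be off by at most `eps` and asks how far from the goal a greedy choice is still guaranteed to point the right way. The function names, the chain domain, and the error model are all illustrative assumptions, not the authors' analysis.

```python
def state_value(distance, gamma):
    """Optimal discounted value of a state `distance` steps from the only reward (of 1)."""
    return gamma ** distance


def adjacent_gap(distance, gamma):
    """Value difference between the neighbour nearer the goal and the one farther away."""
    return state_value(distance - 1, gamma) - state_value(distance + 1, gamma)


def reliable_horizon(gamma, eps, max_distance=10_000):
    """Largest distance at which a greedy choice is guaranteed correct.

    Assumes (hypothetically) that each value estimate is within `eps` of the
    true value. In the worst case the better neighbour is underestimated by
    eps and the worse one overestimated by eps, so the true gap must exceed
    2 * eps for the greedy comparison to be safe.
    """
    horizon = 0
    for d in range(1, max_distance + 1):
        if adjacent_gap(d, gamma) > 2 * eps:
            horizon = d
        else:
            break  # the gap only shrinks with distance, so stop here
    return horizon


if __name__ == "__main__":
    for eps in (0.001, 0.01, 0.05):
        print(f"gamma=0.9, eps={eps}: reliable up to {reliable_horizon(0.9, eps)} steps")
```

Under these assumptions the guaranteed horizon grows only logarithmically as `eps` shrinks, which gives one intuition for why uniformly small approximation error may still fail to scale to distant rewards.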

Keywords

Computer science, Mathematics, Mathematical optimization
