Approximate Planning in POMDPs with Macro-Actions

Georgios Theocharous, Leslie Pack Kaelbling

Year of publication: 2003
Citations: 68

Abstract

Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief space, which uses macro-actions and Monte Carlo updates of the Q-values. We apply the algorithm to a large-scale robot navigation task and demonstrate that with temporal abstraction we can consider an even smaller part of the belief space, learn POMDP policies faster, and do information gathering more efficiently.
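
To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a small tabular POMDP with hypothetical arrays `T[a][s][s2]` (transitions), `O[a][s2][o]` (observations) and `R[a][s]` (rewards), a fixed-resolution belief grid, and macro-actions that simply repeat one primitive action for a fixed number of steps. All function names are illustrative and do not come from the paper.

```python
import random

def belief_update(belief, action, obs, T, O):
    """Bayes filter: b'(s2) is proportional to O[a][s2][o] * sum_s T[a][s][s2] * b(s)."""
    n = len(belief)
    new = [O[action][s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
           for s2 in range(n)]
    total = sum(new)
    # If the observation has zero probability under the model, keep the old belief.
    return [x / total for x in new] if total > 0 else list(belief)

def to_grid_point(belief, resolution):
    """Snap a belief to a regular grid (assumed discretization); returns a hashable key."""
    return tuple(round(p * resolution) for p in belief)

def sample(dist, rng):
    """Draw an index from a discrete probability distribution."""
    return rng.choices(range(len(dist)), weights=dist)[0]

def run_macro(belief, state, action, steps, T, O, R, gamma, rng):
    """Repeat one primitive action `steps` times, tracking the belief along the way.

    Returns (discounted reward, accumulated discount, final belief, final state).
    """
    total, discount = 0.0, 1.0
    for _ in range(steps):
        total += discount * R[action][state]
        state = sample(T[action][state], rng)
        obs = sample(O[action][state], rng)
        belief = belief_update(belief, action, obs, T, O)
        discount *= gamma
    return total, discount, belief, state

def mc_q_update(Q, belief, macro, macros, T, O, R, gamma, alpha, resolution, rng):
    """One sampled (Monte Carlo) backup of Q at the grid-point nearest to `belief`."""
    g = to_grid_point(belief, resolution)
    state = sample(belief, rng)      # hidden state drawn from the current belief
    action, steps = macros[macro]    # hypothetical macro: (primitive action, duration)
    ret, discount, next_belief, _ = run_macro(
        belief, state, action, steps, T, O, R, gamma, rng)
    g2 = to_grid_point(next_belief, resolution)
    best_next = max(Q.get((g2, m), 0.0) for m in range(len(macros)))
    old = Q.get((g, macro), 0.0)
    Q[(g, macro)] = old + alpha * (ret + discount * best_next - old)
    return Q[(g, macro)]
```

Because a macro-action runs for several steps before the next backup, each update jumps between beliefs that are far apart, which is one intuition for why fewer grid-points may need to be visited. The fixed-duration macros and nearest-grid-point lookup here are simplifying assumptions.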

Keywords

Abstraction, Computer science, Reinforcement learning, Macro, Partially observable Markov decision process, Space (punctuation), Grid, Task (project management), Artificial intelligence, Markov decision process
