LEARNING

Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods

Yaakov Engel, Peter Szabo, Dmitry Volkinshtein

Year of publication
2005
Citations
57

Abstract

The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm, based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a 2-dimensional model of an Octopus arm. Even with the simplifications inherent to this model, the state space we face is a high-dimensional one. We apply a GPTD-based algorithm to this domain, and demonstrate its operation on several learning tasks of varying degrees of difficulty.

Keywords

octopus (software), Gaussian process, Robotic arm, Reinforcement learning, Temporal difference learning, Computer science, Artificial intelligence, Process (computing), Domain (mathematical analysis), Bayesian optimization
