Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods

Yaakov Engel, Peter Szabo, Dmitry Volkinshtein

Year
2005
Citations
57

Abstract

The octopus arm is a highly versatile and complex limb. How the octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present-day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a two-dimensional model of an octopus arm. Even with the simplifications inherent to this model, the state space we face is high-dimensional. We apply a GPTD-based algorithm to this domain and demonstrate its operation on several learning tasks of varying degrees of difficulty.

Keywords

octopus (software), Gaussian process, Robotic arm, Reinforcement learning, Temporal difference learning, Computer science, Artificial intelligence, Process (computing), Domain (mathematical analysis), Bayesian optimization
