

Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods

Yaakov Engel, Peter Szabo, Dmitry Volkinshtein

Year of publication: 2005
Citations: 57

Abstract

The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present-day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm, based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a 2-dimensional model of an Octopus arm. Even with the simplifications inherent to this model, the state space we face is a high-dimensional one. We apply a GPTD-based algorithm to this domain, and demonstrate its operation on several learning tasks of varying degrees of difficulty.

Keywords

octopus (software), Gaussian process, Robotic arm, Reinforcement learning, Temporal difference learning, Computer science, Artificial intelligence, Process (computing), Domain (mathematical analysis), Bayesian optimization
