Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller

Yutaka Nakamura, Takeshi Mori, Yoichi Tokita, Tomohiro Shibata, Shin Ishii

Year: 2005
Citations: 3

Abstract

Motor control schemes that use a central pattern generator (CPG) controller, modeled on the mechanism of animals’ rhythmic movements, have been studied. We previously proposed a reinforcement learning (RL) method, called the CPG-actor-critic model, as an autonomous learning framework for a CPG controller. Here, we propose an off-policy natural policy gradient RL algorithm for the CPG-actor-critic model that solves the “exploration-exploitation” problem by meta-controlling the “behavior policy.” We apply this RL algorithm to an automatic control problem using a biped robot simulator. Computer simulations demonstrate that, with our new algorithm, the CPG controller enables the biped robot to walk stably and efficiently.

Keywords

Central pattern generator, Controller (irrigation), Reinforcement learning, Computer science, Control theory (sociology), Biped robot, Robot, CpG site, Mechanism (biology), Control engineering
