LOCOMOTION

Convergence Analysis of Reinforcement Learning Approaches to Humanoid Locomotion

Önder Tutsoy, Martin Brown

Publication year
2010
Citations
6

Abstract

Sophisticated intelligent machines such as humanoid robots must be able to interact with their environment and adapt their behaviour efficiently. Robots must therefore be able to modify and extend their knowledge base using information gained from past behaviour, for example to achieve stable, robust walking on unseen terrain. Designing humanoid robots with advanced learning and cognitive capabilities is currently one of the most challenging problems in intelligent robotics. The iCub and its newer version, the C-Cub, were developed as test beds for evaluating how cognitive and learning approaches can operate safely in unstructured environments. This paper describes preliminary work on evaluating the convergence of several temporal difference (TD) learning algorithms and compares their results on a simulation of a simple inverted pendulum in order to visualize the value and control action functions. The results show that the learning performance of TD(λ) is significantly better than that of TD(0) and of stochastic gradient algorithm (SGA) based learning.
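
To make the TD(0) versus TD(λ) comparison concrete, the following is a minimal sketch of tabular TD(λ) with accumulating eligibility traces. It is not the authors' implementation and does not use their inverted pendulum simulation: the three-state chain, the function name `td_lambda_episode`, and the values of the step size, discount factor, and λ are all assumptions chosen for illustration. Setting `lam=0.0` recovers TD(0).

```python
def td_lambda_episode(values, episode, alpha=0.5, gamma=1.0, lam=0.0):
    """Apply one episode of tabular TD(lambda) with accumulating traces.

    `episode` is a list of (state, reward, next_state) tuples, where
    next_state is None when the episode terminates. Returns a new list
    of state values; the input list is not modified.
    """
    v = list(values)
    traces = [0.0] * len(v)
    for state, reward, next_state in episode:
        # One-step TD error; a terminal state has value 0.
        v_next = 0.0 if next_state is None else v[next_state]
        delta = reward + gamma * v_next - v[state]
        # Mark the visited state as eligible for credit.
        traces[state] += 1.0
        for s in range(len(v)):
            # Every state is updated in proportion to its trace,
            # then the trace decays by gamma * lambda.
            v[s] += alpha * delta * traces[s]
            traces[s] *= gamma * lam
    return v


# Hypothetical toy chain 0 -> 1 -> 2 -> terminal, reward 1 on the last step.
episode = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
v_td0 = td_lambda_episode([0.0, 0.0, 0.0], episode, lam=0.0)
v_tdl = td_lambda_episode([0.0, 0.0, 0.0], episode, lam=0.5)
print(v_td0)  # [0.0, 0.0, 0.5]
print(v_tdl)  # [0.125, 0.25, 0.5]
```

In this toy case, one episode of TD(0) changes only the value of the state immediately before the reward, whereas TD(λ) with λ = 0.5 also passes part of the same TD error back to the earlier states. This faster propagation of reward information is one commonly cited reason TD(λ) can learn more quickly than TD(0).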

Keywords

iCub, Humanoid robot, Computer science, Reinforcement learning, Artificial intelligence, Inverted pendulum, Robot learning, Robot, Convergence (economics), Terrain
