Convergence Analysis of Reinforcement Learning Approaches to Humanoid Locomotion
Önder Tutsoy, Martin Brown
- Publication year: 2010
- Citations: 6
Abstract
Sophisticated intelligent machines such as humanoid robots must be able to interact with their environment and efficiently adapt their behavior. Robots must therefore be able to modify and add to their knowledge base using information gained from their past behavior, such as stable, robust walking on unseen terrain. Designing humanoid robots with advanced learning and cognitive capabilities is currently one of the most challenging problems in intelligent robotics. The iCub and its newer version, the C-Cub, were developed as test beds for evaluating how cognitive and learning approaches can operate safely in unstructured environments. This paper describes preliminary work on evaluating the convergence of several temporal difference learning algorithms. The algorithms are compared on a simulation of a simple inverted pendulum, which allows the value and control action functions to be visualized. The results show that the learning performance of TD(λ) is significantly better than that of TD(0) and of stochastic gradient algorithm (SGA) based learning.
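As a rough illustration of the algorithm family being compared, the following is a minimal sketch of tabular TD(λ) value estimation with accumulating eligibility traces, where setting `lam=0` reduces the update to TD(0). It is not the paper's implementation: the five-state random-walk task, the function name `td_lambda`, and all parameter values are assumptions chosen to keep the example small, standing in for the inverted pendulum simulation.

```python
import random


def td_lambda(num_states=5, lam=0.8, alpha=0.01, gamma=1.0,
              episodes=5000, seed=0):
    """Tabular TD(lambda) policy evaluation on a random-walk chain.

    Hypothetical toy task (not from the paper): states 0..num_states-1,
    each step moves left or right with equal probability. Leaving the
    chain on the left ends the episode with reward 0, leaving on the
    right ends it with reward 1. With lam=0 this is plain TD(0).
    """
    rng = random.Random(seed)
    values = [0.0] * num_states

    for _ in range(episodes):
        traces = [0.0] * num_states      # eligibility traces, reset per episode
        state = num_states // 2          # always start in the middle

        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= num_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # One-step temporal difference error.
            td_error = reward + gamma * next_value - values[state]

            # Decay all traces, then mark the state just visited.
            for s in range(num_states):
                traces[s] *= gamma * lam
            traces[state] += 1.0

            # Every state is updated in proportion to its trace.
            for s in range(num_states):
                values[s] += alpha * td_error * traces[s]

            if done:
                break
            state = next_state

    return values
```

Calling `td_lambda(lam=0.0)` and `td_lambda(lam=0.8)` gives two estimates that can be compared against the exact values for this chain, which are 1/6, 2/6, ..., 5/6 from left to right.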