Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur

Rémi Coulom

Publication year
2002
Citations
5

Abstract

This thesis studies practical methods for estimating value functions with feedforward neural networks in model-based reinforcement learning. It focuses on problems in continuous time and space, such as motor-control tasks. The continuous TD(lambda) algorithm is refined to handle situations with discontinuous states and controls, and the vario-eta algorithm is proposed as a simple but efficient method for performing gradient descent. The main contributions are experimental successes that clearly indicate the potential of feedforward neural networks to estimate high-dimensional value functions. Linear function approximators have often been preferred in reinforcement learning, but successful value-function estimation in previous work has been restricted to mechanical systems with very few degrees of freedom. The method presented in this thesis was tested successfully on an original task in which a simulated articulated robot learns to swim, with 4 control variables and 12 independent state variables, which is significantly more complex than the problems solved so far with linear function approximators.
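
The abstract names vario-eta as the gradient-descent method but does not state its update rule. As a rough illustration only, the sketch below assumes the commonly described formulation in which each weight's step is divided by a running root-mean-square estimate of that weight's gradient. The function name `vario_eta_step` and the parameters `eta`, `beta`, and `eps` are hypothetical choices for this example, not taken from the thesis.

```python
import math


def vario_eta_step(weights, grads, variances, eta=0.01, beta=0.1, eps=1e-12):
    """One per-weight normalized gradient step (assumed formulation).

    variances holds a running average of squared gradients, one per weight.
    Returns the updated weights and the updated running averages.
    """
    new_weights, new_variances = [], []
    for w, g, v in zip(weights, grads, variances):
        # Exponential moving average of the squared gradient for this weight.
        v = (1.0 - beta) * v + beta * g * g
        # Divide the step by the gradient's running RMS, so weights with
        # very different gradient magnitudes move at comparable speeds.
        new_weights.append(w - eta * g / (math.sqrt(v) + eps))
        new_variances.append(v)
    return new_weights, new_variances


# Usage: two weights whose gradients differ by a factor of one million.
w, v = vario_eta_step([1.0, 1.0], [1000.0, 0.001], [0.0, 0.0])
print(w, v)
```

Under this assumption the step size depends on the gradient's sign and recent history rather than its raw magnitude, which is why a single global `eta` can serve weights with very different gradient scales.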

Keywords

Humanities, Physics, Philosophy
