Making reinforcement learning work on real robots

Leslie Pack Kaelbling, William D. Smart

Year of publication
2002
Citation count
93

Abstract

Programming robots is hard. It often takes a great deal of time to fine-tune the many parameters in a typical control algorithm. For some robot tasks, we may not even know a good solution without extensive experimentation. Even when we, as humans, have good intuitions about how to perform a given task, it is often difficult to translate these into the sensor and actuator spaces of the robot. Having the robot learn how to perform a given task is one way of addressing these problems. Specifying what the robot should be doing, and allowing it to fill in the details of how, using learning, is an appealing idea. In general, describing a task at a higher, more behavioral level is easier for humans than having to specify the exact mapping from sensors to actuators that defines a control policy. In particular, reinforcement learning is a very promising paradigm for learning on real robots. However, simply applying existing reinforcement learning techniques will almost certainly lead to failure. Issues such as large, continuous state and action spaces, extremely limited amounts of training data, lack of initial knowledge about the task and environment, and the necessity of keeping the robot physically safe during learning must be explicitly addressed if learning is to succeed. In this dissertation, we identify some of the problems that must be overcome when attempting to implement a reinforcement learning system on a real mobile robot. We discuss some solutions to these problems and present two components that, together, allow us to use reinforcement learning techniques effectively on a real robot. HEDGER is a safe value-function approximation algorithm designed to be used with continuous state and action spaces, and with sparse reward functions. JAQL is our general framework for reinforcement learning on real robots, and deals with the problems of initial knowledge and robot safety. We validate the effectiveness of both components using a variety of simulated and real robot task domains.

Keywords

Reinforcement learning, Robot, Task (project management), Computer science, Robot learning, Artificial intelligence, Programming by demonstration, Action (physics), Human–computer interaction, Control (management)
