Integrating RL and Behavior-Based Control for Soccer
Tucker Balch
- Publication year: 1997
- Citations: 4
Abstract
This paper describes Clay, an evolutionary architecture for autonomous robots that integrates motor schema-based control and reinforcement learning. Robots using this system benefit from the real-time performance of motor schemas in continuous and dynamic environments while taking advantage of adaptive reinforcement learning. Clay coordinates assemblages (groups of motor schemas) using embedded reinforcement learning modules. The coordination modules activate specific assemblages based on the currently perceived situation. Learning occurs as the robot selects assemblages and samples a reinforcement signal over time. Georgia Tech used Clay to configure a soccer team for the RoboCup-97 simulator competition [Kitano et al., 1997]. A simple robot soccer strategy is used to illustrate the utility of the system.

Motor schemas are the reactive component of Arkin's Autonomous Robot Architecture (AuRA) [Arkin and Balch, 1997]. AuRA's design integrates deliberative planning at the top level with behavior-based motor control at the bottom. The lower levels, which execute the reactive behaviors, are incorporated in this research. Individual motor schemas, or primitive behaviors, express separate goals or constraints for a task. For example, important schemas for a navigational task would include avoid obstacles and move to goal. Since schemas are independent, they can run concurrently, providing parallelism and speed.

Sensor input is processed by perceptual schemas embedded in the motor behaviors. Perceptual processing is minimal and provides just the information pertinent to the motor schema. For instance, a find obstacles perceptual schema, which provides a list of sensed obstacles, is embedded in the
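
The coordination scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not Clay's actual code: the schema functions, the perception dictionary, the `QCoordinator` class, the state label, and the choice of tabular Q-learning with epsilon-greedy selection are all assumptions made for the example. The abstract says only that embedded reinforcement learning modules select assemblages.

```python
import math
import random

# Hypothetical motor schemas: each maps a perception dict to a 2-D vector.
def move_to_goal(p):
    """Unit vector pointing from the robot toward the goal."""
    dx, dy = p["goal"][0] - p["pos"][0], p["goal"][1] - p["pos"][1]
    d = math.hypot(dx, dy)
    return (dx / d, dy / d) if d > 0 else (0.0, 0.0)

def avoid_obstacles(p, radius=2.0):
    """Repulsion from each obstacle closer than `radius` (assumed linear falloff)."""
    vx = vy = 0.0
    for ox, oy in p["obstacles"]:
        dx, dy = p["pos"][0] - ox, p["pos"][1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < radius:
            gain = (radius - d) / radius
            vx += gain * dx / d
            vy += gain * dy / d
    return (vx, vy)

def run_assemblage(assemblage, p):
    """An assemblage is a list of (weight, schema); output is the weighted vector sum."""
    vx = vy = 0.0
    for weight, schema in assemblage:
        sx, sy = schema(p)
        vx += weight * sx
        vy += weight * sy
    return (vx, vy)

class QCoordinator:
    """Assumed coordinator: tabular Q-learning over (perceived state, assemblage index)."""

    def __init__(self, n_assemblages, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.n = n_assemblages
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}  # state -> list of Q-values, one per assemblage
        self.rng = random.Random(seed)

    def values(self, state):
        return self.q.setdefault(state, [0.0] * self.n)

    def select(self, state):
        """Epsilon-greedy choice of which assemblage to activate."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n)
        v = self.values(state)
        return v.index(max(v))

    def update(self, state, action, reward, next_state):
        """One-step Q-learning update from a sampled reinforcement signal."""
        v = self.values(state)
        target = reward + self.gamma * max(self.values(next_state))
        v[action] += self.alpha * (target - v[action])
```

In a control loop under these assumptions, the robot would call `select` with its perceived state, run the chosen assemblage with `run_assemblage` to get a motion vector, and later call `update` with whatever reward the task defines (for soccer, perhaps a positive value when the team scores).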