LOCOMOTION

Reinforcement Learning Algorithms in Humanoid Robotics

Duško Katić, Miomir Vukobratović

Year: 2007
Citations: 3
Access: Open access

Abstract

This study considers optimal solutions for the application of reinforcement learning (RL) in humanoid robotics. Humanoid robotics is a very challenging domain for reinforcement learning. Reinforcement learning control algorithms represent a general framework for taking traditional robotics towards true autonomy and versatility. The reinforcement learning paradigm has been successfully implemented for some special types of humanoid robots in the last 10 years. Reinforcement learning is well suited to training biped walking, in particular to teaching a robot a new behavior from scalar or fuzzy feedback. The general goal in the synthesis of reinforcement learning control algorithms is the development of methods that scale to the dimensionality of humanoid robots and can generate actions for bipeds with many degrees of freedom. In this study, the control of walking of active and passive dynamic walkers using reinforcement learning is analyzed. Various straightforward and hybrid intelligent control algorithms based on RL for active and passive biped locomotion are presented. The proposed RL algorithms use learning elements that consist of various types of neural networks, fuzzy logic networks, or fuzzy-neuro networks, with a focus on fast convergence and a small number of learning trials. A special part of the study is the synthesis of hybrid intelligent controllers for biped walking. The hybrid aspect is connected with the application of model-based and model-free approaches, as well as with the combination of different paradigms of computational intelligence. These algorithms combine a dynamic controller based on a dynamic model with special compensators based on reinforcement structures. Two different reinforcement learning structures are proposed, based on the actor-critic approach and on Q-learning. The algorithms are based on fuzzy evaluative feedback obtained from intuitive human balancing knowledge. Reinforcement learning with fuzzy evaluative feedback is much closer to human evaluation of biped walking than the original scheme with scalar feedback.
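
The contrast between scalar and fuzzy evaluative feedback in a Q-learning structure can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify membership functions, state variables, or learning rates, so the function names `fuzzy_reward` and `q_update`, the tilt thresholds, and the values of `alpha` and `gamma` below are all hypothetical choices made for illustration only.

```python
def fuzzy_reward(tilt_rad, good=0.05, bad=0.30):
    """Graded balance evaluation in [0, 1] (hypothetical membership function).

    Returns 1.0 when |tilt| <= good, 0.0 when |tilt| >= bad,
    and falls off linearly in between, instead of a binary fall/no-fall signal.
    """
    t = abs(tilt_rad)
    if t <= good:
        return 1.0
    if t >= bad:
        return 0.0
    return (bad - t) / (bad - good)


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Unseen state-action pairs are treated as having value 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Usage: the graded evaluation of the trunk tilt is fed to the update as the reward.
q_table = {}
actions = ["lean_forward", "lean_back"]  # hypothetical discrete actions
r = fuzzy_reward(0.175)                  # partially balanced posture
q_update(q_table, "s0", "lean_back", r, "s1", actions)
```

With a graded signal of this kind, a posture that is only slightly off balance still yields informative feedback, whereas a scalar success/failure signal would treat it the same as an upright one until the robot falls.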

Keywords

Humanoid robot, Robot, Robotics, Artificial intelligence, Human–computer interaction, Field (mathematics), Computer science, Engineering, Simulation
