An Embodied Model of Learning, Plasticity, and Reward
William H. Alexander, Olaf Sporns
- Publication year: 2002
- Citations: 32
Abstract
We describe and discuss a neural network model of the dopaminergic system based on observed anatomical and physiological properties of the primate midbrain. The model relies on value-dependent synaptic modification to acquire temporal information regarding reward-related events and the stimuli with which such events are paired. Experience-dependent changes in synaptic plasticity allow the model to generate neuromodulatory responses corresponding to prediction errors. These phasic neural responses act as a value signal with positive and negative components, representing the unpredicted occurrence of rewarding stimuli and the omission of an expected reward, respectively. The value signal modulates widespread synaptic changes, including afferent connections of the value system itself. The model is embedded in an autonomous robot, and its behavior is tested as changes are applied to the robot's motor characteristics and as the stimulus content of the environment is varied. We observe the development of the system as a consequence of environmental stimuli and autonomous movement, leading to the conditioning of reward-related behaviors through the interaction between the robot and its surroundings.
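The signed value signal described in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' neural network model: it assumes a single scalar weight and a Rescorla-Wagner-style delta rule, and the names `prediction_error`, `train`, and `learning_rate` are hypothetical. It only shows how an error signal can be positive when a reward is unpredicted, shrink toward zero as the reward becomes predicted, and turn negative when an expected reward is omitted.

```python
# Illustrative sketch only: a scalar delta-rule learner, not the paper's model.
# All names and parameter values here are assumptions made for the example.

def prediction_error(reward, predicted):
    """Signed error: positive for unpredicted reward, negative for omission."""
    return reward - predicted


def train(trials, learning_rate=0.2):
    """Run trials of (stimulus, reward) pairs; return final weight and errors."""
    weight = 0.0  # strength of the stimulus -> reward prediction
    errors = []
    for stimulus, reward in trials:
        predicted = weight * stimulus
        delta = prediction_error(reward, predicted)
        # The error signal gates the weight change (value-dependent update).
        weight += learning_rate * delta * stimulus
        errors.append(delta)
    return weight, errors


# 30 rewarded presentations of the stimulus, then one omitted reward.
trials = [(1.0, 1.0)] * 30 + [(1.0, 0.0)]
weight, errors = train(trials)

print(f"first trial error: {errors[0]:+.3f}")   # unpredicted reward
print(f"late trial error:  {errors[29]:+.3f}")  # reward now predicted
print(f"omission error:    {errors[-1]:+.3f}")  # expected reward withheld
```

In this toy setting the error on the first trial is large and positive, the error late in training is close to zero, and the final omission trial yields a large negative error, mirroring the qualitative pattern the abstract attributes to the model's phasic responses.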