LEARNING

Continuous reinforcement learning via advantage value difference reward shaping: A proximal policy optimization perspective

Jiawei Lin, Xuekai Wei, Weizhi Xian, Jielu Yan, Leong Hou U, Zhaowei Shang, Mingliang Zhou

Publication year
2025
Citation count
6

Keywords

Computer science, Reinforcement learning, Perspective (graphical), Temporal difference learning, Value (mathematics), Reinforcement, Artificial intelligence, Mathematical optimization, Machine learning, Social psychology
