Integrating Reason-Based Moral Decision-Making in the Reinforcement Learning Architecture
Lisa Dargasz
- Publication year
- 2025
- Access
- Open access
Abstract
Reinforcement Learning is a machine learning methodology that has demonstrated strong performance across a variety of tasks. In particular, it plays a central role in the development of artificial autonomous agents. As these agents become increasingly capable, market readiness is rapidly approaching, which means that these agents, for example in the form of humanoid robots or autonomous cars, are poised to transition from laboratory prototypes to autonomous operation in real-world environments. This transition raises concerns that lead to specific requirements for these systems, among them the requirement that they be designed to behave ethically. Crucially, research directed toward building agents that fulfill this requirement, referred to as artificial moral agents (AMAs), has to address a range of challenges at the intersection of computer science and philosophy. This study explores the development of reason-based artificial moral agents (RBAMAs). RBAMAs are built on an extension of the reinforcement learning architecture that enables moral decision-making based on sound normative reasoning. This is achieved by equipping the agent with the capacity to learn, through case-based feedback, a reason-theory: a theory that enables it to process morally relevant propositions to derive moral obligations. RBAMAs are designed to adapt their behavior to ensure conformance to these obligations while they pursue their designated tasks. These features contribute to the moral justifiability of their actions, their moral robustness, and their moral trustworthiness, which positions the extended architecture as a concrete and deployable framework for the development of AMAs that fulfills key ethical desiderata. This study presents a first implementation of an RBAMA and demonstrates the potential of RBAMAs in initial experiments.