Ethical AI Systems and Shared Accountability: The Role of Economic Incentives in Fairness and Explainability

Caterina Giannetti; Yoo Dae-Hyun
2024-01-01

Abstract

This paper presents a principal-agent model for aligning artificial intelligence (AI) behaviors with human ethical objectives. In this framework, the end-user acts as the principal, offering a contract to the system developer (the agent) that specifies desired levels of ethical alignment for the AI system. The developer can exercise varying levels of effort to achieve this alignment, with higher levels - such as those required in Constitutional AI - demanding more effort and posing greater challenges. To incentivize the developer to invest more effort in aligning AI with higher ethical principles, appropriate compensation is necessary. When ethical alignment is unobservable and the developer is risk-neutral, the optimal contract achieves the same alignment and expected utilities as when it is observable. For observable alignment, a fixed reward is uniquely optimal for strictly risk-averse developers, while for risk-neutral developers, it remains one of several optimal solutions. This simple model demonstrates that balancing responsibility between users and developers is crucial for fostering ethical AI. Users seeking higher ethical alignment must not only compensate developers adequately but also adhere to design specifications and regulations to ensure the system's ethical integrity.
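For readers unfamiliar with the contract-theory machinery the abstract invokes, a minimal sketch of the standard moral-hazard program is given below. The notation and functional forms are conventional assumptions, not taken from the paper itself: x denotes the realized ethical-alignment outcome, w(x) the user's payment schedule, e the developer's effort, c(e) its cost, u the developer's utility over pay, and \bar{U} the developer's outside option.

\max_{e,\, w(\cdot)} \; \mathbb{E}\!\left[\, x - w(x) \,\middle|\, e \,\right]

\text{s.t.} \quad \mathbb{E}\!\left[\, u(w(x)) \,\middle|\, e \,\right] - c(e) \;\ge\; \bar{U} \qquad \text{(participation)}

\phantom{\text{s.t.}} \quad e \in \arg\max_{e'} \; \mathbb{E}\!\left[\, u(w(x)) \,\middle|\, e' \,\right] - c(e') \qquad \text{(incentive compatibility, binding only when effort is unobservable)}

Under these standard assumptions, the abstract's results follow the classical logic: with a risk-neutral developer (u(w) = w), the contract w(x) = x - k, i.e. selling the alignment outcome to the developer for a fixed fee k, makes the developer the residual claimant, so the unobservable-effort program attains the same alignment and expected utilities as the observable (first-best) case. With a strictly risk-averse developer and observable alignment, optimal risk sharing implies full insurance, so a fixed reward is uniquely optimal; with a risk-neutral developer, any schedule with the right expected payment performs equally well, which is why the fixed reward is then only one of several optima.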
Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1288593