Abstract
Generative AI systems pose a range of ethical and social risks that require comprehensive evaluation. In this paper, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. The approach builds on a system safety perspective, in particular the insight that context determines whether a given capability causes harm. Applied to AI, this perspective shows that evaluation at multiple layers of analysis is needed to assess whether an AI system may cause a given harm. Sociotechnical evaluation begins with the main current approach to safety evaluation, capability testing, which considers AI system outputs and their components in isolation. Building on this, it accounts for two further layers of context. The second layer centres on the human interacting with the system, assessing harm that may occur at the point of use. The third, systemic layer evaluates broader impacts on the structures into which an AI system is embedded. The second main contribution of this paper is an overview of the current state of sociotechnical evaluation. Reviewing the safety evaluations for social and ethical risks currently available across industry and academia, we map gaps in the status quo. We then lay out a roadmap of tractable steps toward closing the identified gaps.
Authors
Laura Weidinger, Nahema Marchal, Maribeth Rauh, Arianna Manzini, Lisa Anne Hendricks, Juan Mateos-Garcia, Stevie Bergman, Conor Griffin, Ben Bariach, Iason Gabriel, Verena Rieser, William Isaac
Venue
arXiv