"Balance" by Arthur B. Davies (1898). Courtesy of Smithsonian Open Access.
Noyce Conference Room
Studio

All day

 

Our campus is closed to the public for this event.

AI evaluation is increasingly relied upon to shape AI system development, guide model selection, and inform public policy and regulation. To perform these functions well, a well-grounded and principled approach to AI measurement and assessment is needed.

This Studio brings together a broad group of practitioners, scientists, and theorists to lay the foundations for an evaluation science of generative AI systems. We aim to use complexity science and metrology as lenses for shaping a more scientifically grounded approach to measuring AI systems. Drawing on expertise from the behavioral, social, and computer sciences, as well as the worlds of practice and policy, this workshop seeks to adapt and standardize evaluation approaches from related fields. We will also identify the ways in which AI presents unique measurement challenges, and explore how, despite the open-endedness and complexity of these systems, evaluation can be made tractable. Our goal is to chart a methodological toolkit and process toward more robust measurements of AI systems, their capabilities, interactions, and impacts.

This Studio is made possible by generous grants from the Siegel Family Endowment and the Omidyar Network.

Organizers

William Isaac, Principal Scientist at Google DeepMind
Kristian Lum, Research Scientist at Google DeepMind
William Tracy, Vice President for Applied Complexity, SFI
Laura Weidinger, Staff Research Scientist at Google DeepMind
