Testing AI's Implicit World Models
Keyon Vafa
Harvard University
(hosted by Krishna Gummadi)
05 Mar 2026, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
CIS@MPG Colloquium
Real-world AI systems must be robust across a wide range of conditions. One
path to such robustness is for a model to recover a coherent
structural understanding of its domain. But it is unclear how to measure, or
even define, structural understanding. This talk will present
theoretically grounded definitions and metrics that test the structural
recovery — or implicit "world models" — of generative models. We will
propose different ways to formalize the concept of a world model, develop tests
based on these notions, and apply them across domains. In
applications ranging from testing whether LLMs apply logic to whether
foundation models acquire Newtonian mechanics, we will see that
models can make highly accurate predictions with incoherent world models. We
will also connect these tests to a broader agenda of building
generative models that are robust across downstream uses, incorporating ideas
from statistics and the behavioral sciences. Developing
reliable inferences about model behavior across tasks offers new ways to assess
and improve the efficacy of generative models.