Events 2026

Pushing the Boundary on Automated Modular Floating-Point Verification

Rosa Abbasi Max Planck Institute for Software Systems
27 Mar 2026, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
Floating-point numbers are commonly used to represent real numbers in computer systems and appear in many domains, including embedded systems, machine learning, and scientific computing. Despite their widespread use, they pose difficulties: floating-point operations typically suffer from roundoff errors, making floating-point computations inaccurate with respect to a real-valued specification. Moreover, the IEEE 754 floating-point standard, a fundamental element in formalizing floating-point arithmetic for today’s computers, presents additional challenges due to special values and the resulting unintuitive behaviors. This thesis has three main contributions that address existing gaps in automated reasoning about floating-point arithmetic, making it easier for developers and researchers to understand, verify, and trust the floating-point computations in their programs.

First, we introduce the first floating-point support in a deductive verifier for the Java programming language. Our support in the KeY verifier automatically handles floating-point arithmetic and transcendental functions. We achieve this with a combination of delegation to external SMT solvers on the one hand, and rule-based reasoning within KeY on the other, exploiting the complementary strengths of both approaches. As a result, this approach can prove functional floating-point properties for realistic programs.

Second, inspired by KeY’s treatment of method calls and the need for a scalable roundoff error analysis, we present the first modular optimization-based roundoff error analysis for non-recursive procedural floating-point programs. Our key idea is to achieve modularity while maintaining reasonable accuracy by automatically computing procedure summaries that are a function of the input parameters. Technically, we extend an existing optimization-based roundoff error analysis and show how to effectively use first-order Taylor approximations to compute precise procedure summaries, and how to integrate them to obtain end-to-end roundoff error bounds.

Third, our experience using SMT solvers to discharge KeY’s floating-point verification conditions revealed unexpected performance behavior, motivating a systematic study of floating-point reasoning in SMT solvers. We propose a metamorphic testing approach that uses semantics-preserving rewrite rules, focusing on floating-point special values, to uncover unexpected performance behavior in SMT solvers’ handling of floating-point formulas, such as an increase in solving time when the SMT queries are simplified. Using real-world test inputs, our approach identifies such performance bugs for every SMT solver tested.
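
As a minimal illustration of the two pitfalls mentioned above (not an excerpt from the thesis, which targets Java programs verified with KeY), the following Python snippet shows a roundoff error and the unintuitive behavior of IEEE 754 special values:

```python
# Illustrative only: the thesis works on Java programs and KeY,
# but the same IEEE 754 behavior is visible from Python.

# Roundoff: 0.1 and 0.2 have no exact binary representation, so the
# computed sum deviates from the real-valued specification 0.3.
s = 0.1 + 0.2
print(s == 0.3)                      # False
print(abs(s - 0.3))                  # ~5.5e-17, the roundoff error of this expression

# Special values: NaN and infinity break ordinary algebraic reasoning.
nan = float("nan")
print(nan == nan)                    # False: NaN is not equal to itself
print(float("inf") - float("inf"))   # nan, not 0
```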

AI Ecosystems: Structure, Strategy, Risk and Regulation

Benjamin Laufer Cornell Tech
(hosted by Manuel Gomez Rodriguez)
13 Mar 2026, 12:07 pm - 1:07 pm
Kaiserslautern building G26, room 111
CIS@MPG Colloquium
Machine learning (ML) and artificial intelligence (AI) are not standalone artifacts: they are ecosystems where foundation models are adapted and deployed through layered pipelines spanning developers, platforms, users and regulators. This talk explores how the structure of these ecosystems shapes the distribution of value and risk, and determines system-level properties like safety and fairness. I begin with a game-theoretic model of the interaction between general-purpose producers and domain specialists, using it to examine how regulatory design shapes incentives and equilibrium behaviors. I then connect these formal insights to empirical measurements from 1.86 million open-source AI models, reconstructing lineage networks to quantify how behaviors and failures propagate through fine-tuning. Finally, zooming in from the aggregate structure of the ecosystem to the design of the algorithms themselves, I describe my work in algorithmic fairness, framing the identification of less discriminatory algorithms as a search problem with provable statistical guarantees. I close by outlining a forward-looking research agenda aimed at building both the technical infrastructure and policy mechanisms required to steer AI ecosystems toward robust, accountable and democratic outcomes.

Testing AI's Implicit World Models

Keyon Vafa Harvard University
(hosted by Krishna Gummadi)
05 Mar 2026, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
CIS@MPG Colloquium
Real-world AI systems must be robust across a wide range of conditions. One path to such robustness is for a model to recover a coherent structural understanding of its domain. But it is unclear how to measure, or even define, structural understanding. This talk will present theoretically-grounded definitions and metrics that test the structural recovery — or implicit "world models" — of generative models. We will propose different ways to formalize the concept of a world model, develop tests based on these notions, and apply them across domains. In applications ranging from testing whether LLMs apply logic to whether foundation models acquire Newtonian mechanics, we will see that models can make highly accurate predictions with incoherent world models. We will also connect these tests to a broader agenda of building generative models that are robust across downstream uses, incorporating ideas from statistics and the behavioral sciences. Developing reliable inferences about model behavior across tasks offers new ways to assess and improve the efficacy of generative models.

Infinite Groups: An Interplay of Automata, Logic, and Algebra

Ruiwen Dong University of Oxford
(hosted by Joël Ouaknine)
03 Mar 2026, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
CIS@MPG Colloquium
Algorithmic problems in infinite groups arise naturally in the analysis of computational models, and they forge a fundamental link between algebra, logic, and computation. The central aim of my research is to understand the boundary between decidability and undecidability in computational algebra and group theory. Despite significant advances in the past 80 years, essential questions on the nature of these boundaries remain largely unanswered. The decision problems that we consider (the Diophantine Problem, membership problems, intersection problems) are particularly relevant to algorithmic verification, automata theory, and program analysis. Our methodology relates algorithms in groups to other central problems in infinite state systems, first-order arithmetic theories, and computer algebra. In this talk, I will illustrate a few recent results that expand the known decidability frontier, as well as some immediate and impactful open problems.
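
As a point of reference for the terminology (a standard textbook formulation, not specific to the speaker's results), the subgroup membership problem can be stated as follows:

```latex
% Subgroup membership problem (standard formulation; G is a fixed infinite
% group, e.g. a group of integer matrices, and elements are given explicitly).
\textbf{Membership problem.}\quad
\text{Given } g, g_1, \dots, g_n \in G,\ \text{decide whether }
g \in \langle g_1, \dots, g_n \rangle,
\text{ i.e., whether } g \text{ lies in the subgroup generated by } g_1, \dots, g_n.
```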

Security: A Next Frontier in AI Coding

Jingxuan He UC Berkeley
(hosted by Thorsten Holz)
02 Mar 2026, 10:00 am - 11:00 am
Bochum building MPI-SP, room MB/1-84/90
CIS@MPG Colloquium
AI is reshaping software development, yet this rapid adoption risks introducing a new generation of security debt. In this talk, I will present my research program aimed at transforming AI from a source of vulnerabilities to a security enabler. I will begin by introducing a systematic framework for quantifying AI-induced cybersecurity risks through two benchmarks: CyberGym, which evaluates AI agents’ offensive capabilities in vulnerability reproduction and discovery, and BaxBench, which measures LLMs’ propensity to introduce security flaws when generating code. Building on these findings, I will present a secure-by-design approach for AI-generated code. This includes security-centric fine-tuning that embeds secure coding practices directly into models, as well as a decoding-time constraining mechanism based on type systems to enforce safety guarantees. Finally, I will conclude by discussing my future research on building broader security and trust in AI-driven software ecosystems.

Designing for Society: AI in Networks, Markets, and Platforms

Ana-Andreea Stoica Max Planck Institute for Intelligent Systems, Tübingen
(hosted by Manuel Gomez Rodriguez)
26 Feb 2026, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
CIS@MPG Colloquium
AI systems increasingly reshape our networks, markets, and platforms. When deployed in social environments—online platforms, labor markets, and information ecosystems—AI interacts with complex human behavior, strategic incentives, and structural inequality. This talk focuses on foundational challenges and opportunities for AI systems: how to design and evaluate algorithmic interventions in complex social environments. I will present recent work on causal inference under competing treatments, which formalizes how competition for user attention and strategic behavior among experimenters distort experimental data and invalidate naïve estimates of algorithmic impact. By modeling experimentation as a strategic data acquisition problem, we show how evaluation itself becomes an optimization problem, and we derive mechanisms that recover meaningful estimates despite interference and competition. I connect this problem to deriving foundational properties of systems that enable reliable experimentation. Beyond this case study, the talk highlights broader implications for the design and evaluation of AI systems in networks, markets, and platforms. I argue that responsible deployment requires rethinking evaluation methodologies to account for incentives, feedback loops, and system-level effects, and I outline directions for how algorithmic and statistical tools can tackle these challenges.

Foundations for Applied Cryptography: Post-quantum Authentication in Modern Applications

Michael Reichle ETH Zurich
(hosted by Catalin Hritcu)
24 Feb 2026, 10:00 am - 11:00 am
Bochum building MPI-SP, room MB/1-84/90
CIS@MPG Colloquium
Modern cryptography must navigate the transition to post-quantum security while meeting the demands of real-world systems. Due to the tension between post-quantum security and performance, this transition entails complex challenges on several fronts. In this talk, I will discuss the intricacies of provable security and how to establish a solid theory for cryptographic building blocks that underpin our digital society. First, I will briefly cover my past work on searchable encryption and zero-knowledge proofs. In the main part of the talk, I will highlight my research on tools for advanced authentication in a post-quantum world. In particular, I will present my recent work on threshold signatures and blind signatures and discuss how to achieve stronger security guarantees under weak assumptions. Finally, I will point out some pressing open problems.

Uncovering the Mechanics of Vision AI Model Failures: Textures and Beyond

Blaine Hoak University of Wisconsin-Madison
(hosted by Christof Paar)
16 Feb 2026, 10:00 am - 11:00 am
Bochum building MPI-SP, room MB/1-84/90
CIS@MPG Colloquium
Artificial Intelligence (AI) models now serve as core components of a range of mature applications but remain vulnerable to a wide spectrum of attacks. Yet, the research community has yet to develop a systematic understanding of model vulnerability. In this talk, I approach uncovering the mechanics of model failure from two complementary perspectives: the design of attack techniques and the features models exploit. First, I introduce The Space of Adversarial Strategies, a robustness evaluation framework constructed through a decomposition and reformulation of current attacks. With this, I isolate the components that drive attack success and provide insights for future defenses. Motivated by the widespread failure observed, I then turn to the feature space, where I uncover differences between visual processing in AI models and the human visual system that explain failures in AI systems. My work reveals that textures, or repeated patterns, are a core mechanism for driving model generalization, yet are also a primary source of vulnerability. I present new methodologies to quantify a model’s bias toward texture, uncover learned associations between textures and objects, and identify textures in images. With this, I find that up to 90% of failures can be explained by mismatches in texture information, highlighting texture as an important, yet overlooked, influence on model robustness. I conclude by outlining future work for addressing trustworthiness issues in both classification and generative settings, with particular attention to (mis)alignment between biological and artificial intelligence.

Building Private, Secure and Transparent Digital Identity at Scale

Harjasleen Malvai University of Illinois Urbana–Champaign
(hosted by Peter Schwabe)
12 Feb 2026, 10:00 am - 11:00 am
Bochum building MPI-SP
CIS@MPG Colloquium
Digital identity controls access to many everyday essentials, from getting paid to accessing banking and benefits, and sits in the critical path of modern security. Even end-to-end encrypted messaging depends on authentic cryptographic identity, i.e., a trustworthy way to learn the cryptographic keys needed to encrypt messages to the right person. In practice, centralized identity providers become both a single point of failure and a single point of control: they decide what assertions are supported, and their compromise or coercion enables targeted attacks that are hard to detect. Yet, many strong proposals assume new ecosystems or significant user effort – assumptions that don’t hold for the systems and users we have today.

My thesis is that it is possible to make identity infrastructure more private, secure, and transparent at scale while designing for existing user and ecosystem constraints. I’ll present two case studies across the identity lifecycle.

First, I’ll discuss key transparency for end-to-end encrypted messaging: how to make centralized key directories auditable so that even a compromised server cannot quietly swap keys for targeted users. I’ll show how this line of work evolved from formal privacy and history guarantees (SEEMless) to a billion-user architecture (Parakeet) built for real operational constraints (such as distributed storage and long time horizons), which now underpins the key transparency deployments in WhatsApp and Facebook Messenger.
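
To give a flavor of the underlying idea, here is a deliberately simplified Python sketch of equivocation detection in key transparency (not the SEEMless or Parakeet constructions, which use Merkle-style authenticated data structures with privacy guarantees and efficient proofs):

```python
import hashlib
import json

def commit(directory: dict) -> str:
    """Commitment to one epoch of a key directory (illustration only)."""
    return hashlib.sha256(json.dumps(sorted(directory.items())).encode()).hexdigest()

# The view of the directory the server shows to most clients in this epoch ...
public_view = {"alice": "pk_A", "bob": "pk_B"}
# ... and a doctored view served only to a targeted user.
targeted_view = {"alice": "pk_EVIL", "bob": "pk_B"}

# Clients and auditors compare the single commitment published for the epoch;
# serving inconsistent views yields mismatching commitments and is detected.
assert commit(public_view) != commit(targeted_view)
print(commit(public_view)[:16], "!=", commit(targeted_view)[:16])
```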

Second, I’ll briefly describe credential bootstrapping with accountability (CanDID): a path to establishing privacy-preserving credentials from existing web sources without assuming a fully mature verifiable-credential ecosystem, while supporting practical requirements like revocation, recovery, and compliance checks.

I’ll close by highlighting ongoing work and open problems motivated by these systems and sketch a research agenda for building auditable and privacy-preserving infrastructure at internet scale, for identity and beyond.

Beyond Static Alignment: Advancing Trustworthy and Socially Intelligent AI Assistants

Jieyu Zhao University of Southern California
(hosted by Abhilasha Ravichander, Asia Biega)
11 Feb 2026, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Large language models have transformed how we interact with technology, but most deployed systems remain reactive and rely on static, one-size-fits-all alignment, limiting trust in real-world, high-stakes settings. This talk explores a path toward personalized, trustworthy AI assistants that can reason, continually adapt, and align with user values while remaining safe and socially appropriate. I will introduce Computer-Using Agents that combine GUI operations and code generation to efficiently complete real-world tasks, and present CoAct-1, a multi-agent system that coordinates planning and execution. I will then discuss SEA, a black-box auditing algorithm for uncovering LLM knowledge deficiencies and probing failure modes such as hallucination under limited query budgets. Next, I will present WildFeedback, a framework that learns in-situ user preferences from natural, multi-turn interactions, enabling continual personalization beyond lab-style preference data. Finally, I will highlight ongoing work on proactive social intelligence and culturally grounded evaluation, spanning intention understanding, reasoning consistency, and value-aligned collaboration. Together, these advances move us closer to AI systems that don’t just respond, but adapt responsibly and assist people in ways that are reliable, equitable, and context-aware.

Towards More Trustworthy and Efficient Systems

Hugo Lefeuvre University of British Columbia
(hosted by Gilles Barthe, Deepak Garg)
10 Feb 2026, 10:00 am - 11:00 am
Bochum building MPI-SP
CIS@MPG Colloquium
Software usage grows much faster than computing hardware. Paradoxically, modern software systems make inefficient use of computing resources: decades of feature creep have made them too generic to perform well on any specific task. These decades of growth have also made systems fragile -- and frankly insecure, glued together from countless components of diverse origins, critical or confidential, buggy, risky, AI-generated, or otherwise untrustworthy. This talk will take the audience on a journey at the intersection of systems and security. I will give an overview of my past and present works applying isolation and specialization techniques to make systems more trustworthy and more efficient (Unikraft, FlexOS, CHERIoT), demonstrating the shortcomings of these techniques and how to address them (CIVs, SoK), and getting these advances deployed to better the real world. I will conclude with a forward-looking perspective on my research and impact plans towards achieving this vision of more robust and efficient software systems.

Modern Challenges in Learning Theory

Nataly Brukhim Institute for Advanced Study and the Center for Discrete Mathematics and Theoretical Computer Science
(hosted by Derek Dreyer)
09 Feb 2026, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
CIS@MPG Colloquium
Machine learning relies on its ability to generalize from limited data, yet a principled theoretical understanding of generalization remains incomplete. While binary classification is well understood in the classical PAC framework, even its natural extension to multiclass learning is substantially more challenging. In this talk, I will present recent progress in multiclass learning that characterizes when generalization is possible and how much data is required, resolving a long-standing open problem on extending the Vapnik–Chervonenkis (VC) dimension beyond the binary setting. I will then turn to complementary results on efficient learning via boosting. We extend boosting theory to multiclass classification, while maintaining computational and statistical efficiency even for unbounded label spaces. Lastly, I will discuss generalization in sequential learning settings, where a learner interacts with an environment over time. We introduce a new framework that subsumes classically studied settings (bandits and statistical queries) together with a combinatorial parameter that bounds the number of interactions required for learning.
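
For context, the classical binary-classification fact being generalized here is that the VC dimension d of a hypothesis class characterizes PAC learnability; a standard statement of the realizable-case sample complexity is sketched below (a textbook bound, not a result from the talk):

```latex
% Realizable PAC learning of a binary hypothesis class with VC dimension d:
% m(\varepsilon,\delta) samples suffice for error at most \varepsilon
% with probability at least 1-\delta, and nearly this many are necessary.
m(\varepsilon,\delta) \;=\; O\!\left(\frac{d\,\log(1/\varepsilon)+\log(1/\delta)}{\varepsilon}\right),
\qquad
m(\varepsilon,\delta) \;=\; \Omega\!\left(\frac{d+\log(1/\delta)}{\varepsilon}\right).
```

The multiclass results in the talk concern which dimension plays the role of d once the label space is no longer binary.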

Symmetry in Neural Network Parameter Spaces

Bo Zhao University of California, San Diego
(hosted by Bernt Schiele)
05 Feb 2026, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
In many neural networks, different parameter values can yield the same loss, often due to underlying symmetries in the parameter space. We introduce a general framework for continuous symmetries based on equivariance in activation functions, revealing a new set of nonlinear, data-dependent symmetries. Using these symmetries, we derive topological properties of the minima and identify conserved quantities in gradient flows. As a practical application of parameter space symmetries, we present an algorithm called symmetry teleportation, which leverages parameter symmetries to search loss level sets for points with desired properties. This approach leads to improvements in both convergence and generalization. We also discuss future work on foundational understanding of deep learning that reveals, characterizes, and leverages hidden structures to shape optimization and model behavior, enabling more efficient and interpretable AI systems.
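
A classical, well-known instance of such a parameter-space symmetry (not the new data-dependent symmetries introduced in the talk) is positive rescaling in ReLU networks; the Python sketch below checks numerically that rescaled parameters realize the same function:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

def f(W1, W2, x):
    # A tiny two-layer ReLU network f(x) = W2 @ relu(W1 @ x).
    return W2 @ relu(W1 @ x)

W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(4, 16))
x = rng.normal(size=(8,))

# Because relu(c * z) = c * relu(z) for c > 0, the rescaled parameters
# (c * W1, W2 / c) realize exactly the same function and hence the same
# loss: a continuous symmetry of the parameter space.
c = 3.7
assert np.allclose(f(W1, W2, x), f(c * W1, W2 / c, x))
print("identical outputs under the rescaling symmetry")
```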

The Skolem Problem: a century-old enigma at the heart of computation

Joël Ouaknine Max Planck Institute for Software Systems
04 Feb 2026, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
It has been described as the most important problem whose decidability is still open: the Skolem Problem asks how to determine algorithmically whether a given integer linear recurrence sequence (such as the Fibonacci numbers) has a zero term. This deceptively simple question arises across a wide range of topics in computer science and mathematics, from program verification and automata theory to number theory and logic. This talk traces the history of the Skolem Problem: from the early 1930s to the current frontier of one of the most enduring open questions in computer science.
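
To make the problem statement concrete, here is a small illustrative Python sketch (not a decision procedure, since none is known): it searches a finite prefix of an integer linear recurrence sequence for a zero term and can therefore only certify "yes" answers.

```python
def first_zero(coeffs, initial, horizon=10_000):
    """Search the first `horizon` terms of the integer linear recurrence
    u(n+k) = coeffs[0]*u(n+k-1) + ... + coeffs[k-1]*u(n) for a zero term.
    Returns the index of the first zero, or None if none was found in the
    window; deciding the general question is the (open) Skolem Problem."""
    u = list(initial)
    for n, term in enumerate(u):
        if term == 0:
            return n
    for n in range(len(u), horizon):
        nxt = sum(c * u[-(i + 1)] for i, c in enumerate(coeffs))
        if nxt == 0:
            return n
        u.append(nxt)
    return None

print(first_zero([1, 1], [1, 1]))    # None: no zero among the first 10000 Fibonacci numbers
print(first_zero([2, -1], [3, 2]))   # 3: the sequence 3, 2, 1, 0, ... hits zero
```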

Toward Practical and Scalable Systems Evaluation for Post-Moore Datacenters

Hejing Li Max Planck Institute for Software Systems
02 Feb 2026, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Proposal
Having a solid system evaluation under realistic workloads and environments is essential for datacenter network research. Modern datacenter systems are shaped by a wide range of factors - including hardware behavior, software components, and complex interactions between them - whose combined effects on end-to-end performance are often difficult to predict. However, evaluating such systems on physical testbeds is frequently infeasible due to scale, cost, limited experimental control, and the increasing reliance on specialized hardware that may be unavailable or still under development. As a result, researchers often turn to simulation. Existing simulators, however, typically focus on isolated components, such as network protocols, host architectures, or hardware RTL, making it challenging to conduct faithful end-to-end evaluations or to scale experiments to realistic datacenter sizes. The goal of this thesis is to provide researchers with practical and scalable tools and methodologies for conducting faithful end-to-end system evaluation targeting modern datacenters. To this end, this thesis is structured around three components. First, it introduces SimBricks, an end-to-end simulation framework that enables the modular composition of best-of-breed simulators, allowing unmodified hardware and software system implementations to be evaluated together within a single virtual testbed. Second, it presents SplitSim, a simulation framework designed to make large-scale end-to-end evaluation practical by supporting mixed-fidelity simulation, controlled decomposition, and efficient resource utilization. Finally, the thesis will include a set of case studies that apply SplitSim to the evaluation of large-scale networked systems, demonstrating a concrete evaluation workflow and distilling lessons on navigating trade-offs between fidelity, scalability, and simulation cost.

Towards a Privacy-First Future for Wearable AI

Shwetha Rajaram University of Michigan
(hosted by Carmela Troncoso)
02 Feb 2026, 10:00 am - 11:00 am
Bochum building MPI-SP, room MB/1-84/90
CIS@MPG Colloquium
Wearable AI systems, such as display-free smartglasses and augmented reality (AR) devices, are emerging as a new paradigm of personal computing. However, because they rely on multimodal data that reveals people's identities, activities, and surroundings, they introduce novel privacy harms for both users and bystanders. In this talk, I demonstrate that by building wearable AI systems for adaptation – enabling users and systems to dynamically adjust interactions across contexts – we can mitigate privacy risks while supporting meaningful use. First, through co-design studies with AR and security & privacy researchers, I establish a design space of alternatives to traditional interactions that reduce data exposure while maintaining functionality. To embed this privacy mindset in designers’ workflows, I develop interactive tools for critically analyzing risks in their own applications. Finally, I investigate how to automatically balance user experience and privacy for wearable AI users and bystanders, proposing an optimization framework for system-driven "negotiations" of sensing capabilities. I conclude by discussing open challenges and future research towards this vision of privacy-adaptive wearable AI, from technical mechanisms to implications for policy and regulation.

Pushing the Boundaries in Stateless Model Checking

Iason Marmanis Max Planck Institute for Software Systems
28 Jan 2026, 3:30 pm - 4:30 pm
Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
Stateless model checking (SMC) verifies a concurrent program by systematically exploring its state space. To combat the state-space explosion problem, SMC is frequently combined with Dynamic Partial Order Reduction (DPOR), a technique that avoids exploring executions that are deemed equivalent to one another. Still, DPOR’s scalability is limited by the size of the input program.

This thesis improves scalability by (i) providing direct support for common coding patterns that would otherwise have to be handled inefficiently, and (ii) combining DPOR with other state-space reduction techniques. Key to our contributions is a DPOR framework that generalizes an existing state-of-the-art algorithm.
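
As a toy illustration of the redundancy that DPOR-style techniques avoid (not the thesis' algorithms), the Python sketch below enumerates all interleavings of two threads whose operations touch different variables; every interleaving reaches the same final state, so exploring one representative per equivalence class suffices:

```python
def run(schedule):
    """Execute a schedule of (thread, variable, value) writes; return the final state."""
    state = {"x": 0, "y": 0}
    for _thread, var, val in schedule:
        state[var] = val
    return tuple(sorted(state.items()))

def interleavings(a, b):
    """All merges of op lists a and b that preserve each thread's program order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Thread 1 writes x twice, thread 2 writes y once: the writes are independent.
t1 = [("t1", "x", 1), ("t1", "x", 2)]
t2 = [("t2", "y", 1)]

schedules = list(interleavings(t1, t2))
outcomes = {run(s) for s in schedules}
print(len(schedules), "interleavings,", len(outcomes), "distinct final state(s)")
# 3 interleavings, 1 distinct final state: a DPOR-based checker explores one
# representative per equivalence class instead of all interleavings.
```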

Building Robotics Foundation Models with Reasoning in the Loop

Jiafei Duan University of Washington
(hosted by Christian Theobalt)
28 Jan 2026, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
CIS@MPG Colloquium
Recent advances in generative AI have demonstrated the power of scaling: large language and vision models trained on internet-scale data now exhibit remarkable capabilities in perception, generation, and reasoning. These successes have inspired growing interest in bringing foundation-model paradigms to robotics, with the goal of moving beyond task-specific autonomy in constrained environments toward general-purpose robots that can operate robustly in open-world settings. However, robotics fundamentally differs from language and vision. Robot learning cannot rely on passive internet data at scale, and collecting large-scale, high-quality embodied interaction data remains expensive and slow. As a result, simply scaling data and model parameters is insufficient. To build general-purpose and robust robotics foundation models, we must instead ask: how can robots learn more from less data—and continue to improve over time? In this talk, I argue that reasoning in the loop offers a promising path forward. Rather than treating reasoning as a downstream capability applied after learning, I show how reasoning can be integrated directly into the learning process itself. This enables robots to learn from structured feedback, temporal context, and failure, thereby compensating for data scarcity and improving generalization. I will present a unified research agenda along three axes. First, I introduce approaches for spatial reasoning, enabling robots to ground language in 3D space and reason about object relationships for precise manipulation. Second, I discuss temporal reasoning, focusing on memory-centric models that retain, query, and reason over past observations to support long-horizon, high-precision control. Third, I show how reasoning over failures allows robots to understand why actions fail and use that understanding to self-improve, increasing robustness without additional supervision. Together, these results reframe robotics foundation models as systems that learn through reasoning, closing the loop between perception, action, and structured inference to enable self-improving autonomy.

On Predictive Accuracy and Fairness in Human-AI Teams

Nastaran Okati Max Planck Institute for Software Systems
26 Jan 2026, 1:30 pm - 2:30 pm
Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
Human-AI teams are increasingly deployed across high-stakes domains such as medical diagnosis, content moderation, and recruitment. These teams strive to leverage the strengths of both human decision-makers and AI models, while mitigating their respective weaknesses. For instance, such hybrid systems can combine the efficiency and quality of AI-generated decisions with human experience and domain knowledge. Moreover, they can be free of statistical biases in AI models and cognitive biases inherent to human decision-making. The combined system's performance can hence surpass that of the human or the AI model in isolation, achieving human-AI complementarity.

Despite the prevalence of such hybrid teams, most existing approaches remain heuristic-driven, ignore human users and their error models, and hence fail to optimize overall human-AI team performance. My thesis strives to close these gaps and fully unlock this potential for complementarity by (i) optimizing for human-AI collaboration, rather than model-centric performance—ensuring these models best support and complement human decision-makers who utilize them—and (ii) designing efficient post-processing algorithms that ensure fairness of high-stakes decisions made by these teams—thereby supporting people who are affected by such decisions.
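
As a minimal, purely illustrative sketch of human-AI complementarity (not the algorithms developed in the thesis), consider a team in which the model predicts only when its confidence is high and defers to the human otherwise; on synthetic data, the team can outperform both parties in isolation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Synthetic setting: the model reports a confidence per instance and is more
# often correct when confident; the human is 80% accurate regardless.
conf = rng.uniform(size=n)
model_correct = rng.uniform(size=n) < conf
human_correct = rng.uniform(size=n) < 0.8

# Simplest possible team policy: the model decides when confident, otherwise defers.
team_correct = np.where(conf >= 0.7, model_correct, human_correct)

print("model alone:", model_correct.mean())   # ~0.50
print("human alone:", human_correct.mean())   # ~0.80
print("team       :", team_correct.mean())    # ~0.81-0.82: complementarity
```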

Where are all the fixed points?

Benjamin Kaminski Fachrichtung Informatik - Saarbrücken
14 Jan 2026, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Fixed points are a recurring theme in computer science and related fields: shortest paths, game equilibria, semantics and verification, social choice theory, or dynamical systems are only a few of many instances. Various fixed point theorems - e.g. the famous Kleene fixed point theorem - state that fixed points emerge as limits of suitably seeded fixed point iterations. 
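
As a concrete textbook instance of such an iteration (the Kleene construction, not the talk's algebra of iterative constructions), graph reachability is the least fixed point of a monotone operator on sets and emerges as the limit of an iteration seeded at the empty set:

```python
# Textbook Kleene iteration on the finite powerset lattice: the set of
# vertices reachable from "a" is the least fixed point of the monotone map
#   F(S) = {"a"} | { w : v in S and (v, w) is an edge },
# obtained as the limit of the iteration F({}), F(F({})), ...
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")}

def F(S):
    return {"a"} | {w for (v, w) in edges if v in S}

S = set()           # seed: the bottom element of the lattice
while F(S) != S:    # iterate until the fixed point is reached
    S = F(S)
print(S)            # {'a', 'b', 'c'}: 'd' and 'e' are unreachable from 'a'
```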

I will showcase in this talk a purely algebraic way of reasoning about fixed points which we call the algebra of iterative constructions (AIC). It has led us to discover novel fixed point theorems as well as to prove well-established ones (e.g. Kleene, Tarski-Kantorovich, and k-induction) fully algebraically. As for the novel fixed point theorems, we will (in a suitable setting) obtain a method that maps any point to two canonical corresponding fixed points of a function by way of a limit of some fixed point iteration.

Our algebra is mechanized in Isabelle/HOL. With it, Isabelle's sledgehammer tool finds proofs of fixed point theorems fully automatically, whereas it cannot find such proofs when relying only on Isabelle’s standard libraries.

From the audience I would like to learn: Do you encounter fixed points or fixed point problems in your work? Do you perhaps work on algorithms for solving fixed point problems?

I look forward to talking to you!