Recent events – MPI SWS

Boosting — Empowering Citizens with Behavioral Science

Ralph Hertwig Max Planck Institute for Human Development
(hosted by Krishna Gummadi)

26 Nov 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 111

AICS Distinguished Speaker Colloquium

Behavioral public policy came to the fore with the introduction of nudging, which aims to steer behavior while maintaining freedom of choice. Responding to critiques of nudging (e.g., that it does not promote agency and relies on benevolent choice architects), other behavioral policy approaches focus on empowering citizens. Here we review boosting, a behavioral policy approach that aims to foster people's agency, self-control, and ability to make informed decisions. It is grounded in evidence from behavioral science showing that human decision making is not as notoriously flawed as the nudging approach assumes. ...

Behavioral public policy came to the fore with the introduction of nudging, which aims to steer behavior while maintaining freedom of choice. Responding to critiques of nudging (e.g., that it does not promote agency and relies on benevolent choice architects), other behavioral policy approaches focus on empowering citizens. Here we review boosting, a behavioral policy approach that aims to foster people's agency, self-control, and ability to make informed decisions. It is grounded in evidence from behavioral science showing that human decision making is not as notoriously flawed as the nudging approach assumes. We argue that addressing the challenges of our time—such as climate change, pandemics, and the threats to liberal democracies and human autonomy posed by digital technologies and choice architectures—calls for fostering capable and engaged citizens as a first line of response to complement slower, systemic approaches. Boosts can be delivered through different means, one being digital tools — the talk will give a few illustrative examples.

Curriculum Design for Reinforcement Learning Agents

Georgios Tzannetos Max Planck Institute for Software Systems

24 Nov 2025, 2:30 pm - 3:30 pm

Saarbrücken building E1 5, room 029

SWS Student Defense Talks - Thesis Proposal

Reinforcement learning (RL) enables agents to learn complex behaviours and excel in various domains such as robotics, gaming, and large language models (LLMs). Despite these successes, RL algorithms remain inefficient, rendering the training process challenging and limiting their broader application in real-world settings. Motivated by the importance of curricula in pedagogical domains, there is a growing interest in leveraging curriculum strategies when training agents in challenging environments. However, existing methods for automatic curriculum design typically require domain-specific hyperparameter tuning, ...

Reinforcement learning (RL) enables agents to learn complex behaviours and excel in various domains such as robotics, gaming, and large language models (LLMs). Despite these successes, RL algorithms remain inefficient, rendering the training process challenging and limiting their broader application in real-world settings. Motivated by the importance of curricula in pedagogical domains, there is a growing interest in leveraging curriculum strategies when training agents in challenging environments. However, existing methods for automatic curriculum design typically require domain-specific hyperparameter tuning, rely on expensive optimization procedures, or have limited theoretical underpinnings. To address these limitations, we design different curriculum strategies grounded in the pedagogical concept of Zone of Proximal Development. The theoretical and empirical analysis across multiple domains affirms the effectiveness of our strategies. In particular, our strategies are shown to improve the training efficiency of agents under different learning objectives, including uniform performance, target performance, and constrained performance. Finally, addressing a real-world LLM deployment scenario, we show how our curriculum strategy improves the inference-time efficiency of LLMs by compressing models’ chain-of-thought reasoning process.

How to Manage a Hotel Desk? Stable Perfect Hashing in the Incremental Setting

Guy Even MPI-INF - D1

05 Nov 2025, 12:15 pm - 1:15 pm

Saarbrücken building E1 5, room 002

Joint Lecture Series

Many modern applications—from large-scale databases to network routers and genome repositories—depend on maintaining large dynamic sets of elements. Efficient management of these sets requires data structures that can quickly support insertions and deletions, answer queries such as "Is this element in the set?" or "What is the value associated with this element?", and assign distinct short keys to elements as the set grows.

The field of data structures is concerned with specifying functionality, abstracting computational models, ...

Many modern applications—from large-scale databases to network routers and genome repositories—depend on maintaining large dynamic sets of elements. Efficient management of these sets requires data structures that can quickly support insertions and deletions, answer queries such as "Is this element in the set?" or "What is the value associated with this element?", and assign distinct short keys to elements as the set grows.

The field of data structures is concerned with specifying functionality, abstracting computational models, designing efficient representations, and analyzing the running time and memory requirements of algorithms over these representations. Classical data structures developed for representing sets include dictionaries, retrieval data structures, filters, and perfect hashing.

In this talk, I will explore these issues through the lens of perfect hashing, a method for assigning each element a distinct identifier, or hashcode, with no collisions. We will focus on how to simultaneously satisfy several competing design goals:

Small space: using near-optimal memory proportional to the set’s size.

Fast operations: supporting constant-time insertions, deletions, and queries.

Low redundancy: keeping the range of hashcodes close to the set’s size.

Stability: ensuring that each element’s hashcode remains unchanged while it stays in the set.

Extendability: adapting automatically to unknown or growing data sizes.

This talk is based on joint work with Ioana Bercea.

Accountable Multi-Agent Sequential Decision Making

Stelios Triantafyllou Max Planck Institute for Software Systems

30 Oct 2025, 11:00 am - 12:00 pm

Saarbrücken building E1 5, room 029

SWS Student Defense Talks - Thesis Proposal

As AI agents increasingly engage in high-stakes decision making, it is essential to assess their accountability in ways that are both fair and interpretable. This involves explaining expected or realized outcomes of multi-agent systems and attributing responsibility for those outcomes to the participating agents. Addressing these challenges is key to fostering societal trust and easing the adoption of AI decision makers. This thesis investigates accountability in multi-agent sequential decision making. We develop methods to attribute responsibility for observed outcomes and overall system performance, ...

As AI agents increasingly engage in high-stakes decision making, it is essential to assess their accountability in ways that are both fair and interpretable. This involves explaining expected or realized outcomes of multi-agent systems and attributing responsibility for those outcomes to the participating agents. Addressing these challenges is key to fostering societal trust and easing the adoption of AI decision makers. This thesis investigates accountability in multi-agent sequential decision making. We develop methods to attribute responsibility for observed outcomes and overall system performance, design efficient approximation algorithms for otherwise intractable attribution problems, and introduce causal tools to explain how agents’ decisions influence outcomes. Together, these contributions establish theoretical foundations and practical tools for accountable decision making, drawing on and integrating insights from causality, multi-agent reinforcement learning and game theory.

A Logical Foundation For Multi-Language Interoperability

Brigitte Pientka McGill University
(hosted by Derek Dreyer)

28 Oct 2025, 10:30 am - 11:30 am

Saarbrücken building E1 5, room 029

SWS Colloquium

Today’s software systems are complex and often made up of parts written in different programming languages with different computational and memory management strategies. This allows programmers to combine different languages and choose the most suitable one for a given problem. It also allows the gradual migration of existing projects from one language to another, or to reuse existing source code.

While this flexibility offers clear advantages, it also introduces significant challenges, as different programming languages may have fundamentally different implementations and may use different runtime environments, ...

Today’s software systems are complex and often made up of parts written in different programming languages with different computational and memory management strategies. This allows programmers to combine different languages and choose the most suitable one for a given problem. It also allows the gradual migration of existing projects from one language to another, or to reuse existing source code.

While this flexibility offers clear advantages, it also introduces significant challenges, as different programming languages may have fundamentally different implementations and may use different runtime environments, which are hard to combine. As a consequence composing parts written in different languages often results in complex interfaces between languages, insufficient flexibility, poor performance, and hard to diagnose errors. This lack of interoperability support can lead to subtle bugs and security vulnerabilities, especially in large or long-lived systems.

Existing foundations for interoperability often assume that we compile languages into a common low-level language. This is however not always realistic. We propose a logical foundation for interoperability where we retain the static and operational semantics of each part. It is grounded in adjoint logic -- a logic that unifies a wide collection of logics through the up-shift and down-shift modalities. We give a Curry-Howard interpretation of this logic where we use the down-shift modality to model foreign function calls, and the up-shift modality to model runtime code generation and execution. Our system is parametric to a user-defined collection of languages and their accessibility relation, which controls how languages interact with each other, allowing the interoperability of various formations of languages. We sketch the statics and an operational semantics together with properties such as accessibility safety, which ensures that languages respect their user-defined boundaries, alongside type safety. Finally, it time permits, we outline how we have used this foundation to reason about the interoperability between languages with fundamentally different runtime implementations such as the interoperability between a quantum and a purely functional language.

Counterfactual Reasoning and Uncertainty Quantification for AI-Assisted Decision Making

Nina Corvelo Benz Max Planck Institute for Software Systems

15 Oct 2025, 4:00 pm - 5:00 pm

Kaiserslautern building G26, room 111

SWS Student Defense Talks - Thesis Defense

Artificial intelligence (AI) systems are increasingly being used to support human experts in various domains such as healthcare, education, and the judicial system. The aim of these systems is complementarity—leveraging the strengths of each side, human and AI, to compensate for the weaknesses of the other. In most such systems, the human expert makes decisions based on a prediction by the AI model and their own judgment. However, models designed for automated decision making are typically trained in isolation and do not take into account the human decision maker when making predictions. ...

Artificial intelligence (AI) systems are increasingly being used to support human experts in various domains such as healthcare, education, and the judicial system. The aim of these systems is complementarity—leveraging the strengths of each side, human and AI, to compensate for the weaknesses of the other. In most such systems, the human expert makes decisions based on a prediction by the AI model and their own judgment. However, models designed for automated decision making are typically trained in isolation and do not take into account the human decision maker when making predictions. As a result, when these AI models are used in decision support systems, their predictions may not be helpful, undermining the human expert’s trust in the AI model and leading to no improvement in their decisions. To address this, this thesis focuses on the design of AI-based decision support systems that leverage the interaction with the expert through counterfactual reasoning and uncertainty quantification. It proposes decision support systems for three distinct decision-making contexts, where each one is based on a novel methodological approach and is evaluated with experiments using real-world data or a human subject study.

Can machine learning revolutionize biomarker discovery?

Karsten Borgwardt Max-Planck-Institut für Biochemie
(hosted by Manuel Gomez Rodriguez)

15 Oct 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 111

AICS Distinguished Speaker Colloquium

Machine learning has transformed many areas of science and technology, including the life sciences, most prominently through its breakthrough impact on protein structure prediction, recognized by the 2024 Nobel Prize in Chemistry. An open question, however, is whether machine learning can have a similarly profound impact on biomarker discovery, that is, the identification of biological properties that predict system functions or phenotypes. Biomarker discovery is a key topic for advancing biology and medicine. In this talk, ...

Machine learning has transformed many areas of science and technology, including the life sciences, most prominently through its breakthrough impact on protein structure prediction, recognized by the 2024 Nobel Prize in Chemistry. An open question, however, is whether machine learning can have a similarly profound impact on biomarker discovery, that is, the identification of biological properties that predict system functions or phenotypes. Biomarker discovery is a key topic for advancing biology and medicine. In this talk, I will present our efforts to harness machine learning for biomarker discovery, summarize our algorithmic contributions, and discuss the opportunities and challenges in this field.

From Exploits to Defenses: Building Trustworthy Digital Systems

Thorsten Holz Max Planck Institute for Security and Privacy
(hosted by Krishna Gummadi)

17 Sep 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 111

AICS Distinguished Speaker Colloquium

Building trustworthy software systems has become increasingly challenging as complexity grows across the hardware-software stack. Adversaries exploit sophisticated techniques such as return-oriented programming and timing side channels to bypass traditional defenses and compromise critical components. This talk examines these classes of low-level attacks and presents defenses we have developed, including control-flow integrity mechanisms and memory tagging. I will further discuss how automated approaches such as fuzzing can help us to systematically expose latent vulnerabilities and strengthen the design of security-critical systems, ...

Building trustworthy software systems has become increasingly challenging as complexity grows across the hardware-software stack. Adversaries exploit sophisticated techniques such as return-oriented programming and timing side channels to bypass traditional defenses and compromise critical components. This talk examines these classes of low-level attacks and presents defenses we have developed, including control-flow integrity mechanisms and memory tagging. I will further discuss how automated approaches such as fuzzing can help us to systematically expose latent vulnerabilities and strengthen the design of security-critical systems, aiming for resilience against both current and emerging threats. I will conclude with an overview of future challenges.

The fine-grained complexity of NFA intersection emptiness

Neha Rino University of Warwick

12 Sep 2025, 10:00 am - 11:00 am

Kaiserslautern building G26, room 111

SWS Colloquium

Given some integer k, intersection emptiness of k Nondeterministic Finite Automata (NFA k-IE) is a fundamental problem in automata theory, with applications across Computer science from Arithmetic theories and model checking to graph database queries. In this talk, I will discuss some results regarding the fine-grained complexity of NFA k-IE. Informally, what I mean by fine-grained complexity is that we want to know (upto logarithmic factors) the runtime complexity of our algorithms, and argue why they cannot be improved by relating it to the state of the art in solving long-standing hard problems like SAT. ...

Given some integer k, intersection emptiness of k Nondeterministic Finite Automata (NFA k-IE) is a fundamental problem in automata theory, with applications across Computer science from Arithmetic theories and model checking to graph database queries. In this talk, I will discuss some results regarding the fine-grained complexity of NFA k-IE. Informally, what I mean by fine-grained complexity is that we want to know (upto logarithmic factors) the runtime complexity of our algorithms, and argue why they cannot be improved by relating it to the state of the art in solving long-standing hard problems like SAT.

I will demonstrate (what we believe to be) a new algorithm for solving NFA k-IE. If all the NFAs have n states and m transitions, our algorithm runs in time O(n^{k-1}m), compared to the O(m^k) runtime of the classic Cartesian product approach. I will also present a matching lower bound subject to the Combinatorial k-Clique hypothesis, and a barrier to tight SETH-based lower bounds. This is joint work with Dmitry Chistikov, at the University of Warwick, UK.

Permissive Assumptions in Logical Controller Synthesis for Cyber-Physical Systems

Satya Prakash Nayak Max Planck Institute for Software Systems

09 Sep 2025, 1:00 pm - 2:00 pm

Kaiserslautern building G26, room 111

SWS Student Defense Talks - Thesis Proposal

The automatic construction of correct-by-design systems has emerged as a central challenge in cyber-physical systems (CPS). As CPS combine discrete logical decision-making with continuous physical processes, their correctness requires reasoning both about logical specifications and real-time physical behavior. A common approach is to abstract the physical dynamics into a discrete plant model and synthesize a logical controller for this plant that ensures the specification. Such approaches often rely on a set of assumptions about how the system interacts with its environment. ...

The automatic construction of correct-by-design systems has emerged as a central challenge in cyber-physical systems (CPS). As CPS combine discrete logical decision-making with continuous physical processes, their correctness requires reasoning both about logical specifications and real-time physical behavior. A common approach is to abstract the physical dynamics into a discrete plant model and synthesize a logical controller for this plant that ensures the specification. Such approaches often rely on a set of assumptions about how the system interacts with its environment. However, these assumptions are typically overly restrictive and fail to capture the full range of behaviors that the environment can exhibit, leading to conservative or inflexible controllers. This thesis addresses these limitations by proposing frameworks for synthesizing controllers under more permissive assumptions.

The first part of the thesis considers permissiveness in the interactions between multiple discrete logical components, where assumptions restrict the behavior of other components. We propose new methods to compute permissive assumptions that capture all cooperative behavior of other components. Building on this, we propose a negotiation-based approach to compute assume-guarantee contracts between components, allowing components to retain flexibility while ensuring correctness in a distributed setting.

The second part of the thesis addresses permissiveness in the interaction between high-level logic and low-level physical dynamics, where assumptions on the plant model restrict the behavior of the physical environment. We develop a new class of assumptions that capture richer behaviors of low-level controllers, enabling logical controllers to adapt seamlessly to changes by the external environment. To ensure scalability for large plant models, we propose a universal controller framework where controller decisions are conditioned on future branching-time properties of the plant model, learned from a small set of representative plant models. Finally, we introduce a robust semantics for branching-time temporal logics that allows reasoning under uncertainty or partial violations of assumptions without increasing computational complexity.

Algorithmic Problems for Linear Recurrence Sequences

Joris Nieuwveld Max Planck Institute for Software Systems

05 Sep 2025, 3:00 pm - 4:00 pm

Saarbrücken building E1 5, room 029

SWS Student Defense Talks - Thesis Defense

Linear recurrence sequences (LRS) are among the most fundamental and easily definable classes of number sequences, encompassing many classical sequences such as polynomials, powers of two, and the Fibonacci numbers. They also describe the dynamics of iterated linear maps and arise naturally in numerous contexts within computer science, mathematics, and other quantitive sciences. However, despite their simplicity, many easy-to-state decision problems for LRS have stubbornly remained open for decades despite considerable and sustained attention. Chief among these are the Skolem problem and the Positivity problem, ...

Linear recurrence sequences (LRS) are among the most fundamental and easily definable classes of number sequences, encompassing many classical sequences such as polynomials, powers of two, and the Fibonacci numbers. They also describe the dynamics of iterated linear maps and arise naturally in numerous contexts within computer science, mathematics, and other quantitive sciences. However, despite their simplicity, many easy-to-state decision problems for LRS have stubbornly remained open for decades despite considerable and sustained attention. Chief among these are the Skolem problem and the Positivity problem, which ask to determine, for a given LRS, whether it contains a zero term and whether it contains only positive terms, respectively. For both problems, decidability is currently open, i.e., whether they are algorithmically solvable.

In this thesis, we present the following results. For the Skolem problem, we introduce an algorithm for simple LRS whose correctness is unconditional but whose termination relies on two classical, widely-believed number-theoretic conjectures. This algorithm is implementable in practice, and we report on experimental results. For the Positivity problem, we introduce the notion of reversible LRS, which enables us to carve out a large decidable class of sequences. We also examine various expansions of classical logics by predicates obtained from LRS. In particular, we study expansions of monadic second-order logic of the natural numbers with order and present major advances over the seminal results of Büchi, Elgot, and Rabin from the early 1960s. Finally, we investigate fragments of Presburger arithmetic, where, among others, we establish the decidability of the existential fragment of Presburger arithmetic expanded with powers of 2 and 3.

Supporting Human-Human Communication: Towards a Proactive AI Paradigm

Christian Danescu-Niculescu-Mizil Cornell University
(hosted by Krishna Gummadi)

05 Sep 2025, 1:30 pm - 2:30 pm

Kaiserslautern building G26, room 607

AICS Distinguished Speaker Colloquium

Recent years have seen a gold rush towards replacing people with AI agents in communication: they can serve as your therapist, your tutor, your financial advisor, your interviewer. In this talk I will propose a contrasting vision: one where AI is used for supporting humans in their communication while preserving their agency. Achieving this vision requires moving beyond the current transactional paradigm embodied by current generative AI systems, which are designed to fulfill the immediate goals of a single person, ...

Recent years have seen a gold rush towards replacing people with AI agents in communication: they can serve as your therapist, your tutor, your financial advisor, your interviewer. In this talk I will propose a contrasting vision: one where AI is used for supporting humans in their communication while preserving their agency. Achieving this vision requires moving beyond the current transactional paradigm embodied by current generative AI systems, which are designed to fulfill the immediate goals of a single person, such as answering a question, solving a math problem, booking a flight, or (repeatedly) replying in character. To meaningfully support human-human communication without disrupting or supplanting it, an AI system must instead follow a proactive paradigm: it needs to decide when to intervene to offer support as the interaction unfolds, rather than wait to explicitly be prompted as AI agents and chatbots do today. In this talk I will present initial progress on AI technologies that enable such a proactive mode of operation, and demonstrate communication support tools that embody it.

Strategic and counterfactual reasoning in AI-assisted decision making

Efstratios Tsirtsis Max Planck Institute for Software Systems

19 Aug 2025, 2:30 pm - 3:30 pm

Kaiserslautern building G26, room 111

SWS Student Defense Talks - Thesis Defense

From finance and healthcare to criminal justice and transportation, many domains that involve high-stakes decisions, traditionally made by humans, are increasingly integrating artificial intelligence (AI) systems into their decision making pipelines. While recent advances in machine learning and optimization have given rise to AI systems with unprecedented capabilities, fully automating such decisions is often undesirable. Instead, a promising direction lies in AI-assisted decision making, where AI informs or complements human decisions without completely removing human oversight. ...

From finance and healthcare to criminal justice and transportation, many domains that involve high-stakes decisions, traditionally made by humans, are increasingly integrating artificial intelligence (AI) systems into their decision making pipelines. While recent advances in machine learning and optimization have given rise to AI systems with unprecedented capabilities, fully automating such decisions is often undesirable. Instead, a promising direction lies in AI-assisted decision making, where AI informs or complements human decisions without completely removing human oversight. In this talk, I will present my PhD work on AI-assisted decision making in settings where humans rely on two core cognitive capabilities: strategic reasoning and counterfactual reasoning. First, I will introduce game-theoretic methods for supporting policy design in strategic environments, enabling a decision maker to allocate resources (e.g., loans) to individuals who adapt their behavior in response to transparency regarding the decision policy. Next, I will present methods to enhance a decision maker’s counterfactual reasoning process— identifying key past decisions (e.g., in clinical treatments) which, if changed, could have improved outcomes and, hence, serve as valuable learning signals. Finally, I will discuss a computational model of how people attribute responsibility between humans and AI systems in collaborative settings, such as semi-autonomous driving, evaluated through a human subject study. I will conclude with key takeaways and future directions for designing AI systems that effectively support and interact with humans.

Unintended Consequences of Recommender Systems

Vicenç Gómez Universitat Pompeu Fabra
(hosted by Manuel Gomez Rodriguez)

11 Aug 2025, 10:30 am - 11:30 am

Kaiserslautern building G26, room 607

AICS Distinguished Speaker Colloquium

From LLMs Algorithmic ranking systems shape online interactions and are linked to engagement misinformation and polarization. This talk explores the unintended consequences of popularity-based rankings drawing on two studies that combine observational data mathematical modeling and experimental validation. The first examines how engagement-driven algorithms may fuel polarization and ideological extremism. The second reveals a "few-get-richer" effect where ranking dynamics amplify minority signals. Together these insights highlight the societal impact of human-AI feedback loops.

•Ranking for engagement: How social media algorithms fuel misinformation and polarization. ...

From LLMs Algorithmic ranking systems shape online interactions and are linked to engagement misinformation and polarization. This talk explores the unintended consequences of popularity-based rankings drawing on two studies that combine observational data mathematical modeling and experimental validation. The first examines how engagement-driven algorithms may fuel polarization and ideological extremism. The second reveals a "few-get-richer" effect where ranking dynamics amplify minority signals. Together these insights highlight the societal impact of human-AI feedback loops.

•Ranking for engagement: How social media algorithms fuel misinformation and polarization. F Germano V Gómez F Sobbrio. CESifo Working Paper •The few-get-richer: a surprising consequence of popularity-based rankings. F Germano V Gómez GL MensThe Web Conference WWW' 19

As we shall see, LLMs like GPT4 can model group behavior — or stereotypical patterns — fairly well, but their performance drops when it comes to smaller groups, below a certain threshold. This limitation has interesting implications not just for personalization systems, but also for questions in social science, including how we define culture, interpret collective behavior, and think about individual agency.

High-Assurance Verification of Low-Level Rust Programs

Lennard Gäher Max Planck Institute for Software Systems

05 Aug 2025, 10:00 am - 11:00 am

Saarbrücken building E1 5, room 029

SWS Student Defense Talks - Thesis Proposal

Rust is a modern systems programming language whose ownership-based type system statically guarantees memory safety, making it particularly well-suited to the domain of safety-critical systems. In recent years, a multitude of automated deductive verification tools have emerged for establishing functional correctness of Rust code. However, before my thesis work, none of the tools produced foundational proofs (machine-checkable with a small trusted computing base), and all of them were restricted to the safe fragment of Rust. This is a problem because the vast majority of Rust programs make use of unsafe code at critical points: either indirectly through libraries, ...

Rust is a modern systems programming language whose ownership-based type system statically guarantees memory safety, making it particularly well-suited to the domain of safety-critical systems. In recent years, a multitude of automated deductive verification tools have emerged for establishing functional correctness of Rust code. However, before my thesis work, none of the tools produced foundational proofs (machine-checkable with a small trusted computing base), and all of them were restricted to the safe fragment of Rust. This is a problem because the vast majority of Rust programs make use of unsafe code at critical points: either indirectly through libraries, which are often implemented with unsafe code, or directly, which is particularly common for low-level systems code.

With my thesis, I will present work on RefinedRust, a foundational verification tool, aiming to semi-automatically establish functional correctness of both safe and unsafe Rust code. RefinedRust builds a refinement type system for Rust that extends Rust's type system for safe Rust to functional correctness reasoning and unsafe Rust, and, inspired by previous work on RefinedC, automates proofs in this type system inside the Rocq Prover. RefinedRust's type system is semantically proven sound using the Iris separation logic framework, providing high assurances about its correctness. In the process, RefinedRust significantly advances the state of the art of semantic soundness proofs for Rust, scaling previous work on RustBelt to a much more practical and realistic model.

One particularly interesting application for RefinedRust is low-level systems code, which is inherently unsafe and deserving of high-assurance foundational verification. In my thesis, I will present work on verifying interesting parts of the ACE security monitor, which is part of the ACE trusted execution environment (TEE) developed by IBM Research and thus a low-level security-critical component. For the final part of my thesis, I propose to work on one of two projects that will advance our capabilities in verifying systems code like ACE. I will either show how to extend RefinedRust to prove non-interference (a security property) of systems code, or show how to link up RefinedRust with hardware models to prove a fundamental isolation property that ACE should guarantee.

Facilitating Secure Data Analytics

Roberta De Viti Max Planck Institute for Software Systems

25 Jul 2025, 5:00 pm - 6:00 pm

Saarbrücken building E1 5, room 029

SWS Student Defense Talks - Thesis Proposal

There is growing awareness that the statistical analysis of personal data, such as individual mobility, financial, and health data, could be of immense benefit to society. However, liberal societies have refrained from such analysis, arguably due to the lack of trusted analytics platforms that scale to billions of records and reliably prevent the leakage and misuse of personal data. The first part of this proposal presents CoVault, an analytics platform that leverages secure multi-party computation (MPC) and trusted execution environments (TEEs) to perform analytics securely at scale. ...

There is growing awareness that the statistical analysis of personal data, such as individual mobility, financial, and health data, could be of immense benefit to society. However, liberal societies have refrained from such analysis, arguably due to the lack of trusted analytics platforms that scale to billions of records and reliably prevent the leakage and misuse of personal data. The first part of this proposal presents CoVault, an analytics platform that leverages secure multi-party computation (MPC) and trusted execution environments (TEEs) to perform analytics securely at scale. CoVault co-locates MPC's independent parties in the same datacenter, providing high bisection bandwidth without reducing security, allowing analytics to scale horizontally to the datacenter’s available resources. CoVault's empirical evaluation shows that nation-scale, secure analytics is feasible with modest resources. For example, country-scale epidemic analytics on a continuous basis requires only one pair of CPU cores for every 30,000 people. The second part of the proposal discusses how to add support for semantic queries to CoVault by building on secure similarity search on high-dimensional vector embeddings.

What is Emerging in AI Systems?

Daria Kim MPI for Competition and Innovation
(hosted by Krishna Gummadi)

16 Jul 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 111

AICS Distinguished Speaker Colloquium

The concept of emergence is frequently invoked in discussions on artificial intelligence (AI) to describe seemingly novel, unpredictable, and autonomous system behaviours. These narratives often anthropomorphise AI systems, attributing to them intelligence, agency, and even creativity – fuelling public fascination and intensifying normative uncertainty. However, a robust normative and legal analysis requires a precise conceptualisation of emergence that distinguishes between technical explanations, subjective projections, and philosophical perspectives. Do systems based on artificial neural networks (ANNs) acquire emergent properties? ...

The concept of emergence is frequently invoked in discussions on artificial intelligence (AI) to describe seemingly novel, unpredictable, and autonomous system behaviours. These narratives often anthropomorphise AI systems, attributing to them intelligence, agency, and even creativity – fuelling public fascination and intensifying normative uncertainty. However, a robust normative and legal analysis requires a precise conceptualisation of emergence that distinguishes between technical explanations, subjective projections, and philosophical perspectives. Do systems based on artificial neural networks (ANNs) acquire emergent properties? Drawing on complexity theory, I will argue that the predictive capacities of ANNs correspond to weak emergence, which is explained by interactions of the structural components within ANNs and the conditions of model deployment, rather than indicating ‘superimposed’, ‘autonomous’ agency. This understanding challenges agentic interpretations of AI functionalities, highlights the risks of over-attributing human-like traits to AI, and redirects attention to human causality and responsibility in AI-mediated outcomes. The discussion will further explore the implications of a reductionist understanding of AI emergence for the design of AI governance frameworks, and emphasise the role of human agency in AI-driven processes. In particular, this view can inform the ongoing debates on AI risk regulation, civil liability, the ethical boundaries of AI-driven decision-making, and reward mechanisms such as intellectual property rights. Overall, the analysis aspires to contribute to a conceptual shift – from the mythologisation of AI capabilities toward a technically grounded normative analysis.

You’re Not the Average: LLMs, Stereotypes, and the Edges of Personalization

Monojit Choudhury Mohamed bin Zayed University of Artificial Intelligence
(hosted by Krishna Gummadi)

11 Jul 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 607

AICS Distinguished Speaker Colloquium

From LLMs can imitate humans and therefore, pass the Turing test – this aint a news today. However, can they model not just a human, but you as an individual? This talk dives into the core of personalization: how well can large language models predict individual user behavior, based on their history and context? Since LLMs are trained to generate sequential data — and that data comes from humans — there's reason to believe they can mirror our choices, ...

From LLMs can imitate humans and therefore, pass the Turing test – this aint a news today. However, can they model not just a human, but you as an individual? This talk dives into the core of personalization: how well can large language models predict individual user behavior, based on their history and context? Since LLMs are trained to generate sequential data — and that data comes from humans — there's reason to believe they can mirror our choices, preferences, even our quirks. We explore this in two lights:

•Positive prediction: what users are likely to do, prefer, or know — the classic personalization setup. •Negative prediction: what users are unlikely to do or understand — the shadow side of modeling.

As we shall see, LLMs like GPT4 can model group behavior — or stereotypical patterns — fairly well, but their performance drops when it comes to smaller groups, below a certain threshold. This limitation has interesting implications not just for personalization systems, but also for questions in social science, including how we define culture, interpret collective behavior, and think about individual agency.

Internet Measurements for Security

Tiago Heinrich MPI-INF - INET

02 Jul 2025, 12:15 pm - 1:15 pm

Saarbrücken building E1 5, room 002

Joint Lecture Series

Today’s Internet is responsible for connecting billions of end-points that interact by using a variety of protocols. Malicious actors can carry out attacks by exploiting users, applications, or the network. Typical attacks are two steps process: The identification of potential victims by periodically scanning the Internet for a specific service; and the attack itself where the exploitation of the target service is performed.

To protect systems on the Internet, we need to understand how vulnerable systems are being targeted, ...

Today’s Internet is responsible for connecting billions of end-points that interact by using a variety of protocols. Malicious actors can carry out attacks by exploiting users, applications, or the network. Typical attacks are two steps process: The identification of potential victims by periodically scanning the Internet for a specific service; and the attack itself where the exploitation of the target service is performed.

To protect systems on the Internet, we need to understand how vulnerable systems are being targeted, how malicious actors exploit end-points, and how attackers' behavior evolves over time. Network measurements offer valuable resources to study such behaviors. This talk highlights how Internet measurements can be used to improve security, and which factors have to be considered in this research direction.

Safety Alignment of LLMs: From code-based attacks to culture-specific safety

Animesh Mukherjee IIT Kharagpur
(hosted by Krishna Gummadi)

20 Jun 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 607

AICS Distinguished Speaker Colloquium

In this talk, I will present a series of collaborative efforts—developed with my students and colleagues at Microsoft India—on designing robust safety interventions in large language models (LLMs). Our first study, TechHazardQA (to appear at ICWSM 2025), shows how instruction-form prompts, such as code or pseudocode, can bypass content filters that typically block harmful text queries. We propose a mitigation strategy called the SafeInfer (AAAI 2025), leveraging function vectors and controlled text generation via model arithmetic to steer models away from harmful outputs. ...

In this talk, I will present a series of collaborative efforts—developed with my students and colleagues at Microsoft India—on designing robust safety interventions in large language models (LLMs). Our first study, TechHazardQA (to appear at ICWSM 2025), shows how instruction-form prompts, such as code or pseudocode, can bypass content filters that typically block harmful text queries. We propose a mitigation strategy called the SafeInfer (AAAI 2025), leveraging function vectors and controlled text generation via model arithmetic to steer models away from harmful outputs. We extend this safety strategy to multilingual settings by introducing language-specific safety vectors. Applied across several languages, this approach yields a notable reduction in attack success rates, from over 60% to below 10%. While text is one of the most dominant forms of communication on social media platforms, the internet meme culture is rapidly becoming viral. We show how vision language models (VLMs) that are routinely used as backbones for multimodal content moderation by social media platforms are often ineffective in tracking toxic memes. We tease apart the reasons for this failure using a rigorous interpretability scheme (to appear at ICWSM 2025). Presently, we are working on building intervention techniques for both toxic memes and audio podcasts. Next, I will discuss cultural sensitivities in generative models, demonstrating that the same prompt can elicit significantly different degrees of harm when framed in diverse cultural contexts. We propose a lightweight, preference-informed realignment technique that reduces culturally conditioned harmful responses across 11 regions (NAACL 2025).

Natural Language Processing for the Legal Domain: Challenges and Recent Developments

Saptarshi Ghosh Indian Institute of Technology, Kharagpur
(hosted by Krishna Gummadi)

18 Jun 2025, 12:15 pm - 1:15 pm

Kaiserslautern building G26, room 111

AICS Distinguished Speaker Colloquium

The field of Law has become an important application domain of Natural Language Processing (NLP) due to the recent proliferation of publicly available legal data, and the socio-economic benefits of mining legal insights. Additionally, the introduction of Large Language Models (LLMs) has brought forth many applications, questions, and concerns in the legal domain. In this talk, I will discuss some of the challenges in processing of legal text, and our works in some practical research problems, ...

The field of Law has become an important application domain of Natural Language Processing (NLP) due to the recent proliferation of publicly available legal data, and the socio-economic benefits of mining legal insights. Additionally, the introduction of Large Language Models (LLMs) has brought forth many applications, questions, and concerns in the legal domain. In this talk, I will discuss some of the challenges in processing of legal text, and our works in some practical research problems, including summarization of long legal documents, and identifying relevant statutes from fact descriptions. I will also discuss our efforts towards making Law-AI more accessible in the Indian society, through developing the first datasets for several problems (available at https://github.com/Law-AI/) and the first pre-trained language model for the Indian legal domain(https://huggingface.co/law-ai/InLegalBERT) that is being widely used by both the academia and the industry, and has more than 1.5 million downloads till date.

Specifying and Fuzzing Machine-Learning Models

Hasan F. Eniser Max Planck Institute for Software Systems

13 Jun 2025, 2:00 pm - 3:00 pm

Saarbrücken building G26, room 111

SWS Student Defense Talks - Thesis Defense

Machine-Learning (ML) models are increasingly integrated into safety-critical systems, from self-driving cars to aviation, making their dependability assessment crucial. This thesis introduces novel approaches to specify and test the functional correctness of ML artifacts by adapting established software testing concepts. We first address the challenge of testing action policies in sequential decision-making problems by developing π-fuzz, a framework that uses metamorphic relations between states to identify undesirable yet avoidable outcomes. We then formalize these relations as k-safety hyperproperties and introduce NOMOS, ...

Machine-Learning (ML) models are increasingly integrated into safety-critical systems, from self-driving cars to aviation, making their dependability assessment crucial. This thesis introduces novel approaches to specify and test the functional correctness of ML artifacts by adapting established software testing concepts. We first address the challenge of testing action policies in sequential decision-making problems by developing π-fuzz, a framework that uses metamorphic relations between states to identify undesirable yet avoidable outcomes. We then formalize these relations as k-safety hyperproperties and introduce NOMOS, a domain-agnostic specification language for expressing functional correctness properties of ML models. NOMOS comes with an automated testing framework that effectively identifies bugs across diverse domains including image classification, sentiment analysis, and speech recognition. We further extend NOMOS to evaluate code translation models.

By providing these specification languages and testing frameworks, this thesis contributes essential tools for validating the reliability and safety of ML models in our increasingly machine-learning-dependent world.

System and Network Operations Through a Sociotechnical Lens: The human aspects of running digital systems

Mannat Kaur MPI-INF - INET

04 Jun 2025, 12:15 pm - 1:15 pm

Saarbrücken building E1 5, room 002

Joint Lecture Series

Digital infrastructure is critical to modern society, and a great deal of work goes into ensuring the smooth and continuous operation of systems and networks. This essential labor is carried out by system and network operators. Yet, their work often remains invisible and undervalued—especially when everything appears to function as expected. System operators engage not only in a wide range of technical tasks but also in social and organizational work, such as coordinating with colleagues and helping system users. ...

Digital infrastructure is critical to modern society, and a great deal of work goes into ensuring the smooth and continuous operation of systems and networks. This essential labor is carried out by system and network operators. Yet, their work often remains invisible and undervalued—especially when everything appears to function as expected. System operators engage not only in a wide range of technical tasks but also in social and organizational work, such as coordinating with colleagues and helping system users. Their everyday practices directly shape the security posture of their organizations. However, when failures occur, system administrators are frequently blamed for misconfiguration or other types of "human error". Decades of research in human factors demonstrate that focusing on human error alone is insufficient to improve operational security. Instead, it diverts attention from the sociotechnical complexities of these environments and from supporting people in doing their work effectively. This talk will highlight the human dimensions of system operations by drawing on historical perspectives and emphasizing the sociotechnical factors essential to sustaining and securing digital infrastructure.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Need help?