Illuminating Generative AI: Mapping Knowledge in Large Language Models
Abhilasha Ravichander
Max Planck Institute for Software Systems
03 Dec 2025, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Millions of everyday users are interacting with technologies built
with generative AI, such as voice assistants, search engines, and chatbots.
While these AI-based systems are being increasingly integrated into modern
life, they can also magnify risks, inequities, and dissatisfaction when
providers deploy unreliable systems. A primary obstacle to building more reliable
systems is the opacity of the underlying large language models: we lack a
systematic understanding of how models work, where critical vulnerabilities may
arise, why they occur, and how models must be redesigned to address
them. In this talk, I will first describe my work in investigating large
language models to illuminate when models acquire knowledge and capabilities.
Then, I will describe my work on building methods that enable data transparency
for large language models, allowing practitioners to make sense of the
information available to models. Finally, I will describe work on understanding
why large language models produce incorrect knowledge, and implications for
building the next generation of responsible AI systems.