Events 2022

Formal Controller Synthesis for Dynamical Systems: Decidability and Scalability

Mahmoud Salamati Max Planck Institute for Software Systems
14 Dec 2022, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 113
SWS Student Defense Talks - Thesis Proposal
In cyber-physical systems research an important challenge is the synthesis of reliable controllers with respect to a general temporal specification. The synthesized controller must provide formal guarantees against different sources of uncertainty, such as measurement noise and mismatch between the dynamics of the physical system and its model, etc. By synthesizing correct-by-construction controllers for complex dynamical systems, we can enable a large number of exciting applications in various domains, including autonomous vehicle industry, energy systems and healthcare. ...
In cyber-physical systems research an important challenge is the synthesis of reliable controllers with respect to a general temporal specification. The synthesized controller must provide formal guarantees against different sources of uncertainty, such as measurement noise and mismatch between the dynamics of the physical system and its model, etc. By synthesizing correct-by-construction controllers for complex dynamical systems, we can enable a large number of exciting applications in various domains, including autonomous vehicle industry, energy systems and healthcare.

In this thesis, we plan to study controller synthesis for several different classes of dynamical systems. If it is not possible to give a complete and sound synthesis algorithm, we will try to propose a sound but scalable technique. Also, for some special classes of dynamics, we provide sound and complete decision procedures.

First, we consider continuous dynamical systems with bounded disturbance. The underlying dynamics for every continuous dynamical system can—in the bounded adversarial setting—be modeled by a (non-linear) differential inclusion system, provided that a bound over the range of model uncertainties is known. A common approach to tackle the continuous nature of the state space is to use abstraction-based controller design (ABCD) schemes. The controller designed by the ABCD scheme is described as being formal due to the guarantees on satisfaction of the specification by the original system in closed loop with the designed controller. In the first part of the thesis, we present methods to improve applicability of ABCD, by proposing (1) a data-driven scheme for relaxing the requirement of having analytical model, (2) a neural abstraction method to lower memory requirements of both synthesis and deployment, and (3) a scalable method for solving reach-avoid problems for multi-agent systems.

Second, we study continuous-time Markov decision processes (CTMDPs), which are a widely used model for continuous-time stochastic systems. A fundamental problem in the analysis of CTMDPs is time-bounded reachability, which asks whether we can synthesize a control policy with which the probability of reaching a target state within a finite horizon is greater than a given threshold. Time-bounded reachability is the core technical problem for model checking stochastic temporal logics such as Continuous Stochastic Logic, and having efficient implementations of time-bounded reachability is crucial to scaling formal analysis of CTMDPs. Existing work either considers time-abstract policies or focuses on numerical approximation. Despite the importance of this problem, its decidability is yet open. Moreover, the existing discretization-based approximation methods are not scalable for CTMDPs with a large number of states. In the second part of the thesis, we study the time-bounded reachability problem for CTMDPs, and propose (1) a conditional decidability result, and (2) a systematic method for improving scalability of numerical approximation methods. Finally, we study the class of linear dynamical systems, which are fundamental models in many different domains of science and engineering. In the third part of this thesis proposal, we consider several reachability-related problems for linear dynamical systems, and propose (1) a hardness result for point-to-point reachability of linear dynamical systems with hyper-rectangular control sets, (2) decidability of pseudo-reachability for hyperplane target sets, and (3) decidability of pseudo-reachability for semi-algebraic target sets
Read more

Let's play! - Solving controller synthesis games for cyber-physical system design

Anne Schmuck Max Planck Institute for Software Systems
07 Dec 2022, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Cyber-Physical Systems (CPS) are technical systems where a large software stack orchestrates the interaction of physical and digital components. Such systems are omnipresent in our daily life and their correct behavior is crucial. However, developing safe, reliable and performant CPS is challenging. A promising research direction towards this goal is the combination of formal methods from computer science and controller synthesis techniques from automation.

In my talk I will discuss how infinite two-player games over finite graphs, ...
Cyber-Physical Systems (CPS) are technical systems where a large software stack orchestrates the interaction of physical and digital components. Such systems are omnipresent in our daily life and their correct behavior is crucial. However, developing safe, reliable and performant CPS is challenging. A promising research direction towards this goal is the combination of formal methods from computer science and controller synthesis techniques from automation.

In my talk I will discuss how infinite two-player games over finite graphs, originating from the formal methods community, can be utilized and enhanced for higher layer control of CPS. In particular, I will discuss how the use of environment assumptions - used to model particularities of the system under control within these games - has to be rethought in order to effectively solve controller synthesis tasks for CPS.
Read more

Enforcing Stack Safety on a Capability Machine

Aïna Linn Georges Aarhus University
24 Nov 2022, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 005
simultaneous videocast to Kaiserslautern building G26, room 207 / Meeting ID: -
SWS Colloquium
Memory safety is a major source of vulnerabilities in computer systems. Capability machines are a type of CPUs that support fine-grained privilege separation using capabilities, machine words that include forms of authority. Over the last decade, CHERI, a family of capability machines, has matured into an extensive design featuring, among other, a full UNIX-style operating system, CheriBSD. Building on ideas from CHERI, capability machines are even becoming a reality in industry; the Arm Morello program is a research program led by Arm to create a prototype system on chip with capabilities. ...
Memory safety is a major source of vulnerabilities in computer systems. Capability machines are a type of CPUs that support fine-grained privilege separation using capabilities, machine words that include forms of authority. Over the last decade, CHERI, a family of capability machines, has matured into an extensive design featuring, among other, a full UNIX-style operating system, CheriBSD. Building on ideas from CHERI, capability machines are even becoming a reality in industry; the Arm Morello program is a research program led by Arm to create a prototype system on chip with capabilities. One of the promises of capability machines is that they can enforce security properties that we expect from high-level languages, in particular stack safety, even when machine code is linked with other untrusted and possibly adversarial machine code. In this talk, I will discuss what it takes to realise that promise in practice. Since stack safety properties can be quite subtle, it is crucial to formally reason about the enforcement mechanisms enabled by capabilities. This is a complex task that involves reasoning about the interaction between known code, and unknown untrusted code. We use Iris to formally reason about the deep semantic properties of capability machines.
Read more

Designing AI Systems with Steerable Long-Term Dynamics

Thorsten Joachims Cornell University
16 Nov 2022, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 002 / Meeting ID: 99457028566
SWS Distinguished Lecture Series
The feedback that users provide through their choices (e.g. clicks, purchases) is one of the most common types of data readily available for training autonomous systems, and it is widely used in online platforms. However, naively training systems based on choice data may only improve short-term engagement, but not the long-term sustainability of the platform. In this talk, I will discuss some of the pitfalls of engagement-maximization, and explore methods that allow us to supplement engagement with additional criteria that are not limited to individual action-response metrics. ...
The feedback that users provide through their choices (e.g. clicks, purchases) is one of the most common types of data readily available for training autonomous systems, and it is widely used in online platforms. However, naively training systems based on choice data may only improve short-term engagement, but not the long-term sustainability of the platform. In this talk, I will discuss some of the pitfalls of engagement-maximization, and explore methods that allow us to supplement engagement with additional criteria that are not limited to individual action-response metrics. The goal is to give platform operators a new set of macroscopic interventions for steering the dynamics of the platform, providing a new level of abstraction that goes beyond the engagement with individual recommendations or rankings.

Please contact the Office team for Zoom link information.
Read more

AI-assisted Programming: Applications, User experiences, and Neuro-symbolic techniques

Sumit Gulwani Microsoft Research
07 Nov 2022, 10:30 am - 7:30 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 99457028566
SWS Distinguished Lecture Series
AI can enhance programming experiences for a diverse set of programmers: from professional developers and data scientists (proficient programmers) who need help in software engineering and data wrangling, all the way to spreadsheet users (low-code programmers) who need help in authoring formulas, and students (novice programmers) who seek hints when stuck with their programming homework. To communicate their need to AI, users can express their intent explicitly—as input-output examples or natural-language specification—or implicitly—where they encounter a bug (and expect AI to suggest a fix), ...
AI can enhance programming experiences for a diverse set of programmers: from professional developers and data scientists (proficient programmers) who need help in software engineering and data wrangling, all the way to spreadsheet users (low-code programmers) who need help in authoring formulas, and students (novice programmers) who seek hints when stuck with their programming homework. To communicate their need to AI, users can express their intent explicitly—as input-output examples or natural-language specification—or implicitly—where they encounter a bug (and expect AI to suggest a fix), or simply allow AI to observe their last few lines of code or edits (to have it suggest the next steps).

The task of synthesizing an intended program snippet from the user’s intent is both a search and a ranking problem. Search is required to discover candidate programs that correspond to the (often ambiguous) intent, and ranking is required to pick the best program from multiple plausible alternatives. This creates a fertile playground for combining symbolic-reasoning techniques, which model the semantics of programming operators, and machine-learning techniques, which can model human preferences in programming. Recent advances in large language models like Codex offer further promise to advance such neuro-symbolic techniques.

Finally, a few critical requirements in AI-assisted programming are usability, precision, and trust; and they create opportunities for innovative user experiences and interactivity paradigms. In this talk, I will explain these concepts using some existing successes, including the Flash Fill feature in Excel, Data Connectors in PowerQuery, and IntelliCode/CoPilot in Visual Studio. I will also describe several new opportunities in AI-assisted programming, which can drive the next set of foundational neuro-symbolic advances.

This talk will be a hybrid event. You can join the meeting in E1 5 room 002 in Saarbrücken, in G26 room 111 in Kaiserlautern or via Zoom. Please contact the Office team for Zoom link information.
Read more

Why not use a hammer when a problem looks almost like a nail?

Martin Bromberger MPI-INF - RG 1
02 Nov 2022, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Automated theorem provers (ATPs) are tools that automatically prove and disprove logical formulas. They are used as back-end reasoning tools in various applications, e.g., software verification, where they prove that your program has (no) bugs; program synthesis, where they automatically construct a program that satisfies a given specification; and as brute-force tactics in various interactive theorem provers, i.e., tools that assist humans in writing and formally checking proofs of theorems. Especially in the case of software verification, ...
Automated theorem provers (ATPs) are tools that automatically prove and disprove logical formulas. They are used as back-end reasoning tools in various applications, e.g., software verification, where they prove that your program has (no) bugs; program synthesis, where they automatically construct a program that satisfies a given specification; and as brute-force tactics in various interactive theorem provers, i.e., tools that assist humans in writing and formally checking proofs of theorems. Especially in the case of software verification, ATPs have to handle formulas over a combination of domain specific logics, also called theories. These theories typically correspond to data types found in programming languages, e.g., the theories of bit vectors, integers, and arrays. Luckily for us, many of these domains are closely related to other domains and problem classes for which we already have so called hammers, i.e., efficient and well-studied algorithms and tools. However, this still means that we have to turn screws into nails, i.e., transform one problem class into a domain for which a hammer exists.

In this talk, I will present three transformations that turn screws into nails. The first two transformations are the unit cube test and the bounding reduction for the theory of linear integer arithmetic. They allow our SMT (satisfiability modulos theories) solver SPASS-SATT to efficiently handle so called unbounded problems, on which other tools typically diverge. As a result, SPASS-SATT won several awards, is deemed as one of the best SMT solvers for linear arithmetic, and several of its techniques were re-implemented in other SMT solvers.

The third transformation is called a Datalog hammer and we use it in our tool SPASS-SPL to verify so called supervisors, i.e., components of technical systems that control their systems behavior. As a result of our Datalog hammer, we were able to verify supervisor code for two examples: a supervisor for lane change assistants in a car and an electronic control unit (ECU) of a supercharged combustion engine. The latter is of special interest because ECUs were hacked by car companies to cheat at emission tests and we hope that we can efficiently find these hacks with SPASS-SPL.
Read more

Control Systems in the presence of Computational Problems

Martina Maggio Fachrichtung Informatik - Saarbrücken
05 Oct 2022, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Feedback control is pervasive in our lives, as it often lays the foundation of technology we daily interface with. Control systems are at the core of the energy distribution infrastructure, govern the behavior of vehicles, and are embedded into household appliances like washing machines. Controllers usually use sensors, that provide information about the physical environment, to calculate the values that actuators should assume to fulfill specific requirements. For example, adaptive cruise control systems use the throttle to automatically adjust the vehicle's speed, ...
Feedback control is pervasive in our lives, as it often lays the foundation of technology we daily interface with. Control systems are at the core of the energy distribution infrastructure, govern the behavior of vehicles, and are embedded into household appliances like washing machines. Controllers usually use sensors, that provide information about the physical environment, to calculate the values that actuators should assume to fulfill specific requirements. For example, adaptive cruise control systems use the throttle to automatically adjust the vehicle's speed, while maintaining a safe distance from vehicles ahead.

The actuator values are calculated using hardware and software. Hence, the computation of the new control signals is subject to accidental faults, systematic issues, and software bugs. However, in many cases, these computational problems can safely be ignored. This talk will introduce the problem of analyzing the behavior of control software subject to computational problems, and verifying when no corrective action is needed, i.e., when the control software is able to fulfill the system requirements despite the presence of computational problems.
Read more

Democracy and the Pursuit of Randomness

Ariel Procaccia Harvard University
22 Sep 2022, 4:00 pm - 5:00 pm
Virtual talk
Joint Lecture Series
Sortition is a storied paradigm of democracy built on the idea of choosing representatives through lotteries instead of elections. In recent years this idea has found renewed popularity in the form of citizens’ assemblies, which bring together randomly selected people from all walks of life to discuss key questions and deliver policy recommendations. A principled approach to sortition, however, must resolve the tension between two competing requirements: that the demographic composition of citizens’ assemblies reflect the general population and that every person be given a fair chance (literally) to participate. ...
Sortition is a storied paradigm of democracy built on the idea of choosing representatives through lotteries instead of elections. In recent years this idea has found renewed popularity in the form of citizens’ assemblies, which bring together randomly selected people from all walks of life to discuss key questions and deliver policy recommendations. A principled approach to sortition, however, must resolve the tension between two competing requirements: that the demographic composition of citizens’ assemblies reflect the general population and that every person be given a fair chance (literally) to participate. I will describe our work on designing, analyzing and implementing randomized participant selection algorithms that balance these two requirements. I will also discuss practical challenges in sortition based on experience with the adoption and deployment of our open-source system, Panelot.
Read more

Automated and Foundational Verification of Low-Level Programs

Michael Sammler Max Planck Institute for Software Systems
06 Sep 2022, 6:00 pm - 7:00 pm
Virtual talk
SWS Student Defense Talks - Thesis Proposal
The correctness of low-level systems software like operating systems or hypervisors is crucial to the reliability and security of modern systems. One approach for ensuring the correctness of such low-level programs is formal verification, which can show the absence of large classes of bugs. However, to apply formal verification to low-level software, we must deal with three challenges: 1) handling realistic programming languages and systems code, which is often idiomatic and nuanced, and not written with the intention of being verified, ...
The correctness of low-level systems software like operating systems or hypervisors is crucial to the reliability and security of modern systems. One approach for ensuring the correctness of such low-level programs is formal verification, which can show the absence of large classes of bugs. However, to apply formal verification to low-level software, we must deal with three challenges: 1) handling realistic programming languages and systems code, which is often idiomatic and nuanced, and not written with the intention of being verified, 2) ensuring the soundness of the verification technique, ideally via foundational proofs (i.e. proofs that are machine-checked by a theorem prover with a small trusted computing base, such as Coq), and 3) automating the verification effort as much as possible.

In my thesis, I present a verification framework for low-level systems code that advances the state of the art along these three dimensions. This framework consists of three components: The first component is RefinedC, a new Coq-based approach to verifying C code, combining foundational proofs with a high degree of automation. The second component, Islaris, complements RefinedC by scaling the verification of assembly code to realistic models of modern instruction set architectures like Armv8-A and RISC-V. The third component, DimSum, combines the two in a new "decentralized" approach to reasoning about multi-language programs. Overall, this results in a framework that can verify multi-language (C and assembly) programs with a high degree of realism, foundational proofs, and automation.
Read more

Efficient Request Isolation in Function-as-a-Service -- Reconciling Confidentiality and Correctness

Mohamed Alzayat Max Planck Institute for Software Systems
31 Aug 2022, 4:00 pm - 5:00 pm
Virtual talk
SWS Student Defense Talks - Thesis Proposal
Function-as-a-Service (FaaS) is an emerging high-level abstraction for event-driven cloud applications. The FaaS abstraction allows tenants to state their application logic in the form of stateless event-triggered functions without the need to provision or manage the underlying infrastructure that runs and scales their application. One of the core responsibilities of a FaaS provider is ensuring the confidentiality of its tenants’ data. To that end, FaaS providers isolate functions from one another, thus preventing a malicious or compromised function from affecting the confidentiality of other functions. ...
Function-as-a-Service (FaaS) is an emerging high-level abstraction for event-driven cloud applications. The FaaS abstraction allows tenants to state their application logic in the form of stateless event-triggered functions without the need to provision or manage the underlying infrastructure that runs and scales their application. One of the core responsibilities of a FaaS provider is ensuring the confidentiality of its tenants’ data. To that end, FaaS providers isolate functions from one another, thus preventing a malicious or compromised function from affecting the confidentiality of other functions. However, due to performance considerations, the same degree of isolation does not apply to sequential requests reaching the same function. This compromise can lead to severe security implications, since bugs in a function implementation — or a third-party library/runtime it depends on — may cause a leak of information from one activation of the function to a subsequent one.

Conceptually, sequential request isolation can be achieved by maintaining two invariants: a confidentiality one, where confidential data (such as end-client inputs, or credentials) should not flow from one function activation to the subsequent one, and a correctness one, where non-confidential but critical state (such as the internal state of a Pseudo Random Number Generator (PRNG)) should be maintained across sequential activations of the function to allow the intended functionality. We define the sequential request isolation problem in terms of the data flows that exist within and across invocations. As such, both invariants can be cast as data flow problems with dynamic policy updates at the request boundary. To this end, we propose two complementary approaches for achieving sequential request isolation by enforcing policies on data flows that cross the request boundary.

First, we propose Groundhog, a system that enforces the confidentiality invariant by implementing a simple fixed policy: any changes to a function’s internal state during the handling of a request are rolled back. By doing this, Groundhog confines all data flows to a single request context, thereby enforcing the confidentiality invariant "by design". While enforcing the confidentiality invariant is sufficient for many FaaS applications where the statelessness of functions and the ephemerality of execution environments are the norm, developers have no access to tools that help them verify that the correctness invariant is also met.

Next, we propose CtxTainter, an extension to standard dynamic data flow analysis (DDFA) techniques that aid developers in performing context-aware analyses. Unlike standard DDFA which can only track data flows, CtxTainter is additionally able to simultaneously reason about request-context boundaries and can thus detect both confidentiality- and correctness-critical data flows that cross a request-context boundary. While this approach can only provide best-effort confidentiality (when used standalone), it can also complement Groundhog by providing best-efforts correctness-invariant violation detection.

Groundhog is a black-box solution that is transparent to both the developer and the provider; it is programming- language agnostic, and does not require any changes to existing code of functions, libraries, language runtimes,  or OS kernels but requires provider collaboration to enable it. Groundhog incurs moderate-to-low overhead relative to an insecure baseline. In contrast, CtxTainter is an easy-to-configure development-phase tool that does not require any support or modifications from the platform provider but requires active developer participation to review and fix detected violations.
Read more

Automatically Detecting and Mitigating Issues in Program Analyzers

Numair Mansur Max Planck Institute for Software Systems
25 Jul 2022, 1:00 pm - 2:00 pm
Virtual talk
SWS Student Defense Talks - Thesis Proposal
Recent years have seen tremendous progress in the development and industrial adoption of rigorous techniques and tools to automatically detect bugs in software and ensure its correctness. Static program analysis is one such technique to automatically reason about the runtime behavior of a program without actually running it. It is a powerful approach that can be used to detect a wide range of security vulnerabilities and enables program optimization in compilers, automatic parallelization, and, if accurate enough, ...
Recent years have seen tremendous progress in the development and industrial adoption of rigorous techniques and tools to automatically detect bugs in software and ensure its correctness. Static program analysis is one such technique to automatically reason about the runtime behavior of a program without actually running it. It is a powerful approach that can be used to detect a wide range of security vulnerabilities and enables program optimization in compilers, automatic parallelization, and, if accurate enough, correctness verification. Static program analyzers are tools that implement program analysis techniques and are used to analyze other programs for flaws. Decades of research have yielded the discovery of novel algorithms, data structures, and design principles that make static program analysis tools and techniques more precise, scalable, and faster than ever before.

While the potential benefits of analyzers are obvious, their usability and effectiveness in mainstream software development workflows still often comes into question. This is because the undecidability of the halting problem makes formal reasoning about program properties difficult. Therefore, one common theme in static analysis is that to remain computable, one has to sacrifice either soundness or completeness. This means that, in practice, a static analysis technique might either miss a bug (false negative) or report correct code as having a bug (false positive), might not handle certain language features, or might only report certain types of bugs.

Practical program analyzers have to make tradeoffs (e.g., in soundness, precision, or performance) to maintain a delicate balance between returning the minimum number of false negatives/positives, scaling to very large industrial codebases, being highly automatic, and having minimum overhead for the developers. Designing, implementing, and deploying program analyzers, therefore, is an extremely challenging task. This makes them extremely complicated pieces of software with a high likelihood of having correctness bugs themselves (challenge 1) or tradeoffs not suitable for every piece of code under every usage scenario (challenge 2).

In this dissertation, we focus our attention on improving the state of the art in the above two challenge areas that can prevent typical software development teams from using these tools to their full potential. In the first part of the dissertation, we present algorithms to detect correctness bugs in fundamental program analysis components such as SMT solvers and Datalog engines. Correctness issues in program analyzers can have disastrous consequences e.g. when analyzing software used for electronic voting, financial systems, transportation, or secure communication. In the second part, we introduce techniques to reduce friction in the integration of program analyzers in everyday software development workflows. Smoothly integrating and tuning an advanced program analyzer for a particular codebase under certain resource constraints and different usage scenarios can be a challenging task for software developers, most of whom lack an advanced understanding of these analyzers.
Read more

Consecutive integers divisible by a power of their largest prime factor

Jean-Marie De Koninck Université Laval
11 Jul 2022, 2:00 pm - 3:30 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Given an integer n ≥ 2, let P (n) stand for its largest prime factor. Given integers k ≥ 2 and ℓ ≥ 2, consider the set Ek,ℓ of those integers n ≥ 2 for which P (n + i)ℓ | n + i for i = 0, 1, . . . , k. Each of these sets is very small. For instance, the smallest element of E3,2 is 1 294 298, the smallest known element of E3,3 has 77 digits and no elements of E4,2 are known, ...
Given an integer n ≥ 2, let P (n) stand for its largest prime factor. Given integers k ≥ 2 and ℓ ≥ 2, consider the set Ek,ℓ of those integers n ≥ 2 for which P (n + i)ℓ | n + i for i = 0, 1, . . . , k. Each of these sets is very small. For instance, the smallest element of E3,2 is 1 294 298, the smallest known element of E3,3 has 77 digits and no elements of E4,2 are known, even though all these sets are believed to be infinite. In this talk, using elementary, analytic and probabilistic approaches, we will shed some light on these sets and raise several open problems. This is joint work with Nicolas Doyon, Florian Luca and Matthieu Moineau.

---

This talk will be a hybrid event. You can join the meeting in E1 5 room 002 in Saarbrücken, in G26 room 111 in Kaiserlautern or via Zoom. Please contact the Office team for Zoom link information.
Read more

Information System Security: Beyond the Sum of its Parts Analyze, Measure, Explain, Improve

Tobias Fiebig MPI-INF - D3
06 Jul 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
Building secure and dependable information systems is one of the biggest challenges of the 21st century. With ransomware attacks paralyzing whole countries and digital infrastructures for health and essential services under threat, information system security is no longer an academic question, but a necessity.

In this talk, we take a look at those who build secure digital environments: System Administrators. These information workers face multi-facetted challenges, and understanding how they (can) build secure system stretches beyond methods of individual disciplines. ...
Building secure and dependable information systems is one of the biggest challenges of the 21st century. With ransomware attacks paralyzing whole countries and digital infrastructures for health and essential services under threat, information system security is no longer an academic question, but a necessity.

In this talk, we take a look at those who build secure digital environments: System Administrators. These information workers face multi-facetted challenges, and understanding how they (can) build secure system stretches beyond methods of individual disciplines. Secure cryptographic algorithms may be of no use if their key material is accidentally published. Software updates cannot prevent a compromise if they have never been installed. An intrusion detection system may leave defenders blind if attacks hide in a storm of false positive notifications.

Following a line along prior research, we see how we have to combine methodology from various fields to scientifically approach and improve information system security. First, we see how we can analytically assess challenges for information system security. However, to quantify their impact we then have to develop and apply network measurement techniques. While security related measurements illustrate the scale of vulnerabilities— misconfigurations for example—they lack in explanatory value as to ¬why vulnerabilities occur. Using qualitative methodology here to better understand and explain the interaction of technology, organizations, and human factors then allows us to take a practical perspective on security challenges. Ultimately, this enables us to develop and advocate for solutions that do not only provide the right technology, but also ensure that this technology can be effectively used to create secure information systems.
Read more

Expanding the Horizons of Finite-Precision Analysis

Debasmita Lohar Max Planck Institute for Software Systems
27 Jun 2022, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Proposal
Finite-precision programs inevitably introduce numerical uncertainties, which are usually a combination of input uncertainties due to noisy sensors of the underlying hardware and finite-precision errors that can potentially occur at every operation. While these errors are individually small, they propagate through an application and can make its final results meaningless. Furthermore, an implementation on actual hardware necessitates a trade-off between accuracy and efficiency. It is thus essential to verify that the numerical uncertainties remain acceptably small and to optimize the implementations such that the results are accurate enough for the applications. ...
Finite-precision programs inevitably introduce numerical uncertainties, which are usually a combination of input uncertainties due to noisy sensors of the underlying hardware and finite-precision errors that can potentially occur at every operation. While these errors are individually small, they propagate through an application and can make its final results meaningless. Furthermore, an implementation on actual hardware necessitates a trade-off between accuracy and efficiency. It is thus essential to verify that the numerical uncertainties remain acceptably small and to optimize the implementations such that the results are accurate enough for the applications.

There exist several static (or dynamic) analyses for finite-precision programs. The static analyses generally perform sound worst-case analysis and primarily focus on straight-line special syntax code, with limited support for conditionals and loops. In contrast, dynamic analyses usually scale well for real-world programs. However, the current dynamic analyses of finite-precision are limited to relatively small programs. Moreover, they do not provide any correctness guarantee that is crucial for critical applications.

In this thesis, we propose analysis and optimization techniques to expand the horizons of the current finite-precision analysis. Our approach captures the probability distributions of inputs and performs probabilistic analysis for small but exciting embedded and neural network examples. Using our analysis, it is possible to (conditionally) verify larger programs (with more than 2K lines of code, including complex programming and data structures). Furthermore, we enable program optimization by soundly generating quantized neural network classifiers with millions of parameters and fewer bits significantly faster than (generic) static fixed-point arithmetic tuners. Finally, we aim to scale the analysis for larger programs with probabilistic inputs.
Read more

Pushing the Barriers in Controller Synthesis for Cyber-Physical Systems

Kaushik Mallik Max Planck Institute for Software Systems
24 Jun 2022, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 113
SWS Student Defense Talks - Thesis Defense
Automated design of correct-by-construction controllers for continuous dynamical systems is a core algorithmic problem in the design of safety critical cyber-physical systems (CPS). A popular design paradigm constructs simpler but sound discrete abstractions of the original continuous systems, on which known reactive synthesis methods can be used to design controllers. The algorithms following this paradigm are collectively known as the Abstraction-Based Controller Design (ABCD) procedures. In this thesis, we build ABCD procedures which are faster and more modular compared to the state-of-the-art, ...
Automated design of correct-by-construction controllers for continuous dynamical systems is a core algorithmic problem in the design of safety critical cyber-physical systems (CPS). A popular design paradigm constructs simpler but sound discrete abstractions of the original continuous systems, on which known reactive synthesis methods can be used to design controllers. The algorithms following this paradigm are collectively known as the Abstraction-Based Controller Design (ABCD) procedures. In this thesis, we build ABCD procedures which are faster and more modular compared to the state-of-the-art, and can handle problems which were beyond the scope of the already existing techniques.

The thesis has three main parts: In the first part, we address the scalability bottleneck of the existing ABCD procedures by using a multi-layered abstraction of varying granularities. We implemented this multi-layered ABCD algorithm in the tool called Mascot, using which we empirically demonstrate the significant saving in computation time achieved through our approach. In the second part of the thesis, we propose a modular design procedure of sound local controllers for a network of discrete abstract systems communicating via discrete/boolean variables and having local specifications. We propose a sound algorithm, where the abstractions negotiate a pair of local assume-guarantee contracts (and controllers as a by-product), in order to synchronize on a set of non-conflicting and correct behaviors. We show the effectiveness of our algorithm using a prototype tool, called Agnes, on a set of discrete benchmark examples. In the third part, we present a novel ABCD procedure for discrete-time nonlinear stochastic control systems and omega-regular control specifications. We first present a novel 2.5-player game abstraction for the control problem at hand. Alongside our abstraction, we present a direct and efficient symbolic algorithm for solving 2.5-player games for omega-regular specifications. We implemented our algorithms in a prototype tool called Mascot-SDS, using which we empirically show that our approach is both faster and more memory-efficient than an alternate approach that was independently developed around the same time.
Read more

Planning and Specification Problems for Multi-Robot Systems, Powered by Formal Methods

Ivan Gavran Max Planck Institute for Software Systems
24 Jun 2022, 10:30 am - 11:30 am
Kaiserslautern building E1 5, room 113
SWS Student Defense Talks - Thesis Defense
This talk will be based on my PhD work on using formal methods to enhance both programming of robotic systems and learning of autonomous agents.

After giving a broad overview of the work, I will focus on the problem of inferring specifications from examples. In particular, we will discuss the problem of inferring a linear temporal logic (LTL) formula from sets of positive and negative example traces, as well as from a set of positive examples only. ...
This talk will be based on my PhD work on using formal methods to enhance both programming of robotic systems and learning of autonomous agents.

After giving a broad overview of the work, I will focus on the problem of inferring specifications from examples. In particular, we will discuss the problem of inferring a linear temporal logic (LTL) formula from sets of positive and negative example traces, as well as from a set of positive examples only.

Building on this method, I will describe LTLTalk, a tool that helps users of a robotic system transfer their intent into an LTL specification. The approach combines the intent signals from a single demonstration and a natural language description given by a user. A set of candidate specifications is inferred by encoding the problem as a satisfiability problem for propositional logic. This set is narrowed down to a single specification through interaction with the user: the user approves or declines generated simulations of the robot's behavior in different situations.
Read more

Learning from the People: Responsibly Encouraging Adoption of Contact Tracing Apps

Elissa Redmiles Max Planck Institute for Software Systems
01 Jun 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
At the beginning of the pandemic contact tracing apps proliferated as a potential solution to scaling infection tracking and response. While significant focus was put on developing privacy protocols for these apps, relatively less attention was given to understanding why, and why not, users might adopt them. Yet, for these technological solutions to benefit public health, users must be willing to adopt these apps. In this talk I showcase the value of taking a descriptive ethics approach to setting best practices in this new domain. ...
At the beginning of the pandemic contact tracing apps proliferated as a potential solution to scaling infection tracking and response. While significant focus was put on developing privacy protocols for these apps, relatively less attention was given to understanding why, and why not, users might adopt them. Yet, for these technological solutions to benefit public health, users must be willing to adopt these apps. In this talk I showcase the value of taking a descriptive ethics approach to setting best practices in this new domain. Descriptive ethics, introduced by the field of moral philosophy, determines best practices by learning directly from the user -- observing people’s preferences and inferring best practice from that behavior -- instead of exclusively relying on experts' normative decisions. This talk presents an empirically-validated framework of user's decision inputs to adopt COVID19 contact tracing apps, including app accuracy, privacy, benefits, and mobile costs. Using predictive models of users' likelihood to install COVID apps based on quantifications of these factors, I show how high the bar is for achieving adoption. I conclude by discussing a large-scale field study in which we put our survey and experimental results into practice to help the U.S. state of Louisiana advertise their COVID app and found differential results regarding the effect of privacy transparency when prosocial vs. individualist appeals were used to encourage app adoption.
Read more

The Skolem Landscape

Joël Ouaknine Max Planck Institute for Software Systems
27 May 2022, 2:00 pm - 3:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
The Skolem Problem asks how to determine algorithmically whether a given linear recurrence sequence (such as the Fibonacci numbers) has a zero. It is a central question in dynamical systems and number theory, and has many connections to other branches of mathematics and computer science. Unfortunately, its decidability has been open for nearly a century! In this talk, I will present a brief survey of what is known on the Skolem Problem and related questions, including recent and ongoing developments. ...
The Skolem Problem asks how to determine algorithmically whether a given linear recurrence sequence (such as the Fibonacci numbers) has a zero. It is a central question in dynamical systems and number theory, and has many connections to other branches of mathematics and computer science. Unfortunately, its decidability has been open for nearly a century! In this talk, I will present a brief survey of what is known on the Skolem Problem and related questions, including recent and ongoing developments.

---

This talk will be a hybrid event. You can join the meeting in E1 5 room 002 in Saarbrücken, in G26 room 111 in Kaiserlautern or via Zoom. Please contact the Office team for Zoom link information.
Read more

Orderrr! A tale of money, intrigue, and specifications

Lorenzo Alvisi Cornell University
24 May 2022, 9:30 am - 10:30 am
Saarbrücken building E1 5, room 002
simultaneous videocast to building , room / Meeting ID: Zoom
SWS Distinguished Lecture Series
Mistrust over traditional financial institutions is motivating the development of decentralized financial infrastructures based on blockchains. In particular, Consortium blockchains (such as the Linux Foundation Hyperledger and Facebook’s diem) are emerging as the approach preferred by businesses. These systems allow only a well-known set of mutually distrustful parties to add blocks to the blockchain; in this way, they aim to retain the benefits of decentralization without embracing the cyberpunk philosophy that informed Nakamoto’s disruptive vision. At the core of consortium blockchains is State Machine Replication, ...
Mistrust over traditional financial institutions is motivating the development of decentralized financial infrastructures based on blockchains. In particular, Consortium blockchains (such as the Linux Foundation Hyperledger and Facebook’s diem) are emerging as the approach preferred by businesses. These systems allow only a well-known set of mutually distrustful parties to add blocks to the blockchain; in this way, they aim to retain the benefits of decentralization without embracing the cyberpunk philosophy that informed Nakamoto’s disruptive vision. At the core of consortium blockchains is State Machine Replication, a classic technique borrowed from fault tolerant distributed computing; to ensure the robustness of their infrastructure, consortium blockchains actually borrow the Byzantine-tolerant version of this technique, which guarantees that the blockchain will operate correctly even if as many as about a third of the contributing parties are bent on cheating.

But, sometimes, "a borrowing is a sorrowing".

I will discuss why Byzantine-tolerant state machine replication is fundamentally incapable of recognizing, never mind preventing, an ever present scourge of financial exchanges: the fraudulent manipulation of the order in which transactions are processed — and how its specification needs to be expanded to give it a fighting chance.

But is it possible to completely eliminate the ability of Byzantine parties to engage in order manipulation? What meaningful ordering guarantees can be enforced? And at what cost?

---

This talk will be a hybrid event. You can either attend in room 002 or via Zoom. Please contact the office team for zoom link information.
Read more

Towards Resilient and Sustainable Software Security

Thorsten Holz CISPA
04 May 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
In parallel with the ongoing digitization, computer security has become an increasingly important and urgent challenge. In particular, the sound and robust implementation of complex software systems is still not well understood in practice, as evidenced by the steady stream of successful attacks observed in the wild. The current state of the art in software security consists of solutions that are often technically sound, but do not provide operational security in practice.

In this talk, ...
In parallel with the ongoing digitization, computer security has become an increasingly important and urgent challenge. In particular, the sound and robust implementation of complex software systems is still not well understood in practice, as evidenced by the steady stream of successful attacks observed in the wild. The current state of the art in software security consists of solutions that are often technically sound, but do not provide operational security in practice.

In this talk, I will give an overview of our recent work towards resilient and sustainable software security, which is also the focus of my upcoming ERC Consolidator project. On the one hand, the system must be resilient against entire classes of attack vectors. On the other hand, the system must be sustainable, i.e., it must be able to maintain its security at least over its design lifetime and possibly even adapt over time. In the talk, I will discuss some of the methods we have developed to achieve this goal, with a specific focus on novel software testing strategies that enable accurate and efficient vulnerability discovery. I will conclude the talk with a brief overview of future research directions.
Read more

Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

Long Tran-Thanh University of Warwick
07 Apr 2022, 11:00 am - 12:00 pm
Virtual talk
SWS Colloquium
In this paper, we study how bandit algorithms in a bounded reward setting can be contaminated and fooled to learn wrong actions. In particular, we consider a strong attacker model in which the attacker aims to fool the learning algorithm to learn a suboptimal bandit policy. To do so, this attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret O(\log T) can be forced to suffer a linear regret with an expected amount of contamination O(\log T). ...
In this paper, we study how bandit algorithms in a bounded reward setting can be contaminated and fooled to learn wrong actions. In particular, we consider a strong attacker model in which the attacker aims to fool the learning algorithm to learn a suboptimal bandit policy. To do so, this attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret O(\log T) can be forced to suffer a linear regret with an expected amount of contamination O(\log T). We show that this amount of contamination is also necessary, as we prove that there exists a no-regret bandit algorithm, specifically the classical UCB, that requires \Omega(\log T) amount of contamination to suffer linear regret. To combat such poisoning attacks, our second main contribution is to propose verification-based mechanisms, which use a verification scheme to access a limited number of uncontaminated rewards. In particular, for the case of unlimited verifications, we show that with O(\log T) expected number of verifications, a simple modified version of the Explore-then-Commit type bandit algorithm can restore the order optimal O(\log T) regret irrespective of the amount of contamination used by the attacker. We also provide a UCB-like verification scheme, called Secure-UCB, that also enjoys full recovery from any attacks, also with O(\log T) expected number of verifications. To derive a matching lower bound on the number of verifications, we also prove that for any order-optimal bandit algorithm, this number of verifications O(\log T) is necessary to recover the order-optimal regret. On the other hand, when the number of verifications is bounded above by a budget B, we propose a novel algorithm, Secure-BARBAR, which provably achieves O(\min\{C,T/sqrt{B} \}) regret with high probability against weak attackers (i.e., attackers who have to place the contamination \emph{before} seeing the actual pulls of the bandit algorithm), where C is the total amount of contamination by the attacker. This new result breaks the known \Omega(C) lower bound of the non-verified setting if C is large.  

This is joint work with Anshuka Rangi and Haifeng Xu.

Please contact crichter@mpi-sws.org for link information
Read more

Robust Deep Neural Networks for Computer Vision

Adam Kortylewski MPI-INF - D6
06 Apr 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
Deep learning sparked a tremendous increase in the performance of computer vision systems over the past decade. However, Deep Neural Networks (DNNs) are still far from reaching human-level performance at visual recognition tasks. The most important limitation of DNNs is that they fail to give reliable predictions in unseen or adverse viewing conditions, which would not fool a human observer, such as when objects are partially occluded, seen in an unusual pose or context, or in bad weather. ...
Deep learning sparked a tremendous increase in the performance of computer vision systems over the past decade. However, Deep Neural Networks (DNNs) are still far from reaching human-level performance at visual recognition tasks. The most important limitation of DNNs is that they fail to give reliable predictions in unseen or adverse viewing conditions, which would not fool a human observer, such as when objects are partially occluded, seen in an unusual pose or context, or in bad weather. This lack of robustness in DNNs is generally acknowledged, but the problem largely remains unsolved. In this talk, I will give an overview of the principles underlying my work on building robust deep neural networks for computer vision. My working hypothesis is that vision systems need a causal 3D understanding images by following an analysis-by-synthesis approach. I will discuss a new type of neural network architecture that implements such an approach, and I will show that these generative neural network models are vastly superior to traditional models in terms of robustness, learning efficiency and because they can solve many vision tasks at once. Finally, I will give a brief outlook on current projects of mine and future research directions.
Read more

Machine learning for algorithm design

Maria Florina Balcan Carnegie Mellon University
29 Mar 2022, 3:00 pm - 4:00 pm
Virtual talk
SWS Distinguished Lecture Series
The classic textbook approach to designing and analyzing algorithms for combinatorialproblems considers worst-case instances of the problem, about which the algorithm designer has no prior information. Since for many problems such worst-case guarantees are quite weak, practitioners often employ a data-driven algorithm design approach; specifically,they use machine learning and instances of the problem from their specific domain to learn a method that works well in that domain. Historically, such data-driven algorithmic techniques have come with no performance guarantees. ...
The classic textbook approach to designing and analyzing algorithms for combinatorialproblems considers worst-case instances of the problem, about which the algorithm designer has no prior information. Since for many problems such worst-case guarantees are quite weak, practitioners often employ a data-driven algorithm design approach; specifically,they use machine learning and instances of the problem from their specific domain to learn a method that works well in that domain. Historically, such data-driven algorithmic techniques have come with no performance guarantees. In this talk, I will describeour recent work on providing performance guarantees for data-driven algorithm design both in the distributional and online learning formalizations.

--

Please contact the office team for zoom link information.
Read more

Confidential and Private Collaborative Machine Learning

Adam Dziedzic The Vector Institute & University of Toronto
04 Mar 2022, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Colloquium
This talk outlines my work building systems that enable applications to securely interact with users' data while preserving individuals' privacy. First, I'll talk bout how we can bring the power of secure computation to difficult settings: TimeCrypt is an encrypted time-series database design that meets the scalability and low-latency requirements associated with time-series workloads. Then, I'll discuss work on using end-to-end privacy as a strong foundation for data protection: Zeph is a new end-to-end privacy system that provides the means to extract value from encrypted streaming data safely while ensuring data confidentiality and privacy by serving only privacy-compliant views of the data. ...
This talk outlines my work building systems that enable applications to securely interact with users' data while preserving individuals' privacy. First, I'll talk bout how we can bring the power of secure computation to difficult settings: TimeCrypt is an encrypted time-series database design that meets the scalability and low-latency requirements associated with time-series workloads. Then, I'll discuss work on using end-to-end privacy as a strong foundation for data protection: Zeph is a new end-to-end privacy system that provides the means to extract value from encrypted streaming data safely while ensuring data confidentiality and privacy by serving only privacy-compliant views of the data. Throughout the talk, I will discuss the prevalent challenges of efficiency, functionality, and accessibility in this research area; my approach to addressing these challenges; and future directions that will help bring end-to-end privacy to an even wider range of applications.

Please contact MPI-SWS Office Team for link information
Read more

Systems Designs for End-to-End Privacy

Anwar Hithnawi ETH Zurich
03 Mar 2022, 10:00 am - 11:00 am
Virtual talk
CIS@MPG Colloquium
The potential of data to transform science and society has spurred unparalleled efforts to collect it in increasingly sensitive and granular forms. This accumulation of sensitive data did not materialize without issues and has raised severe societal concerns. These concerns appear amply justified by numerous reports of data breaches and misuse. Today, we are at an inflection point: if we want to continue enjoying the benefits of data-driven services, we need to place privacy at the center of our data ecosystems. ...
The potential of data to transform science and society has spurred unparalleled efforts to collect it in increasingly sensitive and granular forms. This accumulation of sensitive data did not materialize without issues and has raised severe societal concerns. These concerns appear amply justified by numerous reports of data breaches and misuse. Today, we are at an inflection point: if we want to continue enjoying the benefits of data-driven services, we need to place privacy at the center of our data ecosystems.

This talk outlines my work building systems that enable applications to securely interact with users' data while preserving individuals' privacy. First, I'll talk bout how we can bring the power of secure computation to difficult settings: TimeCrypt is an encrypted time-series database design that meets the scalability and low-latency requirements associated with time-series workloads. Then, I'll discuss work on using end-to-end privacy as a strong foundation for data protection: Zeph is a new end-to-end privacy system that provides the means to extract value from encrypted streaming data safely while ensuring data confidentiality and privacy by serving only privacy-compliant views of the data. Throughout the talk, I will discuss the prevalent challenges of efficiency, functionality, and accessibility in this research area; my approach to addressing these challenges; and future directions that will help bring end-to-end privacy to an even wider range of applications.

---

Please contact the MPI-SWS Office Team for the ZOOM link information.
Read more

Empowering People to Have Secure and Private Interactions with Digital Technologies

Pardis Emami-Naeini University of Washington
02 Mar 2022, 5:00 pm - 6:00 pm
Virtual talk
CIS@MPG Colloquium
Digital technologies are evolving with advanced capabilities. To function, these technologies rely on collecting and processing various types of sensitive data from their users. These data practices could expose users to a wide array of security and privacy risks. My research at the intersection of security, privacy, and human-computer interaction aims to help all people have safer interactions with digital technologies. In this talk, I will share results on people’s security and privacy preferences and attitudes toward technologies such as smart devices and remote communication tools. ...
Digital technologies are evolving with advanced capabilities. To function, these technologies rely on collecting and processing various types of sensitive data from their users. These data practices could expose users to a wide array of security and privacy risks. My research at the intersection of security, privacy, and human-computer interaction aims to help all people have safer interactions with digital technologies. In this talk, I will share results on people’s security and privacy preferences and attitudes toward technologies such as smart devices and remote communication tools. I will then describe a security and privacy transparency tool that I designed and evaluated to address consumers’ needs when purchasing and interacting with smart devices. I will end my talk by discussing emerging and future directions for my research to design equitable security and privacy tools and policies by studying and designing for the needs of diverse populations.

Please contact MPI-SWS Office Team for link information
Read more

Improving People’s Adoption of Security and Privacy Behaviors

Yixin Zou University of Michigan
01 Mar 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Experts recommend a plethora of advice for staying safe online, yet people still use weak passwords, fall for scams, or ignore software updates. Such inconsistent adoption of protective behaviors is understandable given the need to navigate other priorities and constraints in everyday life. Yet when the actions taken are insufficient to mitigate potential risks, it leaves people – especially those already marginalized – vulnerable to dire consequences from financial loss to abuse and harassment.

In this talk, ...
Experts recommend a plethora of advice for staying safe online, yet people still use weak passwords, fall for scams, or ignore software updates. Such inconsistent adoption of protective behaviors is understandable given the need to navigate other priorities and constraints in everyday life. Yet when the actions taken are insufficient to mitigate potential risks, it leaves people – especially those already marginalized – vulnerable to dire consequences from financial loss to abuse and harassment.

In this talk, I share findings from my research on hurdles that prevent people from adopting secure behaviors and solutions that encourage adoption in three domains: designing data breach notifications, informing privacy interface guidelines in regulations, and supporting survivors of tech-enabled abuse. (1) Even small changes in system design can make a big difference. I empirically show consumers’ low awareness of data breaches, rational justifications and biases behind inaction, and how to motivate consumers to change breached passwords through nudges in breach notifications. (2) Public policy is essential in incentivizing companies to implement better data practices, but policymaking needs to be informed by evidence from research. I present a series of user studies that led to a user-tested icon for conveying the "do not sell my personal information" opt-out, now part of the California Consumer Privacy Act (CCPA). (3) Different user groups have different threat models and safety needs, requiring special considerations in developing and deploying interventions. Drawing on findings from focus groups, I discuss how computer security support agents can help survivors of tech-enabled abuse using a trauma-informed approach. Altogether, I highlight the impact of my research on technology design, public policy, and educational efforts. I end the talk by discussing how my interdisciplinary, human-centered approach in solving security and privacy challenges can apply to future work such as improving expert advice and developing trauma-informed computing systems.

Please contact MPI-SWS Office Team for link information
Read more

Specification-Guided Policy-Synthesis

Suguman Bansal University of Pennsylvania
28 Feb 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Policy synthesis or algorithms to design policies for computational systems is one of the fundamental problems in computer science. Standing on the shoulders of simplified yet concise task-specification using high-level logical specification languages, this talk will cover synthesis algorithms using two contrasting approaches. First, the classical logic-based approach of reactive synthesis; Second, the modern learning-based approach of reinforcement learning. This talk will cover our scalable and efficient state-of-the-art algorithms for synthesis from high-level specifications using both these approaches, ...
Policy synthesis or algorithms to design policies for computational systems is one of the fundamental problems in computer science. Standing on the shoulders of simplified yet concise task-specification using high-level logical specification languages, this talk will cover synthesis algorithms using two contrasting approaches. First, the classical logic-based approach of reactive synthesis; Second, the modern learning-based approach of reinforcement learning. This talk will cover our scalable and efficient state-of-the-art algorithms for synthesis from high-level specifications using both these approaches, and investigate whether formal guarantees are possible. We will conclude with a forward-looking view of these contributions to trustworthy AI.

Please contact MPI-SWS Office Team for link information
Read more

Grounding Language by Seeing, Hearing, and Interacting

Rowan Zellers University of Washington
23 Feb 2022, 4:00 pm - 5:00 pm
Virtual talk
CIS@MPG Colloquium
As humans, our understanding of language is grounded in a rich mental model about "how the world works" – that we learn through perception and interaction. We use this understanding to reason beyond what is literally said, imagining how situations might unfold in the world. Machines today struggle at making such connections, which limits how they can be safely used. In my talk, I will discuss three lines of work to bridge this gap between machines and humans. ...
As humans, our understanding of language is grounded in a rich mental model about "how the world works" – that we learn through perception and interaction. We use this understanding to reason beyond what is literally said, imagining how situations might unfold in the world. Machines today struggle at making such connections, which limits how they can be safely used. In my talk, I will discuss three lines of work to bridge this gap between machines and humans. I will first discuss how we might measure grounded understanding. I will introduce a suite of approaches for constructing benchmarks, using machines in the loop to filter out spurious biases. Next, I will introduce PIGLeT: a model that learns physical commonsense understanding by interacting with the world through simulation, using this knowledge to ground language. PIGLeT learns linguistic form and meaning – together – and outperforms text-to-text only models that are orders of magnitude larger. Finally, I will introduce MERLOT, which learns about situations in the world by watching millions of YouTube videos with transcribed speech. The model learns to jointly represent video, audio, and language, together and over time – learning multimodal and neural script knowledge representations. Together, these directions suggest a path forward for building machines that learn language rooted in the world.

Please contact MPI-SWS Office Team for link information
Read more

Measurement and Experimentation in Complex Sociopolitical Processes

Aaron Schein Columbia University
22 Feb 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Complex social and political processes at many scales—from interpersonal networks of friends to international networks of countries—are a central theme of computational social science. Modern methods of data science that can contend with the complexity of data from such processes have the potential to break ground on long-standing questions of critical relevance to public policy. In this talk, I will present two lines of work on 1) estimating the causal effects of friend-to-friend mobilization in US elections, ...
Complex social and political processes at many scales—from interpersonal networks of friends to international networks of countries—are a central theme of computational social science. Modern methods of data science that can contend with the complexity of data from such processes have the potential to break ground on long-standing questions of critical relevance to public policy. In this talk, I will present two lines of work on 1) estimating the causal effects of friend-to-friend mobilization in US elections, and 2) inferring complex latent structure in dyadic event data of country-to-country interactions. In the first part, I will discuss recent work using large-scale digital field experiments on the mobile app Outvote to estimate the causal effects of friend-to-friend texting on voter turnout in the 2018 and 2020 US elections. This work is among the first to rigorously assess the effectiveness of friend-to-friend "get out the vote" tactics, which political campaigns have increasingly embraced in recent elections. I will discuss the statistical challenges inherent to randomizing interactions between friends with a "light touch" design and will describe the methodology we developed to identify and precisely estimate causal effects. In the second part of this talk, I will discuss hierarchical Bayesian modeling of dyadic event data sets in international relations which contain millions of micro-records of the form "country i took action a to country j at time t". The models I will discuss blend elements of tensor decomposition and dynamical systems to capture complex temporal and network dependence structure in the data. Approximate posterior inference relies on new auxiliary variable augmentation schemes and theorems about the properties and relationships between different discrete distributions. At the end of the talk, I will outline the future of both lines of work, as well as their intersection, and sketch a broader vision for how data science can serve computational social science and vice versa.

Please contact MPI-SWS Office Team for link information
Read more

Language theory into practice, a play in three acts

Ningning Xie University of Cambridge
21 Feb 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Computer development has come a long way. Along with the evolution of computers, advances in high-level programming languages allow us to write large-scale software systems easily. While new language features significantly extend the language expressive power, they often lack theoretical development and lead to subtle implementation bugs. Moreover, while high-level languages abstract over low-level aspects and thus eliminate many sources of errors, the abstraction often comes with a runtime penalty that results in inefficient low-level code. ...
Computer development has come a long way. Along with the evolution of computers, advances in high-level programming languages allow us to write large-scale software systems easily. While new language features significantly extend the language expressive power, they often lack theoretical development and lead to subtle implementation bugs. Moreover, while high-level languages abstract over low-level aspects and thus eliminate many sources of errors, the abstraction often comes with a runtime penalty that results in inefficient low-level code. In this talk, I will show how to apply programming language theory to practical programming to offer strong static safety and efficiency guarantees in three domains: language design, runtime systems, and machine learning systems. First, I will demonstrate a type-theoretical formalization of language features, focusing on type inference for dependent types in algebraic datatype declarations. The formalization has guided real-world language implementations. Then, I will show that programming language theory reaps benefits beyond safety. I will present Perceus, a garbage-free reference counting algorithm with reuse, that supports high-level programming while preserving low-level efficiency. Perceus delivers competitive performance compared to state-of-the-art memory reclamation implementations. Finally, as part of a vision to make programming languages broadly applicable, I will discuss my efforts to apply programming language techniques to machine learning systems, by presenting a program synthesis framework that accelerates large-scale distributed machine learning on hardware platforms.

Please contact MPI-SWS Office Team for link information
Read more

Improving Robustness in Machine Learning Models

Yao Qin Google Research
18 Feb 2022, 4:00 pm - 5:00 pm
Virtual talk
CIS@MPG Colloquium
There are many robustness issues arising in a variety of forms while deploying ML systems in the real world. For example, neural networks suffer from distributional shift — a model is tested on a data distribution different from what it was trained on. In addition, neural networks are vulnerable to adversarial examples – small perturbations to the input can successfully fool classifiers into making incorrect predictions. In this talk, I will introduce how to improve robustness of machine learning models by building connections between different perspectives of robustness issues and bridging gaps between a wide range of modalities. ...
There are many robustness issues arising in a variety of forms while deploying ML systems in the real world. For example, neural networks suffer from distributional shift — a model is tested on a data distribution different from what it was trained on. In addition, neural networks are vulnerable to adversarial examples – small perturbations to the input can successfully fool classifiers into making incorrect predictions. In this talk, I will introduce how to improve robustness of machine learning models by building connections between different perspectives of robustness issues and bridging gaps between a wide range of modalities. As a result, seemingly different robustness issues can be tackled by closely-related approaches, and robust ML on multiple modalities and backbone architectures can converge to a common ground.

Please contact MPI-SWS Office Team for link information
Read more

Characterizing and Mitigating Threats to Trust and Safety Online

Yiqing Hua Cornell University
15 Feb 2022, 10:00 am - 11:00 am
Virtual talk
CIS@MPG Colloquium
Supporting a safe and trustworthy online environment is challenging, as these environments are constantly threatened by abusive behaviors that cause real human harm. Among the numerous threats, online harassment suppresses voices, and misleading information and propaganda undermine public trust. Existing methods to combat these threats are often not sufficient, as adversaries may abuse and exploit technologies in nuanced ways, and mitigation strategies don't always reflect users' needs. In this talk, I will present my work on characterizing threats, ...
Supporting a safe and trustworthy online environment is challenging, as these environments are constantly threatened by abusive behaviors that cause real human harm. Among the numerous threats, online harassment suppresses voices, and misleading information and propaganda undermine public trust. Existing methods to combat these threats are often not sufficient, as adversaries may abuse and exploit technologies in nuanced ways, and mitigation strategies don't always reflect users' needs. In this talk, I will present my work on characterizing threats, and empowering users with new techniques to combat these threats. First, I will discuss the challenges faced in detecting subtle harassment on social media, and my approach to context-specific analysis using the United States 2018 general election as a case study. Second, I will demonstrate the importance of characterizing user participation in adversarial activities, to inform better moderation mechanism design. Lastly, I will introduce my work on developing privacy-preserving abuse mitigation techniques, to allow user-side warnings of misinformation images in end-to-end encrypted environments.

Please contact MPI-SWS Office Team for link information
Read more

How do neurons learn?

Hadi Daneshmand Princeton University
11 Feb 2022, 12:00 pm - 1:00 pm
Virtual talk
CIS@MPG Colloquium
Representation learning with neural networks automates feature extraction with less need for continental feature engineering; thereby achieving incredible performance in image, text, and strategy processing. However, the underlying mechanism of representation learning is not well understood. This limits applications of representation learning in critical tasks, such as cancer diagnoses and other medical decisions. In this talk, we propose a research plan for studying representation learning with three core research focuses: Random neural networks. By studying random neural networks, ...
Representation learning with neural networks automates feature extraction with less need for continental feature engineering; thereby achieving incredible performance in image, text, and strategy processing. However, the underlying mechanism of representation learning is not well understood. This limits applications of representation learning in critical tasks, such as cancer diagnoses and other medical decisions. In this talk, we propose a research plan for studying representation learning with three core research focuses: Random neural networks. By studying random neural networks, we shed light on the inner workings of the incredible performance of modern neural networks. We demonstrate how the study of random networks allows us to go beyond the conventional trial and error development of neural networks. Local optimality. Given a neural network, is it possible to improve its performance only by slight modifications of the network parameters? This is the focus of local optimization for representation learning. Our research highlights that local optimization requires more studies in modern representation learning with generative adversarial networks. Modeling. A mathematical study of learning dynamics is very challenging. Modeling facilitates the study of learning dynamics by omitting technical details of learning. For example, a continuous-time dynamical system may model an iterative learning method, bridging the gap between dynamical systems and representation learning.

Please contact MPI-SWS Office Team for link information
Read more

Toward Reliable Machine Learning with Kernels

Krikamol Muandet MPI-IS
11 Feb 2022, 9:30 am - 10:30 am
Virtual talk
CIS@MPG Colloquium
Society is made up of a set of diverse individuals, demographic groups, and institutions. Therefore, learning and deploying algorithmic models across heterogeneous environments face a set of various trade-offs. In order to develop reliable machine learning algorithms that can interact successfully with the real world, it is necessary to deal with changes in underlying data-generating distributions. This talk will be about the kernel mean embedding (KME), a nonparametric kernel-based framework to represent probability distributions and model changes thereof. ...
Society is made up of a set of diverse individuals, demographic groups, and institutions. Therefore, learning and deploying algorithmic models across heterogeneous environments face a set of various trade-offs. In order to develop reliable machine learning algorithms that can interact successfully with the real world, it is necessary to deal with changes in underlying data-generating distributions. This talk will be about the kernel mean embedding (KME), a nonparametric kernel-based framework to represent probability distributions and model changes thereof. In particular, I will focus on how this framework can help improve the credibility of algorithmic decision-making by enabling us to reason about higher-order causal effects of policy interventions as well as by removing the effect of unobserved confounders through the use of an instrumental variable (IV). Lastly, I will argue that a better understanding of the ways in which our data are generated and how our models can influence them will be crucial for reliable machine learning systems, especially when gaining full information about data may not be possible.

Please contact MPI-SWS Office Team for link information
Read more

Strengthening and Enriching Machine Learning for Cybersecurity

Wenbo Guo Penn State University
10 Feb 2022, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Colloquium
Nowadays, security researchers are increasingly using AI to automate and facilitate security analysis. Although making some meaningful progress, AI has not maximized its capability in security yet, mainly due to two challenges. First, existing ML techniques have not reached security professionals' requirements in critical properties, such as interpretability and adversary-resistancy. Second, Security data imposes many new technical challenges, and these challenges break the assumptions of existing ML models and thus jeopardize their efficacy. In this talk, ...
Nowadays, security researchers are increasingly using AI to automate and facilitate security analysis. Although making some meaningful progress, AI has not maximized its capability in security yet, mainly due to two challenges. First, existing ML techniques have not reached security professionals' requirements in critical properties, such as interpretability and adversary-resistancy. Second, Security data imposes many new technical challenges, and these challenges break the assumptions of existing ML models and thus jeopardize their efficacy. In this talk, I will describe my research efforts to address the above challenges, with a primary focus on strengthening the interpretability of ML-based security systems and enriching ML to detect concept drifts in security data. Regarding interpretability, I will describe our explanation methods for deep learning-based and deep reinforcement learning-based security systems and demonstrate how security analysts could benefit from these methods to establish trust in blackbox models and patching model vulnerabilities. As for concept drifts, I will introduce a novel ML system to detect and explain drifting samples and demonstrate its application in a real-world malware database. Finally, I will conclude by highlighting my future plan towards maximizing the capability of advanced ML in cybersecurity.

Please contact MPI-SWS Office Team for link information
Read more

Robust and Accountable Multi-Agent Sequential Decision Making

Goran Radanovic Max Planck Institute for Software Systems
02 Feb 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
The recent success stories of reinforcement learning in game playing have demonstrated the utility of the reinforcement learning framework in deriving scalable solutions to multi-agent sequential decision making problems. However, applying these solutions beyond simulated environments requires additional building blocks that would enable trustworthy decision making. In this talk, I will present some of our recent results that relate to robustness and accountability---two properties that any decision making system ought to satisfy in order to be deemed trustworthy. ...
The recent success stories of reinforcement learning in game playing have demonstrated the utility of the reinforcement learning framework in deriving scalable solutions to multi-agent sequential decision making problems. However, applying these solutions beyond simulated environments requires additional building blocks that would enable trustworthy decision making. In this talk, I will present some of our recent results that relate to robustness and accountability---two properties that any decision making system ought to satisfy in order to be deemed trustworthy. These results showcase some of the challenges in designing agents and support systems for robust and accountable multi-agent sequential decision making.
Read more

Software for Fast Storage Hardware

Willy Zwaenepoel University of Sydney
20 Jan 2022, 10:00 am - 11:00 am
Virtual talk
SWS Distinguished Lecture Series
Storage technologies are entering the market with performance vastly superior to conventional storage devices. This technology shift requires a complete rethinking of the software storage stack.

In this talk I will give two examples of our work with Optane-based solid-state (block) devices that illustrate the need for and the benefit of a wholesale redesign.

First, I will describe the Kvell key-value (KV) store. The key observation underlying Kvell is that conventional KV software on fast devices is bottlenecked by the CPU rather than by the device. ...
Storage technologies are entering the market with performance vastly superior to conventional storage devices. This technology shift requires a complete rethinking of the software storage stack.

In this talk I will give two examples of our work with Optane-based solid-state (block) devices that illustrate the need for and the benefit of a wholesale redesign.

First, I will describe the Kvell key-value (KV) store. The key observation underlying Kvell is that conventional KV software on fast devices is bottlenecked by the CPU rather than by the device. Kvell therefore focuses on minimizing CPU intervention.

Second, I will describe the Kvell+ OLTP/OLAP system built on top of Kvell. The key underlying observation here is that these storage devices have become so fast that the conventional implementation of snapshot isolation – maintaining multiple versions – quickly leads to the device filling up. Kvell therefore focuses processes new versions as they are created.

This talk describes joint work with Oana Balmau (McGill University), Karan Gupta (Nutanix) and Baptiste Lepers (University of Sydney).

---

Please contact the MPI-SWS Office Team for the ZOOM link information. .
Read more

Human Factors in Machine-Assisted Decision-Making

Nina Grgić-Hlača Max Planck Institute for Software Systems
18 Jan 2022, 3:00 pm - 4:00 pm
Saarbrücken building E1 4, room 029
SWS Student Defense Talks - Thesis Proposal
Machine learning (ML) based algorithms assist human decision-making in a variety of scenarios, ranging from medical diagnostics to bail decision-making. The potential societal impact of using machine decision aids in real-world settings sparked concerns about their accuracy and fairness, and inspired a flurry of research on algorithmic fairness, accountability and transparency. However, in many settings, algorithms do not make decisions, but only assist human decision-makers. In this thesis, we go beyond studying the fairness and accuracy of decision aids, ...
Machine learning (ML) based algorithms assist human decision-making in a variety of scenarios, ranging from medical diagnostics to bail decision-making. The potential societal impact of using machine decision aids in real-world settings sparked concerns about their accuracy and fairness, and inspired a flurry of research on algorithmic fairness, accountability and transparency. However, in many settings, algorithms do not make decisions, but only assist human decision-makers. In this thesis, we go beyond studying the fairness and accuracy of decision aids, and study machine-assisted human decision-making as a whole. Specifically, we study how people perceive and utilize machine decision aids.

Please contact the MPI-SWS Office Team for the ZOOM link information.
Read more