Events 2022

Automatically Detecting and Mitigating Issues in Program Analyzers

Numair Mansur Max Planck Institute for Software Systems
25 Jul 2022, 1:00 pm - 2:00 pm
Virtual talk
SWS Student Defense Talks - Thesis Proposal
Recent years have seen tremendous progress in the development and industrial adoption of rigorous techniques and tools to automatically detect bugs in software and ensure its correctness. Static program analysis is one such technique to automatically reason about the runtime behavior of a program without actually running it. It is a powerful approach that can be used to detect a wide range of security vulnerabilities and enables program optimization in compilers, automatic parallelization, and, if accurate enough, ...
Recent years have seen tremendous progress in the development and industrial adoption of rigorous techniques and tools to automatically detect bugs in software and ensure its correctness. Static program analysis is one such technique to automatically reason about the runtime behavior of a program without actually running it. It is a powerful approach that can be used to detect a wide range of security vulnerabilities and enables program optimization in compilers, automatic parallelization, and, if accurate enough, correctness verification. Static program analyzers are tools that implement program analysis techniques and are used to analyze other programs for flaws. Decades of research have yielded the discovery of novel algorithms, data structures, and design principles that make static program analysis tools and techniques more precise, scalable, and faster than ever before.

While the potential benefits of analyzers are obvious, their usability and effectiveness in mainstream software development workflows still often comes into question. This is because the undecidability of the halting problem makes formal reasoning about program properties difficult. Therefore, one common theme in static analysis is that to remain computable, one has to sacrifice either soundness or completeness. This means that, in practice, a static analysis technique might either miss a bug (false negative) or report correct code as having a bug (false positive), might not handle certain language features, or might only report certain types of bugs.

Practical program analyzers have to make tradeoffs (e.g., in soundness, precision, or performance) to maintain a delicate balance between returning the minimum number of false negatives/positives, scaling to very large industrial codebases, being highly automatic, and having minimum overhead for the developers. Designing, implementing, and deploying program analyzers, therefore, is an extremely challenging task. This makes them extremely complicated pieces of software with a high likelihood of having correctness bugs themselves (challenge 1) or tradeoffs not suitable for every piece of code under every usage scenario (challenge 2).

In this dissertation, we focus our attention on improving the state of the art in the above two challenge areas that can prevent typical software development teams from using these tools to their full potential. In the first part of the dissertation, we present algorithms to detect correctness bugs in fundamental program analysis components such as SMT solvers and Datalog engines. Correctness issues in program analyzers can have disastrous consequences e.g. when analyzing software used for electronic voting, financial systems, transportation, or secure communication. In the second part, we introduce techniques to reduce friction in the integration of program analyzers in everyday software development workflows. Smoothly integrating and tuning an advanced program analyzer for a particular codebase under certain resource constraints and different usage scenarios can be a challenging task for software developers, most of whom lack an advanced understanding of these analyzers.
Read more

Consecutive integers divisible by a power of their largest prime factor

Jean-Marie De Koninck Université Laval
11 Jul 2022, 2:00 pm - 3:30 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Given an integer n ≥ 2, let P (n) stand for its largest prime factor. Given integers k ≥ 2 and ℓ ≥ 2, consider the set Ek,ℓ of those integers n ≥ 2 for which P (n + i)ℓ | n + i for i = 0, 1, . . . , k. Each of these sets is very small. For instance, the smallest element of E3,2 is 1 294 298, the smallest known element of E3,3 has 77 digits and no elements of E4,2 are known, ...
Given an integer n ≥ 2, let P (n) stand for its largest prime factor. Given integers k ≥ 2 and ℓ ≥ 2, consider the set Ek,ℓ of those integers n ≥ 2 for which P (n + i)ℓ | n + i for i = 0, 1, . . . , k. Each of these sets is very small. For instance, the smallest element of E3,2 is 1 294 298, the smallest known element of E3,3 has 77 digits and no elements of E4,2 are known, even though all these sets are believed to be infinite. In this talk, using elementary, analytic and probabilistic approaches, we will shed some light on these sets and raise several open problems. This is joint work with Nicolas Doyon, Florian Luca and Matthieu Moineau.

---

This talk will be a hybrid event. You can join the meeting in E1 5 room 002 in Saarbrücken, in G26 room 111 in Kaiserlautern or via Zoom. Please contact the Office team for Zoom link information.
Read more

Information System Security: Beyond the Sum of its Parts Analyze, Measure, Explain, Improve

Tobias Fiebig MPI-INF - D3
06 Jul 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
Building secure and dependable information systems is one of the biggest challenges of the 21st century. With ransomware attacks paralyzing whole countries and digital infrastructures for health and essential services under threat, information system security is no longer an academic question, but a necessity.

In this talk, we take a look at those who build secure digital environments: System Administrators. These information workers face multi-facetted challenges, and understanding how they (can) build secure system stretches beyond methods of individual disciplines. ...
Building secure and dependable information systems is one of the biggest challenges of the 21st century. With ransomware attacks paralyzing whole countries and digital infrastructures for health and essential services under threat, information system security is no longer an academic question, but a necessity.

In this talk, we take a look at those who build secure digital environments: System Administrators. These information workers face multi-facetted challenges, and understanding how they (can) build secure system stretches beyond methods of individual disciplines. Secure cryptographic algorithms may be of no use if their key material is accidentally published. Software updates cannot prevent a compromise if they have never been installed. An intrusion detection system may leave defenders blind if attacks hide in a storm of false positive notifications.

Following a line along prior research, we see how we have to combine methodology from various fields to scientifically approach and improve information system security. First, we see how we can analytically assess challenges for information system security. However, to quantify their impact we then have to develop and apply network measurement techniques. While security related measurements illustrate the scale of vulnerabilities— misconfigurations for example—they lack in explanatory value as to ¬why vulnerabilities occur. Using qualitative methodology here to better understand and explain the interaction of technology, organizations, and human factors then allows us to take a practical perspective on security challenges. Ultimately, this enables us to develop and advocate for solutions that do not only provide the right technology, but also ensure that this technology can be effectively used to create secure information systems.
Read more

Expanding the Horizons of Finite-Precision Analysis

Debasmita Lohar Max Planck Institute for Software Systems
27 Jun 2022, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Proposal
Finite-precision programs inevitably introduce numerical uncertainties, which are usually a combination of input uncertainties due to noisy sensors of the underlying hardware and finite-precision errors that can potentially occur at every operation. While these errors are individually small, they propagate through an application and can make its final results meaningless. Furthermore, an implementation on actual hardware necessitates a trade-off between accuracy and efficiency. It is thus essential to verify that the numerical uncertainties remain acceptably small and to optimize the implementations such that the results are accurate enough for the applications. ...
Finite-precision programs inevitably introduce numerical uncertainties, which are usually a combination of input uncertainties due to noisy sensors of the underlying hardware and finite-precision errors that can potentially occur at every operation. While these errors are individually small, they propagate through an application and can make its final results meaningless. Furthermore, an implementation on actual hardware necessitates a trade-off between accuracy and efficiency. It is thus essential to verify that the numerical uncertainties remain acceptably small and to optimize the implementations such that the results are accurate enough for the applications.

There exist several static (or dynamic) analyses for finite-precision programs. The static analyses generally perform sound worst-case analysis and primarily focus on straight-line special syntax code, with limited support for conditionals and loops. In contrast, dynamic analyses usually scale well for real-world programs. However, the current dynamic analyses of finite-precision are limited to relatively small programs. Moreover, they do not provide any correctness guarantee that is crucial for critical applications.

In this thesis, we propose analysis and optimization techniques to expand the horizons of the current finite-precision analysis. Our approach captures the probability distributions of inputs and performs probabilistic analysis for small but exciting embedded and neural network examples. Using our analysis, it is possible to (conditionally) verify larger programs (with more than 2K lines of code, including complex programming and data structures). Furthermore, we enable program optimization by soundly generating quantized neural network classifiers with millions of parameters and fewer bits significantly faster than (generic) static fixed-point arithmetic tuners. Finally, we aim to scale the analysis for larger programs with probabilistic inputs.
Read more

Pushing the Barriers in Controller Synthesis for Cyber-Physical Systems

Kaushik Mallik Max Planck Institute for Software Systems
24 Jun 2022, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 113
SWS Student Defense Talks - Thesis Defense
Automated design of correct-by-construction controllers for continuous dynamical systems is a core algorithmic problem in the design of safety critical cyber-physical systems (CPS). A popular design paradigm constructs simpler but sound discrete abstractions of the original continuous systems, on which known reactive synthesis methods can be used to design controllers. The algorithms following this paradigm are collectively known as the Abstraction-Based Controller Design (ABCD) procedures. In this thesis, we build ABCD procedures which are faster and more modular compared to the state-of-the-art, ...
Automated design of correct-by-construction controllers for continuous dynamical systems is a core algorithmic problem in the design of safety critical cyber-physical systems (CPS). A popular design paradigm constructs simpler but sound discrete abstractions of the original continuous systems, on which known reactive synthesis methods can be used to design controllers. The algorithms following this paradigm are collectively known as the Abstraction-Based Controller Design (ABCD) procedures. In this thesis, we build ABCD procedures which are faster and more modular compared to the state-of-the-art, and can handle problems which were beyond the scope of the already existing techniques.

The thesis has three main parts: In the first part, we address the scalability bottleneck of the existing ABCD procedures by using a multi-layered abstraction of varying granularities. We implemented this multi-layered ABCD algorithm in the tool called Mascot, using which we empirically demonstrate the significant saving in computation time achieved through our approach. In the second part of the thesis, we propose a modular design procedure of sound local controllers for a network of discrete abstract systems communicating via discrete/boolean variables and having local specifications. We propose a sound algorithm, where the abstractions negotiate a pair of local assume-guarantee contracts (and controllers as a by-product), in order to synchronize on a set of non-conflicting and correct behaviors. We show the effectiveness of our algorithm using a prototype tool, called Agnes, on a set of discrete benchmark examples. In the third part, we present a novel ABCD procedure for discrete-time nonlinear stochastic control systems and omega-regular control specifications. We first present a novel 2.5-player game abstraction for the control problem at hand. Alongside our abstraction, we present a direct and efficient symbolic algorithm for solving 2.5-player games for omega-regular specifications. We implemented our algorithms in a prototype tool called Mascot-SDS, using which we empirically show that our approach is both faster and more memory-efficient than an alternate approach that was independently developed around the same time.
Read more

Planning and Specification Problems for Multi-Robot Systems, Powered by Formal Methods

Ivan Gavran Max Planck Institute for Software Systems
24 Jun 2022, 10:30 am - 11:30 am
Kaiserslautern building E1 5, room 113
SWS Student Defense Talks - Thesis Defense
This talk will be based on my PhD work on using formal methods to enhance both programming of robotic systems and learning of autonomous agents.

After giving a broad overview of the work, I will focus on the problem of inferring specifications from examples. In particular, we will discuss the problem of inferring a linear temporal logic (LTL) formula from sets of positive and negative example traces, as well as from a set of positive examples only. ...
This talk will be based on my PhD work on using formal methods to enhance both programming of robotic systems and learning of autonomous agents.

After giving a broad overview of the work, I will focus on the problem of inferring specifications from examples. In particular, we will discuss the problem of inferring a linear temporal logic (LTL) formula from sets of positive and negative example traces, as well as from a set of positive examples only.

Building on this method, I will describe LTLTalk, a tool that helps users of a robotic system transfer their intent into an LTL specification. The approach combines the intent signals from a single demonstration and a natural language description given by a user. A set of candidate specifications is inferred by encoding the problem as a satisfiability problem for propositional logic. This set is narrowed down to a single specification through interaction with the user: the user approves or declines generated simulations of the robot's behavior in different situations.
Read more

Learning from the People: Responsibly Encouraging Adoption of Contact Tracing Apps

Elissa Redmiles Max Planck Institute for Software Systems
01 Jun 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
At the beginning of the pandemic contact tracing apps proliferated as a potential solution to scaling infection tracking and response. While significant focus was put on developing privacy protocols for these apps, relatively less attention was given to understanding why, and why not, users might adopt them. Yet, for these technological solutions to benefit public health, users must be willing to adopt these apps. In this talk I showcase the value of taking a descriptive ethics approach to setting best practices in this new domain. ...
At the beginning of the pandemic contact tracing apps proliferated as a potential solution to scaling infection tracking and response. While significant focus was put on developing privacy protocols for these apps, relatively less attention was given to understanding why, and why not, users might adopt them. Yet, for these technological solutions to benefit public health, users must be willing to adopt these apps. In this talk I showcase the value of taking a descriptive ethics approach to setting best practices in this new domain. Descriptive ethics, introduced by the field of moral philosophy, determines best practices by learning directly from the user -- observing people’s preferences and inferring best practice from that behavior -- instead of exclusively relying on experts' normative decisions. This talk presents an empirically-validated framework of user's decision inputs to adopt COVID19 contact tracing apps, including app accuracy, privacy, benefits, and mobile costs. Using predictive models of users' likelihood to install COVID apps based on quantifications of these factors, I show how high the bar is for achieving adoption. I conclude by discussing a large-scale field study in which we put our survey and experimental results into practice to help the U.S. state of Louisiana advertise their COVID app and found differential results regarding the effect of privacy transparency when prosocial vs. individualist appeals were used to encourage app adoption.
Read more

The Skolem Landscape

Joël Ouaknine Max Planck Institute for Software Systems
27 May 2022, 2:00 pm - 3:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
The Skolem Problem asks how to determine algorithmically whether a given linear recurrence sequence (such as the Fibonacci numbers) has a zero. It is a central question in dynamical systems and number theory, and has many connections to other branches of mathematics and computer science. Unfortunately, its decidability has been open for nearly a century! In this talk, I will present a brief survey of what is known on the Skolem Problem and related questions, including recent and ongoing developments. ...
The Skolem Problem asks how to determine algorithmically whether a given linear recurrence sequence (such as the Fibonacci numbers) has a zero. It is a central question in dynamical systems and number theory, and has many connections to other branches of mathematics and computer science. Unfortunately, its decidability has been open for nearly a century! In this talk, I will present a brief survey of what is known on the Skolem Problem and related questions, including recent and ongoing developments.

---

This talk will be a hybrid event. You can join the meeting in E1 5 room 002 in Saarbrücken, in G26 room 111 in Kaiserlautern or via Zoom. Please contact the Office team for Zoom link information.
Read more

Orderrr! A tale of money, intrigue, and specifications

Lorenzo Alvisi Cornell University
24 May 2022, 9:30 am - 10:30 am
Saarbrücken building E1 5, room 002
simultaneous videocast to building , room / Meeting ID: Zoom
SWS Distinguished Lecture Series
Mistrust over traditional financial institutions is motivating the development of decentralized financial infrastructures based on blockchains. In particular, Consortium blockchains (such as the Linux Foundation Hyperledger and Facebook’s diem) are emerging as the approach preferred by businesses. These systems allow only a well-known set of mutually distrustful parties to add blocks to the blockchain; in this way, they aim to retain the benefits of decentralization without embracing the cyberpunk philosophy that informed Nakamoto’s disruptive vision. At the core of consortium blockchains is State Machine Replication, ...
Mistrust over traditional financial institutions is motivating the development of decentralized financial infrastructures based on blockchains. In particular, Consortium blockchains (such as the Linux Foundation Hyperledger and Facebook’s diem) are emerging as the approach preferred by businesses. These systems allow only a well-known set of mutually distrustful parties to add blocks to the blockchain; in this way, they aim to retain the benefits of decentralization without embracing the cyberpunk philosophy that informed Nakamoto’s disruptive vision. At the core of consortium blockchains is State Machine Replication, a classic technique borrowed from fault tolerant distributed computing; to ensure the robustness of their infrastructure, consortium blockchains actually borrow the Byzantine-tolerant version of this technique, which guarantees that the blockchain will operate correctly even if as many as about a third of the contributing parties are bent on cheating.

But, sometimes, "a borrowing is a sorrowing".

I will discuss why Byzantine-tolerant state machine replication is fundamentally incapable of recognizing, never mind preventing, an ever present scourge of financial exchanges: the fraudulent manipulation of the order in which transactions are processed — and how its specification needs to be expanded to give it a fighting chance.

But is it possible to completely eliminate the ability of Byzantine parties to engage in order manipulation? What meaningful ordering guarantees can be enforced? And at what cost?

---

This talk will be a hybrid event. You can either attend in room 002 or via Zoom. Please contact the office team for zoom link information.
Read more

Towards Resilient and Sustainable Software Security

Thorsten Holz CISPA
04 May 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
In parallel with the ongoing digitization, computer security has become an increasingly important and urgent challenge. In particular, the sound and robust implementation of complex software systems is still not well understood in practice, as evidenced by the steady stream of successful attacks observed in the wild. The current state of the art in software security consists of solutions that are often technically sound, but do not provide operational security in practice.

In this talk, ...
In parallel with the ongoing digitization, computer security has become an increasingly important and urgent challenge. In particular, the sound and robust implementation of complex software systems is still not well understood in practice, as evidenced by the steady stream of successful attacks observed in the wild. The current state of the art in software security consists of solutions that are often technically sound, but do not provide operational security in practice.

In this talk, I will give an overview of our recent work towards resilient and sustainable software security, which is also the focus of my upcoming ERC Consolidator project. On the one hand, the system must be resilient against entire classes of attack vectors. On the other hand, the system must be sustainable, i.e., it must be able to maintain its security at least over its design lifetime and possibly even adapt over time. In the talk, I will discuss some of the methods we have developed to achieve this goal, with a specific focus on novel software testing strategies that enable accurate and efficient vulnerability discovery. I will conclude the talk with a brief overview of future research directions.
Read more

Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

Long Tran-Thanh University of Warwick
07 Apr 2022, 11:00 am - 12:00 pm
Virtual talk
SWS Colloquium
In this paper, we study how bandit algorithms in a bounded reward setting can be contaminated and fooled to learn wrong actions. In particular, we consider a strong attacker model in which the attacker aims to fool the learning algorithm to learn a suboptimal bandit policy. To do so, this attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret O(\log T) can be forced to suffer a linear regret with an expected amount of contamination O(\log T). ...
In this paper, we study how bandit algorithms in a bounded reward setting can be contaminated and fooled to learn wrong actions. In particular, we consider a strong attacker model in which the attacker aims to fool the learning algorithm to learn a suboptimal bandit policy. To do so, this attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret O(\log T) can be forced to suffer a linear regret with an expected amount of contamination O(\log T). We show that this amount of contamination is also necessary, as we prove that there exists a no-regret bandit algorithm, specifically the classical UCB, that requires \Omega(\log T) amount of contamination to suffer linear regret. To combat such poisoning attacks, our second main contribution is to propose verification-based mechanisms, which use a verification scheme to access a limited number of uncontaminated rewards. In particular, for the case of unlimited verifications, we show that with O(\log T) expected number of verifications, a simple modified version of the Explore-then-Commit type bandit algorithm can restore the order optimal O(\log T) regret irrespective of the amount of contamination used by the attacker. We also provide a UCB-like verification scheme, called Secure-UCB, that also enjoys full recovery from any attacks, also with O(\log T) expected number of verifications. To derive a matching lower bound on the number of verifications, we also prove that for any order-optimal bandit algorithm, this number of verifications O(\log T) is necessary to recover the order-optimal regret. On the other hand, when the number of verifications is bounded above by a budget B, we propose a novel algorithm, Secure-BARBAR, which provably achieves O(\min\{C,T/sqrt{B} \}) regret with high probability against weak attackers (i.e., attackers who have to place the contamination \emph{before} seeing the actual pulls of the bandit algorithm), where C is the total amount of contamination by the attacker. This new result breaks the known \Omega(C) lower bound of the non-verified setting if C is large.  

This is joint work with Anshuka Rangi and Haifeng Xu.

Please contact crichter@mpi-sws.org for link information
Read more

Robust Deep Neural Networks for Computer Vision

Adam Kortylewski MPI-INF - D6
06 Apr 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
Deep learning sparked a tremendous increase in the performance of computer vision systems over the past decade. However, Deep Neural Networks (DNNs) are still far from reaching human-level performance at visual recognition tasks. The most important limitation of DNNs is that they fail to give reliable predictions in unseen or adverse viewing conditions, which would not fool a human observer, such as when objects are partially occluded, seen in an unusual pose or context, or in bad weather. ...
Deep learning sparked a tremendous increase in the performance of computer vision systems over the past decade. However, Deep Neural Networks (DNNs) are still far from reaching human-level performance at visual recognition tasks. The most important limitation of DNNs is that they fail to give reliable predictions in unseen or adverse viewing conditions, which would not fool a human observer, such as when objects are partially occluded, seen in an unusual pose or context, or in bad weather. This lack of robustness in DNNs is generally acknowledged, but the problem largely remains unsolved. In this talk, I will give an overview of the principles underlying my work on building robust deep neural networks for computer vision. My working hypothesis is that vision systems need a causal 3D understanding images by following an analysis-by-synthesis approach. I will discuss a new type of neural network architecture that implements such an approach, and I will show that these generative neural network models are vastly superior to traditional models in terms of robustness, learning efficiency and because they can solve many vision tasks at once. Finally, I will give a brief outlook on current projects of mine and future research directions.
Read more

Machine learning for algorithm design

Maria Florina Balcan Carnegie Mellon University
29 Mar 2022, 3:00 pm - 4:00 pm
Virtual talk
SWS Distinguished Lecture Series
The classic textbook approach to designing and analyzing algorithms for combinatorialproblems considers worst-case instances of the problem, about which the algorithm designer has no prior information. Since for many problems such worst-case guarantees are quite weak, practitioners often employ a data-driven algorithm design approach; specifically,they use machine learning and instances of the problem from their specific domain to learn a method that works well in that domain. Historically, such data-driven algorithmic techniques have come with no performance guarantees. ...
The classic textbook approach to designing and analyzing algorithms for combinatorialproblems considers worst-case instances of the problem, about which the algorithm designer has no prior information. Since for many problems such worst-case guarantees are quite weak, practitioners often employ a data-driven algorithm design approach; specifically,they use machine learning and instances of the problem from their specific domain to learn a method that works well in that domain. Historically, such data-driven algorithmic techniques have come with no performance guarantees. In this talk, I will describeour recent work on providing performance guarantees for data-driven algorithm design both in the distributional and online learning formalizations.

--

Please contact the office team for zoom link information.
Read more

Confidential and Private Collaborative Machine Learning

Adam Dziedzic The Vector Institute & University of Toronto
04 Mar 2022, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Colloquium
This talk outlines my work building systems that enable applications to securely interact with users' data while preserving individuals' privacy. First, I'll talk bout how we can bring the power of secure computation to difficult settings: TimeCrypt is an encrypted time-series database design that meets the scalability and low-latency requirements associated with time-series workloads. Then, I'll discuss work on using end-to-end privacy as a strong foundation for data protection: Zeph is a new end-to-end privacy system that provides the means to extract value from encrypted streaming data safely while ensuring data confidentiality and privacy by serving only privacy-compliant views of the data. ...
This talk outlines my work building systems that enable applications to securely interact with users' data while preserving individuals' privacy. First, I'll talk bout how we can bring the power of secure computation to difficult settings: TimeCrypt is an encrypted time-series database design that meets the scalability and low-latency requirements associated with time-series workloads. Then, I'll discuss work on using end-to-end privacy as a strong foundation for data protection: Zeph is a new end-to-end privacy system that provides the means to extract value from encrypted streaming data safely while ensuring data confidentiality and privacy by serving only privacy-compliant views of the data. Throughout the talk, I will discuss the prevalent challenges of efficiency, functionality, and accessibility in this research area; my approach to addressing these challenges; and future directions that will help bring end-to-end privacy to an even wider range of applications.

Please contact MPI-SWS Office Team for link information
Read more

Systems Designs for End-to-End Privacy

Anwar Hithnawi ETH Zurich
03 Mar 2022, 10:00 am - 11:00 am
Virtual talk
CIS@MPG Colloquium
The potential of data to transform science and society has spurred unparalleled efforts to collect it in increasingly sensitive and granular forms. This accumulation of sensitive data did not materialize without issues and has raised severe societal concerns. These concerns appear amply justified by numerous reports of data breaches and misuse. Today, we are at an inflection point: if we want to continue enjoying the benefits of data-driven services, we need to place privacy at the center of our data ecosystems. ...
The potential of data to transform science and society has spurred unparalleled efforts to collect it in increasingly sensitive and granular forms. This accumulation of sensitive data did not materialize without issues and has raised severe societal concerns. These concerns appear amply justified by numerous reports of data breaches and misuse. Today, we are at an inflection point: if we want to continue enjoying the benefits of data-driven services, we need to place privacy at the center of our data ecosystems.

This talk outlines my work building systems that enable applications to securely interact with users' data while preserving individuals' privacy. First, I'll talk bout how we can bring the power of secure computation to difficult settings: TimeCrypt is an encrypted time-series database design that meets the scalability and low-latency requirements associated with time-series workloads. Then, I'll discuss work on using end-to-end privacy as a strong foundation for data protection: Zeph is a new end-to-end privacy system that provides the means to extract value from encrypted streaming data safely while ensuring data confidentiality and privacy by serving only privacy-compliant views of the data. Throughout the talk, I will discuss the prevalent challenges of efficiency, functionality, and accessibility in this research area; my approach to addressing these challenges; and future directions that will help bring end-to-end privacy to an even wider range of applications.

---

Please contact the MPI-SWS Office Team for the ZOOM link information.
Read more

Empowering People to Have Secure and Private Interactions with Digital Technologies

Pardis Emami-Naeini University of Washington
02 Mar 2022, 5:00 pm - 6:00 pm
Virtual talk
CIS@MPG Colloquium
Digital technologies are evolving with advanced capabilities. To function, these technologies rely on collecting and processing various types of sensitive data from their users. These data practices could expose users to a wide array of security and privacy risks. My research at the intersection of security, privacy, and human-computer interaction aims to help all people have safer interactions with digital technologies. In this talk, I will share results on people’s security and privacy preferences and attitudes toward technologies such as smart devices and remote communication tools. ...
Digital technologies are evolving with advanced capabilities. To function, these technologies rely on collecting and processing various types of sensitive data from their users. These data practices could expose users to a wide array of security and privacy risks. My research at the intersection of security, privacy, and human-computer interaction aims to help all people have safer interactions with digital technologies. In this talk, I will share results on people’s security and privacy preferences and attitudes toward technologies such as smart devices and remote communication tools. I will then describe a security and privacy transparency tool that I designed and evaluated to address consumers’ needs when purchasing and interacting with smart devices. I will end my talk by discussing emerging and future directions for my research to design equitable security and privacy tools and policies by studying and designing for the needs of diverse populations.

Please contact MPI-SWS Office Team for link information
Read more

Improving People’s Adoption of Security and Privacy Behaviors

Yixin Zou University of Michigan
01 Mar 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Experts recommend a plethora of advice for staying safe online, yet people still use weak passwords, fall for scams, or ignore software updates. Such inconsistent adoption of protective behaviors is understandable given the need to navigate other priorities and constraints in everyday life. Yet when the actions taken are insufficient to mitigate potential risks, it leaves people – especially those already marginalized – vulnerable to dire consequences from financial loss to abuse and harassment.

In this talk, ...
Experts recommend a plethora of advice for staying safe online, yet people still use weak passwords, fall for scams, or ignore software updates. Such inconsistent adoption of protective behaviors is understandable given the need to navigate other priorities and constraints in everyday life. Yet when the actions taken are insufficient to mitigate potential risks, it leaves people – especially those already marginalized – vulnerable to dire consequences from financial loss to abuse and harassment.

In this talk, I share findings from my research on hurdles that prevent people from adopting secure behaviors and solutions that encourage adoption in three domains: designing data breach notifications, informing privacy interface guidelines in regulations, and supporting survivors of tech-enabled abuse. (1) Even small changes in system design can make a big difference. I empirically show consumers’ low awareness of data breaches, rational justifications and biases behind inaction, and how to motivate consumers to change breached passwords through nudges in breach notifications. (2) Public policy is essential in incentivizing companies to implement better data practices, but policymaking needs to be informed by evidence from research. I present a series of user studies that led to a user-tested icon for conveying the "do not sell my personal information" opt-out, now part of the California Consumer Privacy Act (CCPA). (3) Different user groups have different threat models and safety needs, requiring special considerations in developing and deploying interventions. Drawing on findings from focus groups, I discuss how computer security support agents can help survivors of tech-enabled abuse using a trauma-informed approach. Altogether, I highlight the impact of my research on technology design, public policy, and educational efforts. I end the talk by discussing how my interdisciplinary, human-centered approach in solving security and privacy challenges can apply to future work such as improving expert advice and developing trauma-informed computing systems.

Please contact MPI-SWS Office Team for link information
Read more

Specification-Guided Policy-Synthesis

Suguman Bansal University of Pennsylvania
28 Feb 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Policy synthesis or algorithms to design policies for computational systems is one of the fundamental problems in computer science. Standing on the shoulders of simplified yet concise task-specification using high-level logical specification languages, this talk will cover synthesis algorithms using two contrasting approaches. First, the classical logic-based approach of reactive synthesis; Second, the modern learning-based approach of reinforcement learning. This talk will cover our scalable and efficient state-of-the-art algorithms for synthesis from high-level specifications using both these approaches, ...
Policy synthesis or algorithms to design policies for computational systems is one of the fundamental problems in computer science. Standing on the shoulders of simplified yet concise task-specification using high-level logical specification languages, this talk will cover synthesis algorithms using two contrasting approaches. First, the classical logic-based approach of reactive synthesis; Second, the modern learning-based approach of reinforcement learning. This talk will cover our scalable and efficient state-of-the-art algorithms for synthesis from high-level specifications using both these approaches, and investigate whether formal guarantees are possible. We will conclude with a forward-looking view of these contributions to trustworthy AI.

Please contact MPI-SWS Office Team for link information
Read more

Grounding Language by Seeing, Hearing, and Interacting

Rowan Zellers University of Washington
23 Feb 2022, 4:00 pm - 5:00 pm
Virtual talk
CIS@MPG Colloquium
As humans, our understanding of language is grounded in a rich mental model about "how the world works" – that we learn through perception and interaction. We use this understanding to reason beyond what is literally said, imagining how situations might unfold in the world. Machines today struggle at making such connections, which limits how they can be safely used. In my talk, I will discuss three lines of work to bridge this gap between machines and humans. ...
As humans, our understanding of language is grounded in a rich mental model about "how the world works" – that we learn through perception and interaction. We use this understanding to reason beyond what is literally said, imagining how situations might unfold in the world. Machines today struggle at making such connections, which limits how they can be safely used. In my talk, I will discuss three lines of work to bridge this gap between machines and humans. I will first discuss how we might measure grounded understanding. I will introduce a suite of approaches for constructing benchmarks, using machines in the loop to filter out spurious biases. Next, I will introduce PIGLeT: a model that learns physical commonsense understanding by interacting with the world through simulation, using this knowledge to ground language. PIGLeT learns linguistic form and meaning – together – and outperforms text-to-text only models that are orders of magnitude larger. Finally, I will introduce MERLOT, which learns about situations in the world by watching millions of YouTube videos with transcribed speech. The model learns to jointly represent video, audio, and language, together and over time – learning multimodal and neural script knowledge representations. Together, these directions suggest a path forward for building machines that learn language rooted in the world.

Please contact MPI-SWS Office Team for link information
Read more

Measurement and Experimentation in Complex Sociopolitical Processes

Aaron Schein Columbia University
22 Feb 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Complex social and political processes at many scales—from interpersonal networks of friends to international networks of countries—are a central theme of computational social science. Modern methods of data science that can contend with the complexity of data from such processes have the potential to break ground on long-standing questions of critical relevance to public policy. In this talk, I will present two lines of work on 1) estimating the causal effects of friend-to-friend mobilization in US elections, ...
Complex social and political processes at many scales—from interpersonal networks of friends to international networks of countries—are a central theme of computational social science. Modern methods of data science that can contend with the complexity of data from such processes have the potential to break ground on long-standing questions of critical relevance to public policy. In this talk, I will present two lines of work on 1) estimating the causal effects of friend-to-friend mobilization in US elections, and 2) inferring complex latent structure in dyadic event data of country-to-country interactions. In the first part, I will discuss recent work using large-scale digital field experiments on the mobile app Outvote to estimate the causal effects of friend-to-friend texting on voter turnout in the 2018 and 2020 US elections. This work is among the first to rigorously assess the effectiveness of friend-to-friend "get out the vote" tactics, which political campaigns have increasingly embraced in recent elections. I will discuss the statistical challenges inherent to randomizing interactions between friends with a "light touch" design and will describe the methodology we developed to identify and precisely estimate causal effects. In the second part of this talk, I will discuss hierarchical Bayesian modeling of dyadic event data sets in international relations which contain millions of micro-records of the form "country i took action a to country j at time t". The models I will discuss blend elements of tensor decomposition and dynamical systems to capture complex temporal and network dependence structure in the data. Approximate posterior inference relies on new auxiliary variable augmentation schemes and theorems about the properties and relationships between different discrete distributions. At the end of the talk, I will outline the future of both lines of work, as well as their intersection, and sketch a broader vision for how data science can serve computational social science and vice versa.

Please contact MPI-SWS Office Team for link information
Read more

Language theory into practice, a play in three acts

Ningning Xie University of Cambridge
21 Feb 2022, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Colloquium
Computer development has come a long way. Along with the evolution of computers, advances in high-level programming languages allow us to write large-scale software systems easily. While new language features significantly extend the language expressive power, they often lack theoretical development and lead to subtle implementation bugs. Moreover, while high-level languages abstract over low-level aspects and thus eliminate many sources of errors, the abstraction often comes with a runtime penalty that results in inefficient low-level code. ...
Computer development has come a long way. Along with the evolution of computers, advances in high-level programming languages allow us to write large-scale software systems easily. While new language features significantly extend the language expressive power, they often lack theoretical development and lead to subtle implementation bugs. Moreover, while high-level languages abstract over low-level aspects and thus eliminate many sources of errors, the abstraction often comes with a runtime penalty that results in inefficient low-level code. In this talk, I will show how to apply programming language theory to practical programming to offer strong static safety and efficiency guarantees in three domains: language design, runtime systems, and machine learning systems. First, I will demonstrate a type-theoretical formalization of language features, focusing on type inference for dependent types in algebraic datatype declarations. The formalization has guided real-world language implementations. Then, I will show that programming language theory reaps benefits beyond safety. I will present Perceus, a garbage-free reference counting algorithm with reuse, that supports high-level programming while preserving low-level efficiency. Perceus delivers competitive performance compared to state-of-the-art memory reclamation implementations. Finally, as part of a vision to make programming languages broadly applicable, I will discuss my efforts to apply programming language techniques to machine learning systems, by presenting a program synthesis framework that accelerates large-scale distributed machine learning on hardware platforms.

Please contact MPI-SWS Office Team for link information
Read more

Improving Robustness in Machine Learning Models

Yao Qin Google Research
18 Feb 2022, 4:00 pm - 5:00 pm
Virtual talk
CIS@MPG Colloquium
There are many robustness issues arising in a variety of forms while deploying ML systems in the real world. For example, neural networks suffer from distributional shift — a model is tested on a data distribution different from what it was trained on. In addition, neural networks are vulnerable to adversarial examples – small perturbations to the input can successfully fool classifiers into making incorrect predictions. In this talk, I will introduce how to improve robustness of machine learning models by building connections between different perspectives of robustness issues and bridging gaps between a wide range of modalities. ...
There are many robustness issues arising in a variety of forms while deploying ML systems in the real world. For example, neural networks suffer from distributional shift — a model is tested on a data distribution different from what it was trained on. In addition, neural networks are vulnerable to adversarial examples – small perturbations to the input can successfully fool classifiers into making incorrect predictions. In this talk, I will introduce how to improve robustness of machine learning models by building connections between different perspectives of robustness issues and bridging gaps between a wide range of modalities. As a result, seemingly different robustness issues can be tackled by closely-related approaches, and robust ML on multiple modalities and backbone architectures can converge to a common ground.

Please contact MPI-SWS Office Team for link information
Read more

Characterizing and Mitigating Threats to Trust and Safety Online

Yiqing Hua Cornell University
15 Feb 2022, 10:00 am - 11:00 am
Virtual talk
CIS@MPG Colloquium
Supporting a safe and trustworthy online environment is challenging, as these environments are constantly threatened by abusive behaviors that cause real human harm. Among the numerous threats, online harassment suppresses voices, and misleading information and propaganda undermine public trust. Existing methods to combat these threats are often not sufficient, as adversaries may abuse and exploit technologies in nuanced ways, and mitigation strategies don't always reflect users' needs. In this talk, I will present my work on characterizing threats, ...
Supporting a safe and trustworthy online environment is challenging, as these environments are constantly threatened by abusive behaviors that cause real human harm. Among the numerous threats, online harassment suppresses voices, and misleading information and propaganda undermine public trust. Existing methods to combat these threats are often not sufficient, as adversaries may abuse and exploit technologies in nuanced ways, and mitigation strategies don't always reflect users' needs. In this talk, I will present my work on characterizing threats, and empowering users with new techniques to combat these threats. First, I will discuss the challenges faced in detecting subtle harassment on social media, and my approach to context-specific analysis using the United States 2018 general election as a case study. Second, I will demonstrate the importance of characterizing user participation in adversarial activities, to inform better moderation mechanism design. Lastly, I will introduce my work on developing privacy-preserving abuse mitigation techniques, to allow user-side warnings of misinformation images in end-to-end encrypted environments.

Please contact MPI-SWS Office Team for link information
Read more

How do neurons learn?

Hadi Daneshmand Princeton University
11 Feb 2022, 12:00 pm - 1:00 pm
Virtual talk
CIS@MPG Colloquium
Representation learning with neural networks automates feature extraction with less need for continental feature engineering; thereby achieving incredible performance in image, text, and strategy processing. However, the underlying mechanism of representation learning is not well understood. This limits applications of representation learning in critical tasks, such as cancer diagnoses and other medical decisions. In this talk, we propose a research plan for studying representation learning with three core research focuses: Random neural networks. By studying random neural networks, ...
Representation learning with neural networks automates feature extraction with less need for continental feature engineering; thereby achieving incredible performance in image, text, and strategy processing. However, the underlying mechanism of representation learning is not well understood. This limits applications of representation learning in critical tasks, such as cancer diagnoses and other medical decisions. In this talk, we propose a research plan for studying representation learning with three core research focuses: Random neural networks. By studying random neural networks, we shed light on the inner workings of the incredible performance of modern neural networks. We demonstrate how the study of random networks allows us to go beyond the conventional trial and error development of neural networks. Local optimality. Given a neural network, is it possible to improve its performance only by slight modifications of the network parameters? This is the focus of local optimization for representation learning. Our research highlights that local optimization requires more studies in modern representation learning with generative adversarial networks. Modeling. A mathematical study of learning dynamics is very challenging. Modeling facilitates the study of learning dynamics by omitting technical details of learning. For example, a continuous-time dynamical system may model an iterative learning method, bridging the gap between dynamical systems and representation learning.

Please contact MPI-SWS Office Team for link information
Read more

Toward Reliable Machine Learning with Kernels

Krikamol Muandet MPI-IS
11 Feb 2022, 9:30 am - 10:30 am
Virtual talk
CIS@MPG Colloquium
Society is made up of a set of diverse individuals, demographic groups, and institutions. Therefore, learning and deploying algorithmic models across heterogeneous environments face a set of various trade-offs. In order to develop reliable machine learning algorithms that can interact successfully with the real world, it is necessary to deal with changes in underlying data-generating distributions. This talk will be about the kernel mean embedding (KME), a nonparametric kernel-based framework to represent probability distributions and model changes thereof. ...
Society is made up of a set of diverse individuals, demographic groups, and institutions. Therefore, learning and deploying algorithmic models across heterogeneous environments face a set of various trade-offs. In order to develop reliable machine learning algorithms that can interact successfully with the real world, it is necessary to deal with changes in underlying data-generating distributions. This talk will be about the kernel mean embedding (KME), a nonparametric kernel-based framework to represent probability distributions and model changes thereof. In particular, I will focus on how this framework can help improve the credibility of algorithmic decision-making by enabling us to reason about higher-order causal effects of policy interventions as well as by removing the effect of unobserved confounders through the use of an instrumental variable (IV). Lastly, I will argue that a better understanding of the ways in which our data are generated and how our models can influence them will be crucial for reliable machine learning systems, especially when gaining full information about data may not be possible.

Please contact MPI-SWS Office Team for link information
Read more

Strengthening and Enriching Machine Learning for Cybersecurity

Wenbo Guo Penn State University
10 Feb 2022, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Colloquium
Nowadays, security researchers are increasingly using AI to automate and facilitate security analysis. Although making some meaningful progress, AI has not maximized its capability in security yet, mainly due to two challenges. First, existing ML techniques have not reached security professionals' requirements in critical properties, such as interpretability and adversary-resistancy. Second, Security data imposes many new technical challenges, and these challenges break the assumptions of existing ML models and thus jeopardize their efficacy. In this talk, ...
Nowadays, security researchers are increasingly using AI to automate and facilitate security analysis. Although making some meaningful progress, AI has not maximized its capability in security yet, mainly due to two challenges. First, existing ML techniques have not reached security professionals' requirements in critical properties, such as interpretability and adversary-resistancy. Second, Security data imposes many new technical challenges, and these challenges break the assumptions of existing ML models and thus jeopardize their efficacy. In this talk, I will describe my research efforts to address the above challenges, with a primary focus on strengthening the interpretability of ML-based security systems and enriching ML to detect concept drifts in security data. Regarding interpretability, I will describe our explanation methods for deep learning-based and deep reinforcement learning-based security systems and demonstrate how security analysts could benefit from these methods to establish trust in blackbox models and patching model vulnerabilities. As for concept drifts, I will introduce a novel ML system to detect and explain drifting samples and demonstrate its application in a real-world malware database. Finally, I will conclude by highlighting my future plan towards maximizing the capability of advanced ML in cybersecurity.

Please contact MPI-SWS Office Team for link information
Read more

Robust and Accountable Multi-Agent Sequential Decision Making

Goran Radanovic Max Planck Institute for Software Systems
02 Feb 2022, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
The recent success stories of reinforcement learning in game playing have demonstrated the utility of the reinforcement learning framework in deriving scalable solutions to multi-agent sequential decision making problems. However, applying these solutions beyond simulated environments requires additional building blocks that would enable trustworthy decision making. In this talk, I will present some of our recent results that relate to robustness and accountability---two properties that any decision making system ought to satisfy in order to be deemed trustworthy. ...
The recent success stories of reinforcement learning in game playing have demonstrated the utility of the reinforcement learning framework in deriving scalable solutions to multi-agent sequential decision making problems. However, applying these solutions beyond simulated environments requires additional building blocks that would enable trustworthy decision making. In this talk, I will present some of our recent results that relate to robustness and accountability---two properties that any decision making system ought to satisfy in order to be deemed trustworthy. These results showcase some of the challenges in designing agents and support systems for robust and accountable multi-agent sequential decision making.
Read more

Software for Fast Storage Hardware

Willy Zwaenepoel University of Sydney
20 Jan 2022, 10:00 am - 11:00 am
Virtual talk
SWS Distinguished Lecture Series
Storage technologies are entering the market with performance vastly superior to conventional storage devices. This technology shift requires a complete rethinking of the software storage stack.

In this talk I will give two examples of our work with Optane-based solid-state (block) devices that illustrate the need for and the benefit of a wholesale redesign.

First, I will describe the Kvell key-value (KV) store. The key observation underlying Kvell is that conventional KV software on fast devices is bottlenecked by the CPU rather than by the device. ...
Storage technologies are entering the market with performance vastly superior to conventional storage devices. This technology shift requires a complete rethinking of the software storage stack.

In this talk I will give two examples of our work with Optane-based solid-state (block) devices that illustrate the need for and the benefit of a wholesale redesign.

First, I will describe the Kvell key-value (KV) store. The key observation underlying Kvell is that conventional KV software on fast devices is bottlenecked by the CPU rather than by the device. Kvell therefore focuses on minimizing CPU intervention.

Second, I will describe the Kvell+ OLTP/OLAP system built on top of Kvell. The key underlying observation here is that these storage devices have become so fast that the conventional implementation of snapshot isolation – maintaining multiple versions – quickly leads to the device filling up. Kvell therefore focuses processes new versions as they are created.

This talk describes joint work with Oana Balmau (McGill University), Karan Gupta (Nutanix) and Baptiste Lepers (University of Sydney).

---

Please contact the MPI-SWS Office Team for the ZOOM link information. .
Read more

Human Factors in Machine-Assisted Decision-Making

Nina Grgić-Hlača Max Planck Institute for Software Systems
18 Jan 2022, 3:00 pm - 4:00 pm
Saarbrücken building E1 4, room 029
SWS Student Defense Talks - Thesis Proposal
Machine learning (ML) based algorithms assist human decision-making in a variety of scenarios, ranging from medical diagnostics to bail decision-making. The potential societal impact of using machine decision aids in real-world settings sparked concerns about their accuracy and fairness, and inspired a flurry of research on algorithmic fairness, accountability and transparency. However, in many settings, algorithms do not make decisions, but only assist human decision-makers. In this thesis, we go beyond studying the fairness and accuracy of decision aids, ...
Machine learning (ML) based algorithms assist human decision-making in a variety of scenarios, ranging from medical diagnostics to bail decision-making. The potential societal impact of using machine decision aids in real-world settings sparked concerns about their accuracy and fairness, and inspired a flurry of research on algorithmic fairness, accountability and transparency. However, in many settings, algorithms do not make decisions, but only assist human decision-makers. In this thesis, we go beyond studying the fairness and accuracy of decision aids, and study machine-assisted human decision-making as a whole. Specifically, we study how people perceive and utilize machine decision aids.

Please contact the MPI-SWS Office Team for the ZOOM link information.
Read more