Recent events

Humans and Machines: From Data Elicitation to Helper-AI

Goran Radanovic
Harvard University
SWS Colloquium
16 May 2019, 2:00 pm - 3:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
Recent AI advances have been driven by high-quality input data, often labeled by human annotators. A fundamental challenge in eliciting high-quality information from humans is that there is often no way to directly verify the quality of the information they provide. Consider, for example, product reviews and marketing surveys where data is inherently subjective, environmental community sensing where data is highly localized, and geopolitical forecasting where the ground truth is revealed in the distant future. In these settings, data elicitation has to rely on peer-consistency mechanisms, which incentivize high-quality reporting by examining the consistency between the reports of different data providers. In this talk, I will discuss some of the recent advances in peer-consistency designs. Furthermore, I will outline some thoughts on an agenda around the design of human-AI collaborative systems.

Programming Abstractions for Verifiable Software

Damien Zufferey
Max Planck Institute for Software Systems
Joint Lecture Series
15 May 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
In this talk, I will show how we can harness the synergy between programming languages and verification methods to help programmers build reliable software. First, we will look at fault-tolerant distributed algorithms. These algorithms are central to any high-availability application but they are notoriously difficult to implement due to asynchronous communication and faults. A fault- tolerant consensus algorithms which can be described in ~50 lines of pseudo code can easily turns into a few thousand lines of actual code. To remediate this, I will introduce PSync a domain specific language for fault-tolerant distributed algorithms. The key insight is the use of communication-closure (logical boundaries in a program that messages should not cross) to structure the code. Communication-closure gives a syntactic scope to the communication, provides some form of logical time, and give the illusion of synchrony. These element greatly simplify the programming and verification of fault-tolerant algorithms. In the second part of the talk, we will discuss a new project exploring how advances in rapid prototyping (3D printers) may impact how we develop software for robots. These advances may soon be enable adding computational elements as part of the internal structure of objects. The goal of this project is to rethink the software/hardware boundary and integrate the two together. I will present a system we are developing where components integrate for geometry (hardware) and behavior (software). The system allows from bottom-up composition and top-down decomposition. The bottom-up composition connects components together to achieve more complex behaviors. The top-down decomposition project a global specification on the individual components and performs verification at the level of individual components.

Conclave: Secure Multi-Party Computation on Big Data

Nikolaj Volgushev
Boston University
SWS Colloquium
15 May 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
Secure Multi-Party Computation (MPC) allows mutually distrusting parties to run joint computations without revealing private data. Current MPC algorithms scale poorly with data size, which makes MPC on "big data" prohibitively slow and inhibits its practical use. Many relational analytics queries can maintain MPC’s end-to-end security guarantee without using cryptographic MPC techniques for all operations. Conclave is a query compiler that accelerates such queries by transforming them into a combination of data-parallel, local cleartext processing and small MPC steps. When parties trust others with specific subsets of the data, Conclave applies new hybrid MPC-cleartext protocols to run additional steps outside of MPC and improve scalability further. Our Conclave prototype generates code for cleartext processing in Python and Spark, and for secure MPC using the Sharemind and Obliv-C frameworks. Conclave scales to data sets between three and six orders of magnitude larger than state-of-the-art MPC frameworks support on their own. Thanks to its hybrid protocols and additional optimizations, Conclave also substantially outperforms SMCQL, the most similar existing system.

A constructive proof of dependent choice in classical arithmetic via memoization

Étienne Miquey
Inria, Nantes
SWS Colloquium
09 May 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
In 2012, Herbelin developed a calculus (dPAω) in which constructive proofs for the axioms of countable and dependent choices can be derived via the memoization of choice functions However, the property of normalization (and therefore the one of soundness) was only conjectured. The difficulty for the proof of normalization is due to the simultaneous presence of dependent dependent types (for the constructive part of the choice), of control operators (for classical logic), of coinductive objects (to encode functions of type ℕ→A into streams (a₀,a₁,…)) and of lazy evaluation with sharing (for these coinductive objects). Building on previous works, we introduce a variant of dPAω presented as a sequent calculus. On the one hand, we take advantage of a variant of Krivine classical realizability we developed to prove the normalization of classical call-by-need. On the other hand, we benefit of dLtp, a classical sequent calculus with dependent types in which type safety is ensured using delimited continuations together with a syntactic restriction. By combining the techniques developed in these papers, we manage to define a realizability interpretation à la Krivine of our calculus that allows us to prove normalization and soundness. This talk will go over the whole process, starting from Herbelin’s calculus dPAω until the introduction of its sequent calculus counterpart dLPAω that we prove to be sound.

Combinatorial Constructions for Effective Testing

Filip Nikšic
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
03 May 2019, 3:30 pm - 4:45 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
Large-scale distributed systems consist of a number of components, take a number of parameter values as input, and behave differently based on a number of non-deterministic events. All these features—components, parameter values, and events—interact in com- plicated ways, and unanticipated interactions may lead to bugs. Empirically, many bugs in these systems are caused by interactions of only a small number of features. In certain cases, it may be possible to test all interactions of k features for a small constant k by executing a family of tests that is exponentially or even doubly-exponentially smaller than the family of all tests. Thus, in such cases we can effectively uncover all bugs that require up to k-wise interactions of features.

In this thesis we study two occurrences of this phenomenon. First, many bugs in distributed systems are caused by network partition faults. In most cases these bugs occur due to two or three key nodes, such as leaders or replicas, not being able to communicate, or because the leading node finds itself in a block of the partition without quorum. Second, bugs may occur due to unexpected schedules (interleavings) of concurrent events— concurrent exchange of messages and concurrent access to shared resources. Again, many bugs depend only on the relative ordering of a small number of events. We call the smallest number of events whose ordering causes a bug the depth of the bug. We show that in both testing scenarios we can effectively uncover bugs involving small number of nodes or bugs of small depth by executing small families of tests.

We phrase both testing scenarios in terms of an abstract framework of tests, testing goals, and goal coverage. Sets of tests that cover all testing goals are called covering families. We give a general construction that shows that whenever a random test covers a fixed goal with sufficiently high probability, a small randomly chosen set of tests is a covering family with high probability. We then introduce concrete coverage notions relating to network partition faults and bugs of small depth. In case of network partition faults, we show that for the introduced coverage notions we can find a lower bound on the probability that a random test covers a given goal. Our general construction then yields a randomized testing procedure that achieves full coverage—and hence, find bugs— quickly.

In case of coverage notions related to bugs of small depth, if the events in the program form a non-trivial partial order, our general construction may give a suboptimal bound. Thus, we study other ways of constructing convering families. We show that if the events in a concurrent program are partially ordered as a tree, we can explicitly construct a covering family of small size: for balanced trees, our construction is polylogarithmic in the number of events. For the case when the partial order of events does not have a "nice" structure, and the events and their relation to previous events are revealed while the program is running, we give an online construction of covering families. Based on the construction, we develop a randomized scheduler called PCTCP that uniformly samples schedules from a covering family and has a rigorous guarantee of finding bugs of small depth. We experiment with an implementation of PCTCP on two real-world distributed systems—Zookeeper and Cassandra—and show that it can effectively find bugs.

Sharing-Aware Resource Management for Performance and Protection

Sandhya Dwarkadas
Department of Computer Science, University of Rochester
SWS Distinguished Lecture Series
02 May 2019, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
Recognizing that applications (whether in mobile, desktop, or server environments) are rarely executed in isolation today, I will discuss some practical challenges in making best use of available hardware and our approach to addressing these challenges. I will describe two independent and complementary control mechanisms using low-overhead hardware performance counters that we have developed: a sharing- and resource-aware mapper (SAM) to effect task placement with the goal of localizing shared data communication and minimizing resource contention based on the offered load; and an application parallelism manager (MAPPER) that controls the offered load with the goal of improving system parallel efficiency. If time permits, I will also outline our work on streamlining instruction memory management and address translation to eliminate redundancy and improve efficiency, especially in mobile environments. Our results emphasize the need for low-overhead monitoring of application behavior under changing environmental conditions in order to adapt to environment and application behavior changes.

Edge Computing in the Extreme and its Applications

Suman Banerjee
University of Wisconsin-Madison
SWS Colloquium
30 Apr 2019, 1:00 pm - 2:30 pm
Saarbrücken building E1 5, room 105
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
The notion of edge computing introduces new computing functions away from centralized locations and closer to the network edge and thus facilitating new applications and services. This enhanced computing paradigm is provides new opportunities to applications developers, not available otherwise. In this talk, I will discuss why placing computation functions at the extreme edge of our network infrastructure, i.e., in wireless Access Points and home set-top boxes, is particularly beneficial for a large class of emerging applications. I will discuss a specific approach, called ParaDrop, to implement such edge computing functionalities, and use examples from different domains -- smarter homes, sustainability, and intelligent transportation -- to illustrate the new opportunities around this concept.

Querying Regular Languages over Sliding Windows

Moses Ganardi
University of Siegen
SWS Colloquium
29 Apr 2019, 2:30 pm - 3:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
A sliding window algorithm for a language L receives a stream of symbols and has to decide at each time step whether the suffix of length n belongs to L or not. The window size n is either a fixed number (in the fixed-size model) or can be controlled online by an adversary (in the variable-size model). In this talk we give a survey on recent results for deterministic and randomized sliding window algorithms for regular languages.

The complexity of reachability in vector addition systems

Sylvain Schmitz
École Normale Supérieure Paris-Saclay
SWS Colloquium
05 Apr 2019, 2:00 pm - 3:00 pm
Saarbrücken building G26, room 111
simultaneous videocast to Kaiserslautern building E1 5, room 029 / Meeting ID: 6312
The last few years have seen a surge of decision problems with an astronomical, non primitive-recursive complexity, in logic, verification, and games. While the existence of such uncontrived problems is known since the early 1980s, the field has matured with techniques for proving complexity lower and upper bounds and the definition of suitable complexity classes. This framework has been especially successful for analysing so-called `well-structured systems'---i.e., transition systems endowed with a well-quasi-order, which underly most of these astronomical problems---, but it turns out to be applicable to other decision problems that resisted analysis, including two famous problems: reachability in vector addition systems and bisimilarity of pushdown automata. In this talk, I will explain how some of these techniques apply to reachability in vector addition systems, yielding tight Ackermannian upper bounds for the decomposition algorithm initially invented by Mayr, Kosaraju, and Lambert; this will be based on joint work with Jérôme Leroux.

Worst-Case Execution Time Guarantees for Runtime-Reconfigurable Architectures

Marvin Damschen
Karlsruhe Institute of Technology
SWS Colloquium
04 Apr 2019, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
Real-time embedded systems need to be analyzable for execution time guarantees. Despite significant scientific advances, however, timing analysis lags years behind current microarchitectures with out-of-order scheduling pipelines, several hardware threads and multiple (shared)cache layers. To satisfy the increasing demand for predictable performance, analyzable performance features are required. We introduce runtime-reconfigurable instruction set processors as one way to escape the scarcity of analyzable performance features while preserving the flexibility of the system. To this end, we first present a reconfiguration controller for guaranteed reconfiguration delays of accelerators onto an FPGA. We propose a novel timing analysis approach to obtain worst-case execution time (WCET) guarantees for applications that utilize runtime-reconfigurable custom instructions (CIs), which each utilize one or more accelerators. Given the constrained reconfigurable area of an FPGA, we solve the problem of selecting CIs for each computational kernel of an application to optimize its worst-case execution time. Finally, we show that runtime reconfiguration provides the unique feature of optimized static WCET guarantees and optimization of the average-case execution time (maintaining statically-given WCET guarantees) by repurposing reconfigurable area for different selections of CIs at runtime.

The Server-to-Server Landscape: Insights, Opportunities, and Challenges

Balakrishnan Chandrasekaran
MPI-INF - D3
Joint Lecture Series
03 Apr 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
A significant fraction of the Internet traffic today is delivered via content delivery networks (CDNs). Typically, the CDN hauls data from content providers to its back-end servers, moves this data (via an overlay) to its front-end servers, and serves the data from there to the end users. This server-to-server (i.e., back-end to front-end) landscape provides a few unique perspectives as well as opportunities for solving some long-standing networking problems. In this talk, I will briefly discuss these perspectives and opportunities, and present one key measurement effort that leverages this landscape—quantifying the effect of path choices in the Internet’s core on end-user’s experiences.

Analyzing Sample Correlations for Monte Carlo Rendering

Gurprit Singh
MPI-INF - D4
Joint Lecture Series
13 Mar 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Point patterns and stochastic structures lie at the heart of Monte Carlo based numerical integration schemes. Physically based rendering algorithms have largely benefited from these Monte Carlo based schemes that inherently solve very high dimensional light transport integrals. However, due to the underlying stochastic nature of the samples, the resultant images are corrupted with noise (unstructured aliasing or variance). This also results in bad convergence rates that prohibit using these techniques in more interactive environments (e.g. games, virtual reality). With the advent of smart rendering techniques and powerful computing units (CPUs/GPUs), it is now possible to perform physically based rendering at interactive rates. However, much is left to understand regarding the underlying sampling structures and patterns which are the primary cause of error in rendering.  In this talk, we first revisit the most recent state-of-the-art frameworks that are developed to better understand the impact of samples’ structure on the error and its convergence during Monte Carlo integration. Towards the end, we briefly present our deep learning based approach to generate these samples with correlations.

Feedback-Control for Self-Adaptive Predictable Computing

Martina Maggio
Lund University
SWS Colloquium
13 Mar 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Cloud computing gives the illusion of infinite computational capacity and allows for on-demand resource provisioning. As a result, over the last few years, the cloud computing model has experienced widespread industrial adoption and companies like Netflix offloaded their entire infrastructure to the cloud. However, with even the largest datacenter being of a finite size, cloud infrastructures have experienced overload due to overbooking or transient failures. In essence, this is an excellent opportunity for the design of control solutions, that tackle the problem of mitigating overload peaks, using feedback from the infrastructure. These solutions can then exploit control-theoretical principles and take advantage of the knowledge and the analysis capabilities of control tools to provide formal guarantees on the predictability of the infrastructure behavior. This talk introduces recent research advances on feedback control in the cloud computing domain, together with my research agenda for enhancing predictability and formal guarantees for cloud computing.

Automated Resource Management in Large-Scale Networked Systems

Sangeetha Abdu Jyothi
University of Illinois
SWS Colloquium
11 Mar 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Internet applications rely on large-scale networked environments such as the cloud for their backend support. In these multi-tenanted environments, various stakeholders have diverse goals. The objective of the infrastructure provider is to increase revenue by utilizing the resources efficiently. Applications, on the other hand, want to meet their performance requirements at minimal cost. However, estimating the exact amount of resources required to meet the application needs is a difficult task, even for expert users. Easy workarounds employed for tackling this problem, such as resource over-provisioning, negatively impact the goals of the provider, applications, or both. In this talk, I will discuss the design of application-aware self-optimizing systems through automated resource management that helps meet the varied goals of the provider and applications in large-scale networked environments. The key steps in closed-loop resource management include learning of application resource needs, efficient scheduling of resources, and adaptation to variations in real time. I will describe how I apply this high-level approach in two distinct environments using (a) Morpheus in enterprise clusters, and (b) Patronus in cellular provider networks with geo-distributed micro data centers. I will also touch upon my related work in application-specific context at the intersection of network scheduling and deep learning. I will conclude with my vision for self-optimizing systems including fully automated clouds and an elastic geo-distributed platform for thousands of micro data centers.

Predictable Execution of Real-Time Applications on Many-Core Platforms

Matthias Becker
KTH Royal Institute of Technology
SWS Colloquium
08 Mar 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
Nowadays, innovation in many industrial areas is software driven, where existing software functions become more complex and new software functions are constantly introduced. The rapid increase in functionality comes along with a steep increase in software complexity. To cope with this transition, current trends shift away from today’s distributed architectures towards integrated architectures. Here, previously distributed functionality is consolidated on fewer, more powerful, computers. Such a trend can for example be observed in the automotive or avionics domain. This can ease the integration process, reduce the hardware complexity, and ultimately save costs. One promising hardware platform for these powerful embedded computers is the many-core processor. A many-core processor hosts a vast number of compute cores, that are partitioned on clusters which are connected by a Network-on-Chip. However, ensuring that real-time requirements are satisfied in the presence of contention in shared resources, such as memories, remains an open issue. In addition, industrial applications are often subject to timing constraints on the data propagation through a chain of semantically related tasks. Such requirements pose challenges to the system designer as they are only able to verify them after the system synthesis (i.e. very late in the design process). In this talk, we present methods that transform timing constraints on the data propagation delay into precedence constraints between individual task instances. An execution framework for the cluster of the many-core is proposed that allows access to cluster external memory while it avoids contention on shared resources by design. Spatial and temporal isolation between different clusters is provided by a partitioning and configuration of the Network-on-Chip that further reduces the worst-case access times to external memory.

Mitigating data leaks in real world systems

Aastha Mehta
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
07 Mar 2019, 4:30 pm - 5:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Unintended data disclosures are a major concern for many online services, such as healthcare systems, government departments, and web services. Data may leak over explicit output channels of the systems, for instance due to accidental bugs and misconfigurations in the system. Data may also leak over various side channels, for instance, in a cloud environment where a tenant shares the Cloud provider’s infrastructure with other mutually distrusting tenants.

In this thesis, we address the problem of unintended data disclosures in web services due to both types of causes, i.e. explicit leaks and side channel leaks. Specifically, we propose a system to mitigate explicit leaks due to accidental bugs in database-backed services; and a system to mitigate network side channel leaks in the tenants of an infrastructure-as-a-service (IaaS) Cloud.

In this talk, I will first present a high level overview of the design, implementation, and evaluation of Qapla, which is a system to ensure policy compliance in database-backed services.

Then I will present our ongoing work on the design, implementation, and evaluation of Pacer, which is a system to mitigate network side channels in Cloud tenants. Pacer mitigates network side channels using traffic shaping. Pacer provides a generic abstraction of a traffic shaping tunnel, which encapsulates the tenant's network traffic, and shapes it to make it independent of the tenant's secrets. We present a prototype with Pacer's tunnel endpoints integrated in the Cloud hypervisor and the client OS. Our preliminary evaluation shows that Pacer can enforce traffic shaping securely, while incurring modest overheads on bandwidth, client latencies, and server throughput.

Privacy, Transparency and Trust in the User-Centric Internet

Oana Goga
Université Grenoble Alpes
SWS Colloquium
07 Mar 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
The rise of user-centric Internet systems such as Facebook or Twitter brought security and privacy threats that became out of control in recent years. To make such systems more dependable, my research focuses on three key aspects: (1) privacy: ensure users understand and can control the information that is disclosed about them; (2) transparency: ensure users understand how their data is being used and how it affects the services they receive; and (3) trust: ensure users can evaluate the trustworthiness of content consumed from these systems. 

In this talk, I will share my research efforts in understanding and tackling security and privacy threats in social media targeted advertising. Despite a number of recent controversies regarding privacy violations, lack of transparency, or vulnerability to discrimination or propaganda by dishonest actors; users still have little understanding of what data targeted advertising platforms have about them and why they are shown the ads they see. To address such concerns, Facebook recently introduced the "Why am I seeing this?" button that provides users with an explanation of why they were shown a particular ad. I first investigate the level of transparency provided by this mechanism by empirically measuring whether it satisfies a number of key properties and what are the consequences of the current design choices. To provide a better understanding of the Facebook advertising ecosystem, we developed a tool called AdAnalyst that collects the ads users receive and provides aggregate statistics. I will then share our findings from analyzing data from over 600 real-world AdAnalyst users; in particular on who is advertising on Facebook and how these advertisers are targeting users and customizing ads via the platform. 

Towards Literate Artificial Intelligence

Mrinmaya Sachan
Carnegie Mellon University
SWS Colloquium
05 Mar 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Over the past decade, the field of artificial intelligence (AI) has seen striking developments. Yet, today’s AI systems sorely lack the essence of human intelligence i.e.  our ability to (a) understand language and grasp its meaning, (b) assimilate common-sense background knowledge of the world, and (c) draw inferences and perform reasoning. Before we even begin to build AI systems that possess the aforementioned human abilities, we must ask an even more fundamental question: How would we even evaluate an AI system on the aforementioned abilities? In this talk, I will argue that we can evaluate AI systems in the same way as we evaluate our children - by giving them standardized tests. Standardized tests are administered to students to measure the knowledge and skills gained by them. Thus, it is natural to use these tests to measure the intelligence of our AI systems. Then, I will describe Parsing to Programs (P2P), a framework that combines ideas from semantic parsing and probabilistic programming for situated question answering. We used P2P to build systems that can solve pre-university level Euclidean geometry and Newtonian physics examinations. P2P achieves a performance at least as well as the average student on questions from textbooks, geometry questions from previous SAT exams, and mechanics questions from Advanced Placement (AP) exams. I will conclude by describing implications of this research and some ideas for future work.

A Client-centric Approach to Transactional Datastores

Natacha Crooks
University of Texas at Austin
SWS Colloquium
28 Feb 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Modern applications must collect and store massive amounts of data. Cloud storage offers these applications simplicity: the abstraction of a failure-free, perfectly scalable black-box. While appealing, offloading data to the cloud is not without challenges. Cloud storage systems often favour weaker levels of isolation and consistency. These weaker guarantees introduce behaviours that, without care, can break application logic. Offloading data to an untrusted third party like the cloud also raises questions of security and privacy.

This talk summarises my efforts to improve the performance, the semantics and the security of transactional cloud storage systems. It centers around a simple idea: defining consistency guarantees from the perspective of the applications that observe these guarantees, rather than from the perspective of the systems that implement them. I will discuss how this new perspective brings forth several benefits. First, it offers simpler and cleaner definitions of weak isolation and consistency guarantees. Second, it enables more scalable implementations of existing guarantees like causal consistency. Finally, I will discuss its applications to security: our client-centric perspective allows us to add obliviousness guarantees to transactional cloud storage systems.

New Abstractions for High-Performance Datacenter Applications

Malte Schwarzkopf
MIT CSAIL
SWS Colloquium
18 Feb 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Developing high-performance datacenter applications is complex and time-consuming today, as developers must understand and correctly implement subtle interactions between different backend systems. I describe a new approach that redesigns core datacenter systems around new abstractions: the right abstractions substantially reduce complexity while maintaining the same performance. This saves expensive developer time, uses datacenter servers more efficiently, and can enable new, previously impossible systems and applications. I illustrate the impact of such redesigns with Noria, which recasts web application backends-i.e., databases and caches-as a streaming dataflow computation based on a new abstraction of partial state. Noria's partially-stateful dataflow brings classic databases' familiar query flexibility to scalable dataflow systems, simplifying applications and improving the backend's efficiency. For example, Noria increases the request load handled by a single server by 5-70x compared to state-of-the-art backends. Additional new abstractions from my research increase the efficiency of other datacenter systems (e.g., cluster schedulers), or enable new kinds of systems that, for example, may help protect user data against exposure through application bugs.

Scalable positioning of commodity mobile devices using audio signals

Viktor Erdélyi
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
14 Feb 2019, 9:00 am - 10:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
This thesis explores the problem of computing a position map for co-located mobile devices. The positioning should happen in a scalable manner without requiring specialized hardware and without requiring specialized infrastructure (except basic Wi-Fi or cellular access). At events like meetings, talks, or conferences, a position map can aid spontaneous communication among users based on their relative position in two ways. First, it enables users to choose message recipients based on their relative position, which also enables the position-based distribution of documents. Second, it enables senders to attach their position to messages, which can facilitate interaction between speaker and audience in a lecture hall and enables the collection of feedback based on users’ location.

In this thesis, we present Sonoloc, a mobile app and system that, by relying on acoustic signals, allows a set of commodity smart devices to determine their relative positions. Sonoloc can position any number of devices within acoustic range with a constant number of acoustic signals emitted by a subset of devices. Our experimental evaluation with up to 115 devices in real rooms shows that – despite substantial background noise – the system can locate devices with an accuracy of tens of centimeters using no more than 15 acoustic signals.

Dynamic Symbolic Execution for Software Analysis

Cristian Cadar
Imperial College London
SWS Distinguished Lecture Series
07 Feb 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
Symbolic execution is a program analysis technique that can automatically explore and analyse paths through a program. While symbolic execution was initially introduced in the seventies, it has only received significant attention during the last decade, due to tremendous advances in constraint solving technology and effective blending of symbolic and concrete execution into what is often called dynamic symbolic execution. Dynamic symbolic execution is now a key ingredient in many computer science areas, such as software engineering, computer security, and software systems, to name just a few. In this talk, I will discuss recent advances and ongoing challenges in the area of dynamic symbolic execution, drawing upon our experience developing several symbolic execution tools for many different scenarios, such as high-coverage test input generation, bug and security vulnerability detection, patch testing and bounded verification, among many others.

Machine Teaching

Adish Singla
Max Planck Institute for Software Systems
Joint Lecture Series
06 Feb 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Much of the research in machine learning has focused on designing algorithms for discovering knowledge from data and learning a target model. What if there is a ``teacher" who knows the target already, and wants to steer the ``learner" towards this target as quickly as possible?  For instance, in an educational setting, the teacher (e.g., a tutoring system) has an educational goal that she wants to communicate to a student via a set of demonstrations and lessons; in adversarial attacks known as training-set poisoning, the teacher (e.g., a hacking algorithm) manipulates the behavior of a machine learning system by maliciously modifying the training data. This lecture will provide an overview of machine teaching and cover the following three aspects: (i) how machine teaching differs from machine learning, (ii) highlight the problem space of machine teaching across different dimensions, and (iii) discuss our recent work on developing teaching algorithms for human learners.

Techniques to Protect Confidentiality and Integrity of Persistant and In-Memory Data

Anjo Vahldiek-Oberwagner
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
05 Feb 2019, 5:30 pm - 6:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Today computers store and analyze valuable and sensitive data. As a result we need to protect this data against confidentiality and integrity violations that can result in the illicit release, loss, or modification of a user’s and an organization’s sensitive data such as personal media content or client records. Existing techniques protecting confidentiality and integrity lack either efficiency or are vulnerable to malicious attacks. In this thesis we suggest techniques, Guardat and ERIM, to efficiently and robustly protect persistent and in-memory data. To protect the confidentiality and integrity of persistent data, clients specify per-file policies to Guardat declaratively, concisely and separately from code. Guardat enforces policies by mediating I/O in the storage layer. In contrast to prior techniques, we protect against accidental or malicious circumvention of higher software layers. We present the design and prototype implementation, and demonstrate that Guardat efficiently enforces example policies in a web server. To protect the confidentiality and integrity of in-memory data, ERIM isolates sensitive data using Intel Memory Protection Keys (MPK), a recent x86 extension to partition the address space. However, MPK does not protect against malicious attacks by itself. We prevent malicious attacks by combining MPK with call gates to trusted entry points and ahead-of-time binary inspection. In contrast to existing techniques, ERIM efficiently protects frequently-used session keys of web servers, an in-memory reference monitor’s private state, and managed runtimes from native libraries. These use cases result in high switch rates of the order of 10 5 –10 6 switches/s. Our experiments demonstrate less then 1% runtime overhead per 100,000 switches/s, thus outperforming existing techniques.

Discrimination in Algorithmic Decision Making: From Principles to Measures and Mechanisms

Bilal Zafar
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
04 Feb 2019, 6:00 pm - 7:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
The rise of algorithmic decision making in a variety of applications has also raised concerns about its potential for discrimination against certain social groups. However, incorporating nondiscrimination goals into the design of algorithmic decision making systems (or, classifiers) has proven to be quite challenging. These challenges arise mainly due to the computational complexities involved in the process, and the inadequacy of existing measures to computationally capture discrimination in various situations. The goal of this thesis is to tackle these problems.

First, with the aim of incorporating existing measures of discrimination (namely, disparate treatment and disparate impact) into the design of well-known classifiers, we introduce a mechanism of decision boundary covariance, that can be included in the formulation of any convex boundary-based classifier in the form of convex constraints. Second, we propose alternative measures of discrimination. Our first proposed measure, disparate mistreatment, is useful in situations when unbiased ground truth training data is available. The other two measures, preferred treatment and preferred impact, are useful in situations when feature and class distributions of different social groups are significantly different, and can additionally help reduce the cost of nondiscrimination (as compared to the existing measures). We also design mechanisms to incorporate these new measures into the design of convex boundary-based classifiers.

How to Win a First-Order Safety Game

Helmut Seidl
TUM
SWS Distinguished Lecture Series
01 Feb 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
First-order (FO) transition systems have recently attracted attention for the verification of parametric systems such as network protocols, software-defined networks or multi-agent workflows. Desirable properties of these systems such as functional correctness or non-interference have conveniently been formulated as safety properties. Here, we go one step further. Our goal is to verify safety, and also to develop techniques for automatically synthesizing strategies to enforce safety. For that reason, we generalize FO transition systems to FO safety games. We prove that the existence of a winning strategy of safety player in finite games is equivalent to second-order quantifier elimination. For monadic games, we provide a complete classification into decidable and undecidable cases. For games with non-monadic predicates, we concentrate on universal invariants only. We identify a non-trivial sub-class where safety is decidable. For the general case, we provide meaningful abstraction and refinement techniques for realizing a CEGAR based synthesis loop. Joint work with: Christian Müller, TUM Bernd Finkbeiner, Universität des Saarlandes

Automated Complexity Analysis of Rewrite Systems

Florian Frohn
RWTH Aachen
SWS Colloquium
22 Jan 2019, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
Many approaches to analyze the complexity of programs automatically are based on transformations into integer or term rewrite systems. However, state-of-the-art tools that analyze the worst-case complexity of rewrite systems are restricted to the inference of upper bounds. In this talk, the first techniques for the inference of lower bounds on the worst-case complexity of integer and term rewrite systems are introduced. While upper bounds can prove the absence of performance-critical bugs, lower bounds can be used to find them.

For term rewriting, the power of the presented technique gives rise to the question whether the existence of a non-constant lower bound is decidable. Thus, the corresponding decidability results are also discussed in this talk.

Finally, to see the practical value of complexity analysis techniques for rewrite systems, we will have a glimpse at the transformation from Java programs to integer rewrite systems that is implemented in the tool AProVE.

Quantifying & Characterizing Information Diets of Social Media Users

Juhi Kulshrestha
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
20 Dec 2018, 4:00 pm - 5:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
An increasing number of people are relying on online social media platforms like Twitter and Facebook to consume news and information about the world around them. This change has led to a paradigm shift in the way news and information is exchanged in our society – from traditional mass media to online social media.

With the changing environment, it’s essential to study the information consumption of social media users and to audit how automated algorithms (like search and recommendation systems) are modifying the information that social media users consume. In this thesis, we fulfill this high-level goal with a two-fold approach. First, we propose the concept of information diets as the composition of information produced or consumed. Next, we quantify the diversity and bias in the information diets that social media users consume via the three main consumption channels on social media platforms: (a) word of mouth channels that users curate for themselves by creating social links, (b) recommendations that platform providers give to the users, and (c) search systems that users use to find interesting information on these platforms. We measure the information diets of social media users along three different dimensions of topics, geographic sources, and political perspectives.

Our work is aimed at making social media users aware of the potential biases in their consumed diets, and at encouraging the development of novel mechanisms for mitigating the effects of these biases.

Language dynamics in social media

Animesh Mukherjee
Indian Institute of Technology, Kharagpur
SWS Colloquium
13 Dec 2018, 10:30 am - 11:30 am
Kaiserslautern building G26, room 113
simultaneous videocast to Saarbrücken building E1 5, room 105
In this talk I shall outline a summary of our five year long initiative studying the temporal dynamics of various human language-like entities over the social media. Some of the topics that I plan to cover are (a)  how opinion conflicts could be effectively used for incivility detection in Twitter [CSCW 2018], (b) how word borrowings can be automatically identified from social signals [EMNLP 2017] and (c)  how hashtags in Twitter form compounds like natural language words (e.g., #Wikipedia+#Blackout=#WikipediaBlackout) that become way more popular than the individual constituent hashtags [CSCW 2016, Honorable Mention].

Post-quantum Challenges in Secure Computation

Nico Döttling
CISPA
Joint Lecture Series
05 Dec 2018, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
In the early 1990s cryptography went into a foundational crisis when efficient quantum algorithms were discovered which could break almost all public key encryption schemes known at the time. Since then, a enormous research effort has been invested into basing public key cryptography, and secure computation in general, on problems which are conjectured to be hard even for quantum computers. While this research program has been resoundingly successful, even leading up the way to cryptographic milestones such as fully homomorphic encryption, there are still important cryptographic primitives for which no post-quantum secure protocols are known. Until very recently, one such primitive was 2-message oblivious transfer, a fundamental primitive in the field of secure two- and multi-party computation. I will discuss a novel construction of this primitive from the Learning With Errors (LWE) assumption, a lattice-based problem which is known to be as hard as worst-case lattice problems and conjectured to be post-quantum secure. The security of our construction relies on a fundamental Fourier-analytic property of lattices, namely the transference principle: Either a lattice or its dual must have short vectors.

Privacy-Compliant Mobile Computing

Paarijaat Aditya
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
03 Dec 2018, 4:30 pm - 5:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Sophisticated mobile computing, sensing and recording devices like smartphones, smartwatches, and wearable cameras are carried by their users virtually around the clock, blurring the distinction between the online and offline worlds. While these devices enable transformative new applications and services, they also introduce entirely new threats to users' privacy, because they can capture a complete record of the user's location, online and offline activities, and social encounters, including an audiovisual record. Such a record of users' personal information is highly sensitive and is subject to numerous privacy risks. In this thesis, we have investigated and built systems to mitigate two such privacy risks: 1) privacy risks due to ubiquitous digital capture, where bystanders may inadvertently be captured in photos and videos recorded by other nearby users, 2) privacy risks to users' personal information introduced by a popular class of apps called `mobile social apps'. In this thesis, we present two systems, called I-Pic and EnCore, built to mitigate these two privacy risks.

Both systems aim to put the users back in control of what personal information is being collected and shared, while still enabling innovative new applications. We built working prototypes of both systems and evaluated them through actual user deployments. Overall we demonstrate that it is possible to achieve privacy-compliant digital capture and it is possible to build privacy-compliant mobile social apps, while preserving their intended functionality and ease-of-use. Furthermore, we also explore how the two solutions can be merged into a powerful combination, one which could enable novel workflows for specifying privacy preferences in image capture that do not currently exist.

Survey Equivalence: An Information-theoretic Measure of Classifier Accuracy When the Ground Truth is Subjective

Paul Resnick
University of Michigan, School of Information
SWS Distinguished Lecture Series
27 Nov 2018, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111
Many classification tasks have no objective ground truth. Examples include: which content or explanation is "better" according to some community? is this comment toxic? what is the political leaning of this news article? The traditional modeling approach assumes each item has an objective true state that is perceived by humans with some random error. It fails to account for the fact that people have greater agreement on some items than others. I will describe an alternative model where the true state is a distribution over labels that raters from a specified population would assign to an item. This leads to information gain (mutual information) as a theoretically justified and computationally tractable measure of a classifier's quality, and an intuitive interpretation of information gain in terms of the sample size for a survey that would yield the same expected error rate.

The Reachability Problem for Vector Addition Systems is Not Elementary

Wojciech Czerwiński
University of Warsaw
SWS Colloquium
22 Nov 2018, 4:00 pm - 5:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
I will present a recent non-elementary lower bound for the complexity of reachability problem for Vector Addition Systems. I plan to show the main insights of the proof. In particular I will present a surprising equation on fractions, which is the core of the new source of hardness found in VASes.

More Realistic Scheduling Models and Analyses for Advanced Real-Time Embedded Systems

Georg von der Brueggen
TU Dortmund
SWS Colloquium
22 Nov 2018, 2:30 pm - 3:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
In real-time embedded systems, for each task the compliance to timing constraints has to be guaranteed in addition to the functional correctness. The first part of the talk considers the theoretical comparison of scheduling algorithms and schedulability tests by evaluating speedup factors for non-preemptive scheduling, which leads to a discussion about general problems of resource augmentation bounds. In addition, it is explained how utilization bounds can be parametrized, resulting in better bounds for specific scenarios, i.e., when analyzing non-preemptive Rate-Monotonic scheduling as well as task sets inspired by automotive applications.

In the second part, a setting similar to mixed-criticality systems is considered and the criticism on previous work in this area is detailed. Hence, a new system model that allows a better applicability to realistic scenarios, namely Systems with Dynamic Real-Time Guarantees, is explained. This model is extended to a multiprocessor scenario, considering CPU overheating as a possible cause for mixed-criticality behaviour. Finally, a way to determine the deadline-miss probability for such systems is described that drastically reduces the runtime of such calculations.

The third part discusses tasks with self-suspension behaviour, explains a fixed-relative-deadline strategy for segmented self-suspension tasks with one suspension interval, and details how this approach can be exploited in a resource-oriented partitioned scheduling. Furthermore, it is explained how the gap between the dynamic and the segmented self-suspension model can be bridged by hybrid models.

Verified Secure Routing

Peter Müller
ETH Zurich
SWS Distinguished Lecture Series
19 Nov 2018, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
SCION is a new Internet architecture that addresses many of the security vulnerabilities of today’s Internet. Its clean-slate design provides, among other properties, route control, failure isolation, and multi-path communication. The verifiedSCION project is an effort to formally verify the correctness and security of SCION. It aims to provide strong guarantees for the entire architecture, from the protocol design to its concrete implementation. The project uses stepwise refinement to prove that the protocol withstands increasingly strong attackers. The refinement proofs assume that all network components such as routers satisfy their specifications. This property is then verified separately using deductive program verification in separation logic. This talk will give an overview of the verifiedSCION project and explain, in particular, how we verify code-level properties such as memory safety, I/O behavior, and information flow security.