# Events 2015

### The CakeML verified compiler

Scott Owens
University of Kent
SWS Colloquium
17 Dec 2015, 1:00 pm - 2:30 pm
Saarbrücken building E1 5, room 029
CakeML is a new ML dialect aimed at supporting formally verified programs. The CakeML project has several aspects including formal semantics and metatheory, a verified compiler, a formal connection between its semantics and higher-order logic (in the HOL4 interactive theorem prover), and example verified applications written in CakeML and HOL4. The project is an active collaboration between Ramana Kumar at NICTA, Magnus Myreen at Chalmers, Michael Norrish at NICTA, Yong Kiam Tan at (A*STAR, Singapore), and myself.

In this talk, I will explain the architecture of CakeML's verified compiler, focussing on a new optimising backend that we are currently developing.

CakeML's web site is https://cakeml.org, and development is hosted on GitHub at https://github.com/CakeML/cakeml.

### Rigorous Acrchitectural Modelling for Production Multiprocessors

Kathy Gray
University of Cambridge
SWS Colloquium
16 Dec 2015, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
Processor architectures are critical interfaces in computing, but they are typically defined only by prose and pseudocode documentation. This is especially problematic for the subtle concurrency behaviour of weakly consistent multiprocessors such as ARM and IBM POWER: the traditional documentation does not define precisely what programmer-observable behaviour is (and is not) allowed for concurrent code; it is not executable as a test oracle for pre-Silicon or post-Silicon hardware testing; it is not executable as an emulator for software testing; and it is not mathematically rigorous enough to serve as a foundation for software verification.

In this talk, I will present a rigorous architectural envelope model for IBM POWER and ARM multiprocessors, that aims to support all of these for small-but-intricate test cases, integrating an operational concurrency model with an ISA model for the sequential behaviour of a substantial fragment of the user-mode instruction set (expressed in a new ISA description language). I will present the interface between the two, whose requirements drove the development of our new language Sail. I will also present the interesting aspects of Sail's dependent type system, which is a light-weight system balancing the benefits of static bounds and effects checking with the usability of the language for engineers. Futher, our models can be automatically translated into executable code, which, combined with front-ends for concurrency litmus tests and ELF executables, can interactively or exhaustively explore all the allowed behaviours of small test cases.

Joint work with: S. Flur, G. Kerneis*, L. Maranget**, D. Mulligan, C. Pulte, S. Sarkar***, and P. Sewell, University of Cambridge (* Google, ** Inria, *** St Andrews)

### The C standard formalized in Coq

Robbert Krebbers
Aarhus University
SWS Colloquium
15 Dec 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
In my PhD thesis I have developed a formal semantics of a significant part of the C programming language as described by the C11 standard. In this talk I will give a brief overview of the main parts of my formalized semantics.

* A structured memory model based on trees to capture subtleties of C11 that have not been addressed by others.

* A core C language with a small step operational semantics. The operational semantics is non-deterministic due to unspecified expression evaluation order.

* An explicit type system for the core language that enjoys type preservation and progress.

* A type sound translation of actual C programs into the core language.

* An executable semantics that has been proven sound and complete with respect to the operational semantics.

* Extensions of separation logic to reason about subtle constructs in C like non-determinism in expressions, and gotos in the presence of block scope variables.

### CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling

Aritra Dhar
Xerox Research Center India
SWS Colloquium
08 Dec 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 113
Software is susceptible to malformed data originating from untrusted sources. Occasionally the programming logic or constructs used are inappropriate to handle the varied constraints imposed by legal and well-formed data. Consequently, software may produce unexpected results or even crash. In this paper, we present \tool, a novel hybrid approach that saves such software from crashing when failures originate from malformed strings or inappropriate handling of strings. Clotho statically analyzes a program to identify statements that are vulnerable to failures related to associated string data. Clotho then generates patches that are likely to satisfy constraints on the data, and in case of failures produces program behavior which would be close to the expected. The precision of the patches is improved with the help of a dynamic analysis.

We have implemented Clotho for the Java String API, and our evaluation based on several popular open-source libraries shows that Clotho generates patches that are semantically similar to the patches generated by the programmers in the later versions. Additionally, these patches are activated only when a failure is detected, and thus Clotho incurs no runtime overhead during normal execution, and negligible overhead in case of failures.

### Fairness and Transparency in Data-Driven Decision Making Systems

Max Planck Institute for Software Systems
Joint Lecture Series
02 Dec 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
tbd

### Impact of Multicore on Cyber-Physical Systems: challenges andsolutions

Dr. Marco Caccamo
University of Illinois
SWS Distinguished Lecture Series
26 Nov 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111
The benefits of adopting emerging multicore processors include reductions in space, weight, power, and cooling, while increasing CPU bandwidth per processor. However, the existing real-time software engineering process is based on the constant worst case execution time (WCET) assumption, which states that the measured worst case execution time of a software task when executed alone is the same as when that task is running together with other tasks. While this assumption is correct for single-core chips, it is NOT true for multicore chips. As it is now, the interference between cores can cause delay spikes as high as 600% in industry benchmarks. This presentation reviews main challenges faced by the embedded industry today when adopting multicore in safety critical embedded systems. A discussion on the notion of Single Core Equivalence follows.

### Declarative Programming for Eventual Consistency

Dr. Suresh Jagannathan
Purdue University,
SWS Colloquium
19 Nov 2015, 10:30 am - 20 Nov 2015, 10:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
n geo-replicated distributed data stores, the need to ensure responsiveness in the face of network partitions and processor failures results in implementations that provide only weak (so-called eventually consistent) guarantees on when data updated by one process becomes visible to another. Applications must be carefully constructed to be aware of unwanted inconsistencies permitted by such implementations (e.g., having negative balances in a bank account, or having an item appear in a shopping cart after it has been removed), but must balance correctness concerns with performance and scalability needs. Because understanding these tradeoffs requires subtle reasoning and detailed knowledge about the underlying data store, implementing robust distributed applications in such environments is often an error-prone and expensive task.

### Binary Search Trees, Precognition, and Patterns

Parinya Chalermsook
MPI-INF - D1
Joint Lecture Series
11 Nov 2015, 12:15 pm
Saarbrücken building E1 5, room 002
Two challenges in real-world optimization problems are computational intractability (many problems are NP-hard), and uncertainty (input data are incomplete or change over time). In this talk, we will see a problem that encapsulates both challenges: What is the asymptotically best binary search tree (BST) data structure that responds to "real-time" input data? BSTs have played huge roles in computer science, both in research and in curriculum, so it is not a surprise that this problem is considered a flagship open problem in data structure. Resolving the problem is likely to impact the field of algorithms as a whole. Its difficulty is witnessed by the fact that even highly restricted special cases have remained unresolved for 3 decades. I will report our recent progresses that, despite not resolving the problem, leapfrogged some long-standing barriers. The talk will be accessible by general audience: I will highlight the "conceptual" messages of our results, rather than technical/mathematical prowess.

The talk is based on joint work with Mayank Goswami, Laszlo Kozma, Kurt Mehlhorn, and Thatchaphol Saranurak.

### Validating Optimizations of Concurrent C/C++ Programs

Soham Chakraborty
Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
30 Oct 2015, 3:00 pm - 4:00 pm
Kaiserslautern building G26, room 111
We present a validator for checking the correctness of LLVM compiler optimizations on C11 programs as far as concurrency is concerned. Our validator checks that optimizations do not change memory accesses in ways disallowed by the C11 and/or LLVM memory models. We use a custom C11 concurrent program generator to trigger multiple LLVM optimizations and evaluate the efficacy of our validator. Our experiments highlighted the difference between the C11 and LLVM memory models, and uncovered a number of previously unknown compilation errors in the LLVM optimizations involving the C11 concurrency primitives. More details can be found at http://plv.mpi-sws.org/validc/

### Discrimination Data Analysis

Salvatore Ruggieri
University of Pisa
SWS Colloquium
22 Oct 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 113
The collection and analysis of observational and experimental data represent the main tools for assessing the presence, the extent, the nature, and the trend of discrimination phenomena. Data analysis techniques have been proposed in the last fifty years in the economic, legal, statistical, and, recently, in the data mining literature. This is not surprising, since discrimination analysis is a multi-disciplinary problem, involving sociological causes, legal argumentations, economic models, statistical techniques, computational issues. The objective of the talk is to provide first an introduction on concepts, problems, application areas, datasets, methods and approaches from a multidisciplinary perspective; and then to deep in the data-driven approach based on data mining for discrimination discovery and prevention. Reference: · A. Romei, S. Ruggieri. A multidisciplinary survey on discrimination analysis. The Knowledge Engineering Review. Vol. 29, Issue 5, November 2014, 582-638.

### When is CAN the Weakest Link? A Bound on Failures-In-Time in CAN-Based Real-Time Systems

Arpan Gujarati
Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
20 Oct 2015, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 111
I discuss a method we proposed to bound the Failures In Time (FIT) rate of a CAN-based real-time system, i.e., the expected number of failures in one billion operating hours. The method is based on an analysis of the probability of a correct and timely message transmission despite host and network failures due to electromagnetic interference (EMI), which we derived earlier. For a given workload, the derived FIT rate can be used to find an optimal replication factor, which is demonstrated with a case study based on a message set taken from a simple mobile robot.

### Complexity of the Scheduling Problem for Periodic Real-Time Tasks

Pontus Ekberg
Uppsala University, Sweden
SWS Colloquium
20 Oct 2015, 1:30 pm
Kaiserslautern building G26, room 113
simultaneous videocast to Saarbrücken building E1 5, room 029
In real-time scheduling theory we are interested in finding out whether a set of repeatedly activated computational tasks can be co-executed on a shared computer platform, such that all of their deadlines are met. The periodic and sporadic task models are among the most basic formalisms used for modeling computational tasks. Among computer platforms considered, the preemptive uniprocessor is one of the simplest. To decide whether a given set of periodic or sporadic tasks can be scheduled on a preemptive uniprocessor so that all deadlines are met is therefore a core problem in real-time scheduling theory. Still, the complexity of this decision problem has long been open. In this talk, which is targeted to a general audience, I will outline some recent results pinpointing this complexity.

### FSL: A Program Logic for C11 Memory Fences

Marko Doko
Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
16 Oct 2015, 1:00 pm - 2:00 pm
Kaiserslautern building G26, room 111
In this talk, we'll describe a simple, but powerful, program logic for reasoning about C11 relaxed accesses used in conjunction with release and acquire memory fences. Our logic, called fenced separation logic (FSL), extends relaxed separation logic with special modalities for describing state that has to be protected by memory fences. Like its precursor, FSL allows ownership transfer over synchronizations and can be used to verify the message-passing idiom and other similar programs. The soundness of FSL has been established in Coq.

### Spin Locks in Real-Time Systems

Alexander Wieder
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
16 Oct 2015, 12:00 pm
Kaiserslautern building G26, room 113
simultaneous videocast to Saarbrücken building E1 5, room 029
In my thesis, I propose to study two fundamental problems related to the use of spin locks in multiprocessor real-time systems: First, if tasks (on different cores) share spin-lock-protected resources, what is (a safe bound on) the maximum blocking, i.e., the maximum spin delays, that may arise at runtime? And second, how should tasks be allocated to processor cores such that all timing constraints are met despite such blocking?

### Challenges in Image-Based 3D Reconstructions

Roland Angst
MPI-INF - D4
Joint Lecture Series
07 Oct 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Driven by the needs for various applications such as robotics, immersive augmented and virtual reality, digitization of archeological sites and landmarks, medical imaging, etc., the extraction of 3D geometry from images has become increasingly important in the last couple of years. The theory of multiple view geometry which relates images from different viewpoints dates back more than 100 years. However, in practice, e.g. due to imperfections of cameras or measuring noise, the required assumptions for this theory are often not met exactly which makes 3D computer vision inherently difficult. In my talk, I will first outline some of the challenges we are faced with and in the second part, I will focus on two of those challenges. Specifically, we will look into radial distortion estimation without calibration targets and dense 3D reconstructions for scenes where the rigidity assumption is violated. We will see how simple and very intuitive reasoning in geometric terms can provide the foundation for algorithms to tackle those challenges.

### Trustworthy File Systems

Christine Rizkallah
NICTA
SWS Colloquium
21 Sep 2015, 1:00 pm - 2:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 113
In this talk, I will present an approach to ease the verification of file-system code using a domain-specific language, currently called CoGent, supported by a self-certifying compiler that produces C code, a high-level specification, and translation correctness proofs.

CoGent is a restricted, polymorphic, higher-order, and purely functional language with linear types and without the need for a trusted runtime or garbage collector. It compiles to efficient C code that is designed to interoperate with existing C functions.

For a well-typed CoGent program, the compiler produces C code, a high-level shallow embedding of its semantics in Isabelle/HOL, and a proof that the C code correctly implements this embedding. The aim is for proof engineers to reason about the full semantics of real-world systems code productively and equationally, while retaining the interoperability and leanness of C.

I will give a high-level overview of the formal verification stages of the compiler, which include automated formal refinement calculi, a switch from imperative update semantics to functional value semantics formally justified by the linear type system, and a number of standard compiler phases such as type checking and monomorphisation. The compiler certificate is a series of language-level meta proofs and per-program translation validation phases, combined into one coherent top-level theorem in Isabelle/HOL.

### A Quantifier- Elimination Heuristic for Octagonal Constraints

Deepak Kapur
University of New Mexico
SWS Colloquium
18 Sep 2015, 2:00 pm
Kaiserslautern building G26, room 112
simultaneous videocast to Saarbrücken building E1 5, room 029
Octagonal constraints expressing weakly relational numerical properties of the form ($l \le \pm x \pm y \leq h$) have been found useful and effective in static analysis of numerical programs. Their analysis serves as a key component of the tool ASTREE based on abstract interpretation framework proposed by Patrick Cousot and his group, which has been successfully used to analyze commercial software consisting of hundreds of thousand of lines of code. This talk will discuss a heuristic based on the quantifier-elimination (QE) approach presented by Kapur (2005) that can be used to automatically derive loop invariants expressed as a conjunction of octagonal constraints in $O(n^2)$, where $n$ is the number of program variables over which these constraints are expressed. This is in contrast to the algorithms developed in Mine's thesis which have the complexity at least $O(n^3)$. The restricted QE heuristic usually generates invariants stronger than those obtained by the freely available Interproc tool. Extensions of the proposed approach to generating disjunctive invariants will be presented.

### Making Theorem Provers Easier to Use

Jasmin Christian Blanchette
MPI-INF - RG 1
Joint Lecture Series
02 Sep 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Theorem provers are computer programs that help prove logical formulas, whether they arise in mathematics, in computer science, or elsewhere. The theorem proving community divided in two. Roughly speaking: (1) The automatic subcommunity develops automatic theorem provers (ATPs; e.g., SPASS, Z3), with the very ambitious goal of replacing mathematicians by machines. (2) The interactive subcommunity develops proof assistants (e.g., Coq, Isabelle), with the goal of replacing LaTeX proofs by computer-checked documents.

In this talk, I present work on integrating ATPs in proof assistants and on enriching ATPs with features typical of proof assistants. I argue in favor of a tighter integration of the two communities, to make theorem proving easier, more productive, more enjoyable, and more useful.

### Monoids and Invariants as an Orthogonal Basis for Concurrent Reasoning

Ralf Jung
Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
18 Aug 2015, 1:00 pm - 2:00 pm
Saarbrücken building E1 5, room 029
We present Iris, a concurrent separation logic with a simple premise: Monoids and invariants are all you need. Partial commutative monoids enable us to express--and invariants enable us to enforce--user-defined protocols on shared state, which are at the conceptual core of most recent program logics for concurrency. Furthermore, through a novel extension of the concept of a view shift, Iris supports the encoding of logically atomic specifications, i.e., Hoare-style specs that permit the client of an operation to treat the operation essentially as if it were atomic, even if it is not.

### Perceptually-driven Inputs for New Output Devices

Piotr Didyk
Cluster of Excellence - Multimodal Computing and Interaction - MMCI
Joint Lecture Series
05 Aug 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
There has been a tremendous increase in quality and number of new output devices, such as stereo and automultiscopic screens, portable and wearable displays, and 3D printers. Some of them have already entered the mass production and gained a lot of users' attention, others will follow this trend promptly. Unfortunately, abilities of these emerging technologies outperform capabilities of methods and tools for creating content. Also, the current level of understanding of how these new technologies influence user experience is insufficient to fully exploit their advantages. In this talk, I will demonstrate that careful combinations of new hardware, computation, and models of human perception can lead to solutions that provide significant increase in perceived quality. More precisely, I will show how careful rendering of frames can improve spatial and temporal resolution as well as reduce temporal artifacts of video content without sacrificing its cinematic look. Next, I will discuss techniques for overcoming limitations of current 3D displays. In the context of 3D printing, I will discuss methods for specifying objects for 3D printing as well as ongoing efforts in making these techniques aware of human perception.

### Fast Approximate Max Flow Computation

Christoph Lenzen
MPI-INF - D1
Joint Lecture Series
01 Jul 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
The max flow problem is one of the most fundamental and well-studied network problems. In this talk, I will revisit it from the perspective of distributed algorithms. For decades, the million-dollar question here has been whether there *is* a fast distributed algorithm or one should rather collect all information at a single location and apply an old school centralized algorithm. On the way to answering this question, we will get to see several of the brilliant ideas that have come up recently on this evergreen among the classic graph problems. The talk is tailored to a general CS audience; don't worry if you're not familiar with the topic - come to enjoy the show!

### ''Timing-Aware Control Software Design for Automotive Systems''

Dr. Arne Hamann
Robert Bosch GmbH
SWS Colloquium
26 Jun 2015, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 112
simultaneous videocast to Saarbrücken building E1 5, room 029
The underlying theories of both control engineering and real-time systems engineering assume idealized system abstractions that mutually neglect central aspects of the other discipline. Control engineering theory, on the one hand, usually assumes jitter free sampling and constant input-output latencies disregarding complex real-world timing effects. Real-time engineering theory, on the other hand, uses abstract performance models that neglect the functional behavior, and derives worst-case situations that have little expressiveness for control functionalities in physically dominated automotive systems. As a consequence, there is a lot of potential for a systematic co-engineering between both disciplines, increasing design efficiency and confidence. In this talk, possible approaches for such a co-engineering and their current applicability to real world problems are discussed. In particular, simulation-based and formal verification techniques are compared for different construction principles of automotive real-time control software.

### Effective Testing for Concurrency Bugs

Pedro Fonseca
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
24 Jun 2015, 1:00 pm - 25 Jun 2015, 1:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
In the current multi-core era, concurrency bugs are a serious threat to software reliability. As hardware becomes more parallel, concurrent programming will become increasingly pervasive. However, correct concurrent programming is known to be extremely challenging for developers and can easily lead to the introduction of concurrency bugs. This talk addresses this challenge by proposing novel techniques to help developers expose and detect concurrency bugs. First, I will briefly present the results of a bug study that analyzed the external and internal effects of real-world concurrency bugs. This study revealed that a significant fraction of concurrency bugs qualify as semantic or latent bugs, which are two particularly challenging classes of concurrency bugs. Based on the insights from the study, I will present a concurrency bug detector, PIKE, that analyzes the behavior of program executions to infer whether concurrency bugs have been triggered during a concurrent execution. In addition, I will present the design of a testing tool, SKI, that allows developers to test operating system kernels for concurrency bugs in a practical manner. SKI bridges the gap between user-mode testing and kernel-mode testing by enabling the systematic exploration of the kernel thread interleaving space. Both PIKE and SKI have been shown to be effective at finding concurrency bugs.

### On Leveraging the Wisdom of Crowdsourced Experts

Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
24 Jun 2015, 1:00 pm - 2:00 pm
Saarbrücken building E1 5, room 029
Recently, online social networks (OSNs) such as Twitter and Facebook have emerged as popular platforms for exchanging information on the Web. With hundreds of millions of users posting content ranging from everyday conversations to real-time news and from product reviews to information about a diverse range of topics like politics and diabetes, OSNs truly provide a way to tap into the wisdom of crowds. In fact, content streams generated by Twitter users are being used for several applications like content search and recommendation, breaking news detection and business analytics. However, analyzing content streams generated on Twitter poses two important research challenges: (1) processing a large stream of around 500 million tweets per day in real-time is often not scalable and (2) since OSN users have pseudo-anonymous identities, reasoning about quality and trustworthiness of content generated by these users becomes increasingly difficult as shown by prior studies.

In this work, we address these two challenges by proposing a novel data sampling methodology: relying on the wisdom of crowdsourced experts. That is, instead of processing all the tweets posted in the Twitter network, we only rely on tweets from a handful of expert users. Since tweets posted by these expert users constitute only a small fraction of all the tweets in the network, using this expert tweet stream (or, expert sample) helps overcome scalability issues related to real-time data processing. Comparing the expert sample to another widely used sampling methodology (namely, random sampling) reveals that expert sampling has numerous potential advantages for data mining and content retrieval tasks such as content search, real-time event detection, product sentiment analysis etc.

To show the utility of the expert stream for content-centric applications, we compare Twitter search functionality implemented over the whole Twitter stream (or, crowd stream) to one implemented over the expert stream only. Surprisingly, despite being two orders of magnitude smaller, the expert stream captures most of the relevant information posted by the whole Twitter crowd. Moreover, search results from expert stream are of significantly better quality and contain far fewer spam posts as compared to crowd results. Our findings add another dimension to longstanding crowds vs. experts debate by concluding that wisdom of experts is better than wisdom of crowds in the context of certain content-centric applications. These findings have serious implications for the design of future content retrieval systems.

### Types for Incremental Computational Complexity

Deepak Garg
Max Planck Institute for Software Systems
Joint Lecture Series
03 Jun 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
The area of incremental computation seeks to reduce a program's execution time by re-using the trace of a prior execution of the program intelligently. With recent advances, programs can be compiled to their incremental versions automatically. However, thus far, there is no language-level support for formally proving the time complexity of incremental execution. Motivated by this gap, this talk presents CostIt, a higher-order functional language equipped with a type theory for proving asymptotic bounds on incremental computation time.  CostIt uses type refinements to specify which parts of inputs and outputs may change, as well as the program's incremental execution time. The talk will cover the intuitions behind CostIt's design.

### Detecting Microbial Resistance Against Antibiotics

Andreas Keller
Fachrichtung Informatik - Saarbrücken
Joint Lecture Series
06 May 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
tbd

### Incremental Parallel and Distributed Systems

Pramod Bhatotia
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
07 Apr 2015, 3:30 pm - 4:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Many applications exhibit the property that small, localized changes to the input result in only minor changes to the output. Incremental computations seek to exploit this observation to speed-up successive runs of an application by tracking control and data dependencies, and re-executing only those parts of the computation that are affected by a given input change. To realize the benefits of incremental computation, researchers and practitioners are developing new systems where the programmer can provide efficient update mechanisms for changing application data. Unfortunately, most of the existing solutions are limiting because they either depart from existing programming models, or require programmers to devise an incremental update mechanism (or a dynamic algorithm) on a per-application basis.

The high-level goal of my thesis research is to enable practical, automatic, and efficient incremental computations in parallel and distributed systems. My approach neither requires a radical departure from current models of programming nor complex, application-specific dynamic algorithms.

In this talk, I will first present a high-level description of my thesis research. Thereafter, as a concrete example I will describe my recent project "iThreads". iThreads is a POSIX-compliant threading library to support parallel incremental computation targeting unmodified C/C++ multithreaded programs. The iThreads library can be used as a drop-in replacement for the pthreads library, making it trivial to obtain the benefits of incremental computation by a simple exchange of libraries linked, without even recompiling the application code. To achieve this result, we design our algorithms and implementation to operate at the compiled binary code level by leveraging operating system-specific mechanisms. Our design choice of iThreads tries to strike a balance between efficiency of incremental computation and transparency for unmodified C/C++ pthreads-based programs. Our evaluation on a multicore platform using benchmarks from the PARSEC and Phoenix suites shows significant performance improvements for the majority of applications.

### Building an Operating System for the Data Center

Simon Peter
University of Washington
SWS Colloquium
01 Apr 2015, 10:30 am
Saarbrücken building E1 5, room 29
simultaneous videocast to Kaiserslautern building G26, room 112
Data centers run a range of important applications with ever increasing performance demands, from cloud and server computing to Big Data and eScience. However, the scaling of CPU frequency has stalled in recent years, leading to hardware architectures that no longer transparently scale software performance. Two trends stand out: 1) Instead of frequency, hardware architectures increase the number of CPU cores, leaving complex memory system performance and CPU scheduling tradeoffs exposed to software. 2) Network and storage I/O performance continues to improve, but without matching improvements in CPU frequency. Software thus faces ever increasing I/O efficiency demands.

In my research, I address how operating systems (OSes) can handle these growing complexity and performance pressures to avoid becoming the limiting factor to performance. I first explain how current OS architecture is already inadequate to address these trends, limiting application performance. I then present how the OS can be redesigned to eliminate the performance limitations without compromising on existing features or application compatibility. I finish with an outlook on how these hardware trends affect software in the future and present ideas to address them.

### Tracking Resistance with Dissent

David Wolinsky
Yale University
SWS Colloquium
30 Mar 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
As the most serious cyber-attack threats rapidly shift from untargeted toward increasingly targeted methods, it is becoming more crucial for organizations to protect the identity and location of their members against malicious tracking and surveillance. The common approach by organizations to use encryption is not enough, as metadata, such as the sender and receiver, has recently been shown to be as valuable and hence as dangerous as data. Employing existing anonymous communication tools, such as Tor, only deter but do not prevent tracking and targeting attacks. This talk describes three major challenges to protecting users: network-based traffic analysis attacks, intersection attacks against identities and traffic flows, and application / environment exploits. The talk then introduces practical approaches that address these challenges as implemented and evaluated in the Dissent project.

### Jellyfish: Networking Data Centers, Randomly

Ankit Singla
University of Illinois
SWS Colloquium
26 Mar 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Abstract: Large Internet services, "big science", and increasingly, industrial R&D, are all powered by warehouse scale computing -- tens of thousands of servers connected over a network. Increasing parallelism and big data analytics require that the network provide high throughput connectivity. In this talk, I will describe Jellyfish, our proposed design for such a network. Jellyfish uses a random graph topology, allowing construction at arbitrary sizes, easier network growth over time, and integration of newer and more powerful network equipment as it becomes available -- practical problems that rigidly structured traditional networks fail to address. Surprisingly, Jellyfish also beats state-of-the-art real-world designs on throughput, by as much as 40%. In fact, we show that Jellyfish networks achieve near optimal throughput, i.e., one cannot build, using the same networking equipment, *any* network that provides much higher throughput.

### Interactive Typed Tactic Programming in the Coq Proof Assistant

Luis Francisco Ziliani
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Defense
24 Mar 2015, 4:00 pm - 5:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Proof assistants like Coq are now eminently effective for formalizing "research-grade" mathematics, verifying serious software systems, and, more broadly, enabling researchers to mechanize and breed confidence in their results. Nevertheless, the mechanization of substantial proofs typically requires a significant amount of manual effort. To alleviate this burden, proof assistants typically provide an additional language for tactic programming. Tactics support general-purpose scripting of automation routines, as well as fine control over the state of an interactive proof. However, for most existing tactic languages (e.g., ML, Ltac), the price to pay for this freedom is that the behavior of a tactic lacks any static specification within the base logic of the theorem prover (such as, in Coq, a type). As a result of being untyped, tactics are known to be difficult to compose, debug, and maintain.

In my thesis I present two different approaches to typed tactic programming in the context of Coq: Lemma Overloading and Mtac. Both approaches rely heavily on the unification algorithm of Coq, which is a complex mixture of different heuristics and has not been formally specified. Therefore, I also build and describe a new unification algorithm better suited for tactic programming in Coq. In the talk I will describe the high level achievements of my work and introduce in depth Mtac, a new programming language for tactic programming.

### The Persistent Homology Pipeline: Shapes, Computations, and Applications

Michael Kerber
MPI-INF - D1
Joint Lecture Series
04 Mar 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Data analysis is the task of extracting relevant information from a possibly large and possibly noisy data collection. A commonly desired goal is to obtain global summaries of the data which help to understand the data on a high-level and simplify the exploration process. Interpreting data as geometric input, such summaries are provided by topological invariants, for instance the homology of spaces, which leads to the field of topological data analysis. The last 20 years have witnessed a boost of this research area, mostly due to the development of persistent homology. This is a theory to make topological invariants robust with respect to noise and yielding a topological multi-scale summary of data.

The success of persistent homology has posed the challenge of computing persistence on large data sets. Typical questions in this context are: How can we efficiently build combinatorial cell complexes out of point cloud data? How can we compute the persistence summaries of very large cell complexes in a scalable way? Finally, how does the computed summary lead us to new insights into the considered application? My talk will introduce the theory of persistent homology on an informal level, discuss some recent algorithmic advances and survey some application areas of topological data analysis.

### Principled and Practical Web Application Security

Deian Stefan
Stanford University
SWS Colloquium
02 Mar 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Large-scale private user data theft has become a common occurrence on the web. A huge factor in these privacy breaches we hear so much about is that developers specify and enforce data security policies by strewing checks throughout their application code. Overlooking even a single check can lead to vulnerabilities.

In this talk, I will describe a new approach to protecting sensitive data even when application code is buggy or malicious. The key ideas behind my approach are to separate the security and privacy concerns of an application from its functionality, and to use language-level information flow control (IFC) to enforce policies throughout the code. The main challenge of this approach is at once to design practical systems that can be easily adopted by average developers, and simultaneously to leverage formal semantics that rule out large classes of design error. The talk will cover a server-side web framework (Hails), a language-level IFC system (LIO), and a browser security architecture (COWL), which, together, provide end-to-end security against the privacy leaks that plague today's web applications.

### Minimal Trusted Hardware Assumptions for Privacy-Preserving Systems

Aniket Kate
Saarland University
SWS Colloquium
26 Feb 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Trusted hardware modules are becoming prevalent in computing devices of all kinds. A broad trusted hardware assumption purports to solve almost all security problems in a trivial and uninteresting manner. However, relying entirely on hardware assumptions to achieve security goals of a system can be impractical given the limited memory, bandwidth and CPU capabilities of available hardware modules, and makes the designed system vulnerable to even a tiny overlooked or undiscovered flaw/side-channel in the employed module. Thus, the key challenge to me while designing a trusted hardware-based system is to determine a minimal hardware assumption required to achieve the system's goals, and justify the assumption for an available hardware module.

In this talk, I will present my recent work on developing privacy-preserving systems based on the above insight. In particular, I will introduce a privacy-preserving transaction protocol for credit networks (PrivPay), an architecture for privacy-preserving online behavioral advertising (ObliviAd), and an asynchronous multiparty computation protocol with only an honest majority (NeqAMPC).

### Building fast and consistent replicated systems: from principles to practice

Cheng Li
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
23 Feb 2015, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Distributing data across replicas within a data center or across multiple data centers plays an important role in building Internet-Scale services that provide good user experience, namely low latency access and high throughput. This approach often compromises on strong consistency semantics, which help maintain application-specific desired properties. To relieve such inherent tension, in the past few years, many proposals have been designed to allow programmers to selectively weaken consistency levels of certain operations to avoid costly immediate coordination for concurrent user requests. However, these proposals fail to provide principles to guide programmers to make a correct decision of assigning consistency levels to various operations so that good performance is extracted while system behaviors still comply with specifications.

The primary goal of my work is to provide programmers with principles and tools for building fast and consistent systems by allowing programmers to think about various consistency levels in the same framework. In order to meet this goal, we plan to develop the following three components. The first component of my thesis is a novel consistency model called RedBlue Consistency [24], which presents sufficient conditions that allow programmers to safely separate weakly consistent operations from strongly consistent ones in a coarse-grained manner. The second component is SIEVE [23] - a tool that explores both Commutative Replicated Data Types and program analysis techniques to assign proper consistency levels (either weak or strong consistency) to different operations and to maximize the weakly consistent operation space. The third component is Partial Order-Restrictions Consistency (PoR Consistency) - a generic consistency definition that generalizes the trade-off between consistency and performance by capturing various consistency levels in terms of visibility restrictions among operations. The power of PoR Consistency is allowing programmers to tune the restrictions to obtain a fine-grained control of their targeted consistency semantics.

### Privacy, Security, and Online Disclosures:Combining HCI and Behavioral Science to Design Visceral Cues for Detection of Online Threats

Laura Brandimarte
Carnegie Mellon University
SWS Colloquium
19 Feb 2015, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Online privacy and security decision making is complex, because it is affected both by objective risks and benefits from disclosure or protection of personal information, and by factors that do not directly affect economic trade-offs. For instance, design features - such as default visibility settings, look & feel of a website, granularity of privacy controls, or framing of privacy policies - as well as cognitive and behavioral biases affect risk perception. Behavioral sciences provide useful insights on how people respond to risks and threats. Of particular interest to my research is whether individuals detect, recognize, and react differently to "offline" and "online" threats. I will present a series of laboratory experiments, that combine findings from HCI and behavioral sciences, showing how to help users of online sharing technologies detect online privacy and security threats, and thus make better informed decisions. The experiments demonstrate how sensorial, visceral stimuli from the offline, physical world can affect online privacy concern and online disclosures. The results show the importance of going beyond privacy and security usability research, and provide suggestions on how to improve interfaces to help users make sound privacy and security decisions.

### Programming with Numerical Uncertainties

Dr. Eva Darulova
EPFL, Switzerland
SWS Colloquium
17 Feb 2015, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Numerical software, common in scientific computing or embedded systems, inevitably uses an approximation of the real arithmetic in which most algorithms are designed. Finite-precision arithmetic, such as fixed-point or floating-point, is a common and efficient choice, but introduces an uncertainty on the computed result that is often very hard to quantify. We need adequate tools to estimate the errors introduced in order to choose suitable approximations which satisfy the accuracy requirements. I will present a new programming model where the scientist writes his or her numerical program in a real-valued specification language with explicit error annotations. It is then the task of our verifying compiler to select a suitable floating-point or fixed-point data type which guarantees the needed accuracy. I will show how a combination of SMT theorem proving, interval and affine arithmetic and function derivatives yields an accurate, sound and automated error estimation which can handle nonlinearity, discontinuities and certain classes of loops. We have further combined our error computation with genetic programming to not only verify but also improve accuracy. Finally, together with techniques from validated numerics we developed a runtime technique to certify solutions of nonlinear systems of equations, quantifying truncation in addition to roundoff errors.

### There is more to gaze than meets the eye: novel uses of gaze in context-aware computing and human-computer interaction

Andreas Bulling
MPI-INF - D2
Joint Lecture Series
04 Feb 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Interactions between humans are complex, highly adaptive to the situation at hand, and rely to a large extent on non-verbal behavioural cues, most importantly visual and physical behaviour (aka body language). Despite significant advances in affective computing and social signal processing, current human-computer systems still fail short in leveraging the full potential of these cues. We envision human-computer systems that fully exploit the large information content available in non-verbal human behaviour and thereby offer human-like perceptual and interactive capabilities. Realising this vision is challenging and requires scientific advances along three dimensions: human behaviour sensing, behaviour analysis and modelling, as well as application development. With a focus on human visual behaviour, in my talk I will provide an overview of previous and ongoing work towards our vision along these three dimensions.

### Trace Complexity of Information Diffusion

Alessandro Panconesi
Informatica - Sapienza, Università di Roma
SWS Distinguished Lecture Series
26 Jan 2015, 10:30 am - 11:30 am
Kaiserslautern building G26, room 112
simultaneous videocast to Saarbrücken building E1 5, room 002
Abstract: In recent years, we have witnessed the emergence of many sophisticated web services that allow people to interact on an unprecedented scale. The wealth of data produced by these new ways of communication can be used, in principle, to increase our understanding of human social behaviour, but a fundamental hurdle is posed by the sensitivity of these data. Access must of necessity be severely constrained in order to protect the privacy of the users and the confidentiality of the data. A very broad question arises naturally: can non-trivial conclusions about various social processes be inferred based only on such limited information? We give a few specific examples taken from our own research of what can, and cannot, be learned from digital traces. The talk describes joint work with several people: B.Abrahao, P.Brach, F.Chierichetti, R.Kleinberg, A.Epasto, P.Sankowski

### Reasoning about trustworthiness of identities in social computing systems

Bimal Viswanath
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
16 Jan 2015, 4:00 pm - 5:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
The huge popularity of social computing platforms such as Facebook, Google+, Twitter, Yelp and Twitter has attracted rampant forms of service abuse on these platforms. While this degrades a user's experience, service abuse can also have serious economic consequences; click fraud in social ad platforms is a significant concern for advertisers. The main cause for this lack of reliability is the identity infrastructure: in most systems, the users operate behind weak identities that are easy to create. The use of weak identities makes these systems vulnerable to Sybil attacks, where an attacker creates multiple fake (Sybil) identities to manipulate and abuse the system. Thus, the key research challenge is: Given a weak identity-based social computing system, how can we reason about the (un)trustworthiness of identities in the system? In this thesis, I propose various schemes to address this problem.

In this talk, I will first present a high level overview of my thesis research where I will describe the limitations of existing approaches to reason about trust and how my work addresses those limitations. Next, I will present my most recent project "Robust Tamper Detection in Crowd Computations". Popular social computing systems increasingly rely on crowd computing to rate and rank content, users, products and businesses. Today, attackers who create Sybil identities can easily tamper with these computations. Existing defenses that largely focus on detecting individual Sybil identities have a fundamental limitation: Adaptive attackers can create hard-to-detect Sybil identities to tamper arbitrary crowd computations. We propose Stamper, an approach for detecting tampered crowd computations. Stamper gains strength in numbers - we propose statistical analysis techniques that can determine if a large crowd computation has been tampered by Sybils, even when it is fundamentally hard to infer which of the participating identities are Sybil.

### Putting Threads on a Solid Foundation: Some Remaining Issus

Hans Boehm
SWS Distinguished Lecture Series
09 Jan 2015, 3:00 pm - 4:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
Shared memory parallel machines have existed since the 1960s and, with the proliferation of multi-core processors, shared memory parallel programming has become increasingly important. Unfortunately, some fairly basic questions about parallel program semantics have not been addressed until quite recently. Multi-threaded programming languages are on much more solid ground than they were as recently as a decade ago. Nonetheless a number of foundational issues have turned out to be surprisingly subtle and resistant to easy solutions. We briefly survey three such issues:

1) All major programming languages support atomic operations that enforce only very weak memory ordering properties. These implicitly rely on enforcement of dependencies to preclude certain "nonsensical" outcomes. Unfortunately, we do not know how to define a suitable notion of dependency.

2) Garbage collected programming languages such as Java, rely on finalization, most commonly to interact correctly with non-garbage-collected parts of the system. But finalization raises largely unsolved concurrency issues. As a result, almost all interesting uses of finalization uses are subtly, but seriously, incorrect.

3) It can be difficult to avoid object destruction or memory deallocation races in-garbage-collected languages like C++. As a result, async(), the basic C++11 thread spawning facility has surprisingly complex semantics, with unintended consequences.

### Analyzing Dynamics of Choice among Discrete Alternatives

Andrew Tomkins
SWS Distinguished Lecture Series
09 Jan 2015, 10:00 am - 11:30 am
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111
In this talk we'll consider two problems in which users must select from a set of alternatives. In the first scenario, a user consumes a class of item repeatedly, such as listening to a sequence of songs, or visiting a sequence of restaurants over time. Our goal is to understand the dynamics of repeated consumption of the same item. We present a model related to Herbert Simon's 1955 copying model, and analyze its effectiveness. In the second scenario, a user traverses a directed graph whose nodes represent items, and whose arcs represent related items recommended by the system. In this setting, we develop a model and algorithm for determining the underlying quality of each node based on traversal data. Our result provides a well-motivated unique solution to the problem of "reverse engineering" a markov chain by finding a transition matrix given the graph and the steady state

### Shaping Social Activity by Incentivizing Users

Manuel Gomez Rodriguez
Max Planck Institute for Software Systems
Joint Lecture Series
07 Jan 2015, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Events in an online social network can be categorized roughly into endogenous events, where users just respond to the actions of their neighbors within the network, or exogenous events, where users take actions due to drives external to the network. How much external drive should be provided to each user, such that the network activity can be steered towards a target state? In this paper, we model social events using multivariate Hawkes processes, which can capture both endogenous and exogenous event intensities, and derive a time dependent linear relation between the intensity of exogenous events and the overall network activity. Exploiting this connection, we develop a convex optimization framework for determining the required level of external drive in order for the network to reach a desired activity level. We experimented with event data gathered from Twitter, and show that our method can steer the activity of the network more accurately than alternatives.