Events 2019

Computer-Aided Programming Across Software Stack

Işıl Dillig University of Texas at Austin
16 Dec 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Distinguished Lecture Series
Program synthesis techniques aim to generate executable programs from some high-level expression of user intent, such as logical specifications, examples, or a naive reference implementation. This talk will survey different flavors of program synthesis and their applications across the entire software stack, ranging from computer end-users all the way to systems programmers. We will also illustrate how program synthesis is useful for addressing different concerns in software development, including functionality, correctness, performance, and security.
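As a tiny illustration of the "examples" flavor mentioned above (my own sketch, not one of the talk's systems or benchmarks), an enumerative synthesizer can search a small expression grammar for a program consistent with user-supplied input/output pairs:

```rust
// Minimal programming-by-example sketch (illustrative only): enumerate
// arithmetic expressions over one variable and return the first one that is
// consistent with all input/output examples.
#[derive(Clone, Debug)]
enum Expr {
    Var,                       // the input x
    Const(i64),                // a literal constant
    Add(Box<Expr>, Box<Expr>), // e1 + e2
    Mul(Box<Expr>, Box<Expr>), // e1 * e2
}

fn eval(e: &Expr, x: i64) -> i64 {
    match e {
        Expr::Var => x,
        Expr::Const(c) => *c,
        Expr::Add(a, b) => eval(a, x) + eval(b, x),
        Expr::Mul(a, b) => eval(a, x) * eval(b, x),
    }
}

/// All expressions up to the given depth (bottom-up enumeration).
fn enumerate(depth: u32) -> Vec<Expr> {
    if depth == 0 {
        let mut leaves = vec![Expr::Var];
        leaves.extend((0..=3).map(Expr::Const));
        return leaves;
    }
    let smaller = enumerate(depth - 1);
    let mut out = smaller.clone();
    for a in &smaller {
        for b in &smaller {
            out.push(Expr::Add(Box::new(a.clone()), Box::new(b.clone())));
            out.push(Expr::Mul(Box::new(a.clone()), Box::new(b.clone())));
        }
    }
    out
}

fn synthesize(examples: &[(i64, i64)]) -> Option<Expr> {
    enumerate(2)
        .into_iter()
        .find(|e| examples.iter().all(|&(x, y)| eval(e, x) == y))
}

fn main() {
    // Intent expressed only as examples: f(1) = 3, f(2) = 5, f(5) = 11.
    let examples = [(1, 3), (2, 5), (5, 11)];
    println!("{:?}", synthesize(&examples)); // e.g. Add(Mul(Const(2), Var), Const(1))
}
```

Real synthesizers replace this brute-force search with pruning, deduction, and constraint solving, but the specification-by-example interface is the same.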

WebAssembly: Mechanisation, Security, and Concurrency

Conrad Watt University of Cambridge
12 Dec 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
WebAssembly is the first new language to be introduced to the Web ecosystem in over 20 years. Its official specification is given as a formal semantics, making the language a perfect target for further applications of formal methods. This talk highlights recent work which builds on this formal semantics, and discusses the ongoing development of WebAssembly's relaxed memory model, which is complicated by the language's inter-operation with JavaScript.

Understanding and Evolving the Rust Programming Language

Ralf Jung Max Planck Institute for Software Systems
10 Dec 2019, 10:00 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 8796
SWS Student Defense Talks - Thesis Proposal
Rust is a young and actively developed "systems programming language" that promises to overcome the seemingly fundamental tradeoff between high-level safety guarantees and low-level control over resource management. In this talk I will present my thesis work on understanding and evolving this groundbreaking new language.

The first part of the talk is about building formal foundations for the existing Rust language. Rust employs a strong, ownership-based type system to rule out common systems programming anomalies, but then extends the expressive power of this core type system through libraries that internally use unsafe features. Unfortunately, this "extensible" aspect of the Rust language makes Rust's safety claims rather non-obvious, and there is good reason to question whether they actually hold. To address this, we have developed RustBelt, the first formal (and machine-checked) safety proof for a language representing a realistic subset of Rust. Our proof supports the extensible nature of Rust's safety story in the sense that, for each new Rust library that uses unsafe features, we can say what verification condition it must satisfy in order for it to be deemed a safe and compositional extension to the language.

The second part is about evolving the semantics of Rust to better support compiler optimizations. In particular, Rust's reference types provide strong aliasing guarantees, which a clever optimizing compiler should be able to exploit. However, if no care is taken, unsafe code (which plays an essential role in the Rust ecosystem, as mentioned above) can subvert these guarantees. To address this, we have developed Stacked Borrows, a new operational semantics for memory accesses in Rust. On the one hand, Stacked Borrows defines an aliasing discipline and declares programs violating it to have undefined behavior, thus making it possible for us to formally verify the soundness of a number of useful type-based intraprocedural optimizations. On the other hand, we have also implemented the Stacked Borrows semantics in Miri, an interpreter for Rust, and run large parts of the Rust standard library test suite in Miri. In so doing, we have validated that our semantics is sufficiently permissive to account for real-world unsafe Rust code.
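The following standard example (not taken from the thesis) shows the kind of type-based optimization at stake: because two `&mut` references are guaranteed not to alias, the final read can be folded to a constant, and it is precisely unsafe code that manufactures aliasing mutable references that Stacked Borrows rules out (as undefined behavior) to keep such reasoning sound.

```rust
// A standard aliasing example, not code from the thesis. Since `x` and `y`
// are `&mut` references, Rust guarantees they point to different locations,
// so a compiler may replace the final `*x` with the constant 42. Unsafe code
// that created two aliasing `&mut i32` would break this reasoning; Stacked
// Borrows declares such code to have undefined behavior.
fn writes_then_read(x: &mut i32, y: &mut i32) -> i32 {
    *x = 42;
    *y = 13;
    *x // justifiably optimizable to 42
}

fn main() {
    let mut a = 0;
    let mut b = 0;
    println!("{}", writes_then_read(&mut a, &mut b)); // prints 42
}
```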

Towards Usability in Private Data Analytics

Reinhard Munz Max Planck Institute for Software Systems
06 Dec 2019, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 607
simultaneous videocast to Saarbrücken building E1 5, room 005 / Meeting ID: 6312
SWS Student Defense Talks - Thesis Defense
Private data analytics systems preferably provide required analytic accuracy to analysts and specified privacy to individuals whose data is analyzed. Devising a general system that works for a broad range of datasets and analytic scenarios has proven to be difficult.

Differentially private systems allow for proven formal privacy guarantees. To provide these guarantees, however, differentially private mechanisms often need to add large amounts of noise to statistical results, which impairs their usability. Thus, despite the advent of mechanisms with formal guarantees, industry still uses inferior ad-hoc mechanisms that provide better analytic accuracy.
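For background (the standard Laplace mechanism, not a mechanism from this thesis), the noise/accuracy tension can be seen directly in how ε-differential privacy is typically achieved for a numeric query f:

```latex
\tilde{f}(D) = f(D) + \mathrm{Lap}\!\left(\tfrac{\Delta f}{\varepsilon}\right),
\qquad
\Delta f = \max_{D \sim D'} \left| f(D) - f(D') \right|,
```

where D ∼ D' ranges over datasets differing in a single individual. Stronger privacy (smaller ε) forces proportionally larger noise, which is exactly the accuracy loss that pushes practitioners towards ad-hoc mechanisms.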

In my thesis I follow two approaches to improve the usability of private data analytics systems in general and differentially private systems in particular. First, I revisit ad-hoc mechanisms and explore the possibilities of systems that do not provide Differential Privacy or only a weak version thereof. Based on an attack analysis, I devise a set of new protection mechanisms including Query Based Bookkeeping (QBB). In contrast to previous systems, QBB only requires the history of analysts’ queries in order to provide privacy protection, but does NOT require any knowledge about the protected individuals’ data.

In my second approach I use the insights gained with QBB to propose UniTraX, the first differentially private analytics system that allows analysts to analyze part of a protected dataset without affecting the other parts and without giving up on accuracy. I show UniTraX’s usability by way of multiple case studies on real-world datasets across different domains. In all cases I demonstrate that UniTraX allows more queries than previous differentially private data analytics systems at moderate runtime overheads.

Stronger Higher-order Automation

Sophie Tourret MPI-INF - RG 1
04 Dec 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Automated reasoning in first-order logic (FOL) is becoming a mature research domain. It has led to the development of powerful tools such as superposition-based theorem provers and SMT solvers (Satisfiability Modulo Theory solvers), that have found and continue to find many applications in industry and research.

One such application is the automation of proofs in interactive theorem proving (ITP), where proof assistants are used to write computer-checked proofs of theorems, generally expressed in a variant of higher-order logic (HOL). This automation is realised via "hammers", that attempt to delegate the proof obligations to first-order automated provers. However, in the translation from HOL to FOL, hammers obfuscate the structure of terms and formulas through the application of a sequence of encodings, although it is this very structure that the provers exploit to find proofs efficiently.
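A textbook illustration of that obfuscation (not an example from the talk): the applicative encoding used by typical hammers flattens curried higher-order applications through an explicit application symbol, e.g.

```latex
\underbrace{\;\mathit{map}\ f\ \mathit{xs}\;}_{\text{HOL term}}
\;\;\leadsto\;\;
\underbrace{\;\mathrm{app}(\mathrm{app}(\mathit{map},\, f),\, \mathit{xs})\;}_{\text{first-order encoding}}
```

so the function/argument structure that a higher-order prover could exploit directly ends up buried under app symbols, with further encodings needed for types and λ-abstractions.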

This situation is less than ideal, and if until a few years ago, the ITP and ATP communities were working in parallel, mostly ignoring each other, there is nowadays a trend pushing to bring the two communities closer. The work that I will present in this talk is part of this trend. It includes ongoing research that is either improving higher-order automation with respect to ITP applications or using ITP as a vehicle for ATP research.

Prusti – Deductive Verification for Rust

Alex Summers ETH Zurich
03 Dec 2019, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
Producing reliable systems software is a major challenge, plagued by the ubiquitous problems of shared mutable state, pointer aliasing, dynamic memory management, and subtle concurrency issues such as race conditions; even expert programmers struggle to tame the wide variety of reasons why their programs may not behave as they intended. Formal verification offers potential solutions to many of these problems, but typically at a very high price: the mathematical techniques employed are highly-complex, and difficult for even expert researchers to understand and apply.

The relatively-new Rust programming language is designed to help with the former problem: a powerful ownership type system requires programmers to specify and restrict their discipline for referencing heap locations, providing in return the strong guarantee (almost – see the talk, and RustBelt!) that code type-checked by this system will be free from dangling pointers, unexpected aliasing, race conditions and the like. While this rules out a number of common errors, the question of whether a program behaves as intended remains.

In this talk, I’ll give an overview of the Prusti project, which leverages Rust’s type system and compiler analyses for formal verification. By combining the rich information available about a type-checked Rust program with separate user-specification of intended behaviour, Prusti enables a user to verify functional correctness of their code without interacting with a complex program logic; in particular, specifications and all interactions with our implemented tool are at the level of abstraction of Rust expressions.
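As a minimal sketch of the kind of property involved (the attribute names in the comments below are assumptions about the annotation style, not necessarily Prusti's exact syntax): the type checker accepts `max` no matter which argument it returns, while a contract written with ordinary Rust expressions pins down the intended behavior for the verifier.

```rust
// Sketch only: the commented attributes indicate the style of a Prusti-like
// contract (assumed syntax); the function itself is plain, runnable Rust.
//
//   #[ensures(result >= a && result >= b)]
//   #[ensures(result == a || result == b)]
fn max(a: i32, b: i32) -> i32 {
    if a >= b { a } else { b }
}

fn main() {
    // Without a verifier, the same contract can only be checked dynamically.
    let (a, b) = (3, 7);
    let r = max(a, b);
    assert!(r >= a && r >= b && (r == a || r == b));
    println!("max({}, {}) = {}", a, b, r);
}
```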

Personal Knowledge Extraction: What Can Be Inferred From What You Say and Do

Paramita Mirza MPI-INF - D5
06 Nov 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Despite recent advances in natural language processing and generation, communication between humans and machines is still in its infancy. Existing intelligent home and mobile assistant technologies excel at scripted tasks such as weather or news reports and music control, yet typically fail at more advanced personalization. This calls for a centralized repository for personal knowledge about each user, which will then be a distant source of background knowledge for personalization in downstream applications. Such a personal knowledge repository will be beneficial as a reusable asset; it should be both explainable and scrutable, giving the owning user full control over editing and sharing stored information with selected service providers.

In this talk, I will discuss our efforts on automated personal knowledge extraction. We can easily obtain personal knowledge of famous people from biographies or news articles, however, such resources are not available for ordinary users. Hence, we turn to the task of inferring personal attributes from users' utterances in conversations, e.g., guessing a person's occupation from "I was sitting the whole day in front of my computer today, trying to finish a grant proposal for my research." I will highlight our Hidden Attribute Models (HAM) to solve the problem, a neural architecture leveraging attention mechanisms and embeddings, as well as an ongoing work on its extension to address challenging attributes such as hobbies and travel preferences with wide sets of multi-faceted attribute values. Finally, I will present an outlook on what we can further infer from users' activities, particularly in relation with their mood and emotion.

Dealing with Epidemics under Uncertainty

Jessica Hoffmann University of Texas at Austin
04 Nov 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
Epidemic processes can model anything that spreads. As such, they are a useful tool for studying not only human diseases, but also network attacks, chains of activation in the brain, the propagation of real or fake news, the spread of viral tweets, and other processes. In this talk, we investigate epidemics spreading on a graph in the presence of various forms of uncertainty. We present in particular a result about controlling the spread of an epidemic when there is uncertainty about who exactly is infected. We show first that neither algorithms nor results are robust to uncertainty. In other words, uncertainty fundamentally changes how we must approach epidemics on graphs. We also present two related results about learning the graph underlying an epidemic process when there is uncertainty about when people were infected or what infected them.

Knowledge and Information Dissemination: Models and Methods

Utkarsh Upadhyay Max Planck Institute for Software Systems
17 Oct 2019, 4:00 pm - 5:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Student Defense Talks - Thesis Proposal
In the past, information and knowledge dissemination was relegated to the brick-and-mortar classrooms, newspapers, radio, and television. As these processes were simple and centralized, the models behind them were well understood and so were the empirical methods for optimizing them. In today's world, the internet and social media have become a powerful tool for information and knowledge dissemination: Wikipedia gets more than 1 million edits per day, Stack Overflow has more than 17 million questions, 25% of the US population visits Yahoo! News for articles and discussions, Twitter has more than 60 million active monthly users, and Duolingo has 25 million users learning languages online.

These developments have introduced a paradigm shift in the process of dissemination. Not only has the nature of the task moved from being centralized to decentralized, but the developments have also blurred the boundary between the creator and the consumer of the content, i.e., information and knowledge. These changes have made it necessary to develop new models, which are better suited to understanding and analysing the dissemination, and to develop new methods to optimize them.

At a broad level, we can view the participation of users in the process of dissemination as falling in one of two settings: collaborative or competitive. In the collaborative setting, the participants work together in crafting knowledge online, e.g., by asking questions and contributing answers, or by discussing news or opinion pieces. In contrast, as competitors, they vie for the attention of their followers on social media. The first part of the thesis will propose models for the complexity of discussions and the evolution of expertise. The latter part of the thesis will explore the competitive setting where I will propose computational methods for measuring, and increasing, the attention received from followers on social media.

Non-Reformist Reform for Haskell Modularity

Scott Kilpatrick Max Planck Institute for Software Systems
15 Oct 2019, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6747
SWS Student Defense Talks - Thesis Defense
Module systems like that of Haskell permit only a weak form of modularity in which module implementations depend directly on other implementations and must be processed in dependency order. Module systems like that of ML, on the other hand, permit a stronger form of modularity in which explicit interfaces express assumptions about dependencies and each module can be typechecked and reasoned about independently.

In this thesis, I present Backpack, a new language for building separately-typecheckable packages on top of a weak module system like Haskell’s. The design of Backpack is the first to bring the rich world of type systems to the practical world of packages via mixin modules. It’s inspired by the MixML module calculus of Rossberg and Dreyer, but by choosing practicality over expressivity, Backpack both simplifies that semantics and supports a flexible notion of applicative instantiation. Moreover, this design is motivated less by foundational concerns and more by the practical concern of integration into Haskell. The result is a new approach to writing modular software at the scale of packages.

The semantics of Backpack is defined via elaboration into sets of Haskell modules and binary interface files, thus showing how Backpack maintains interoperability with Haskell while retrofitting it with interfaces. In my formalization of Backpack I present a novel type system for Haskell modules and I prove a key soundness theorem to validate Backpack’s semantics.

Are We Susceptible to Rowhammer? An End-to-End Methodology for Cloud Providers

Stefan Saroiu Microsoft Research, Redmond
07 Oct 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 113 / Meeting ID: 6312
SWS Colloquium
Cloud providers are nervous about recent research showing how Rowhammer attacks affect many types of DRAM including DDR4 and ECC-equipped DRAM.  Unfortunately, cloud providers lack a systematic way to test the DRAM present in their servers for the threat of a Rowhammer attack. Building such a methodology needs to overcome two difficult challenges: (1) devising a CPU instruction sequence that maximizes the rate of DRAM row activations on a given system, and (2) determining the adjacency of rows internal to DRAM. This talk will present an end-to-end methodology that overcomes these challenges to determine if cloud servers are susceptible to Rowhammer attacks. With our methodology, a cloud provider can construct worst-case testing conditions for DRAM.

We used our methodology to create worst-case DRAM testing conditions on the hardware used by a major cloud provider for a recent generation of its servers. Our findings show that none of the instruction sequences used in prior work to mount Rowhammer attacks create worst-case DRAM testing conditions. Instead, we construct an instruction sequence that issues non-explicit load and store instructions. Our new sequence leverages microarchitectural side-effects to "hammer" DRAM at a near-optimal rate on modern Skylake platforms. We also designed a DDR4 fault injector capable of reverse engineering row adjacency inside a DRAM device. When applied to our cloud provider's DIMMs, we find that rows inside DDR4 DRAM devices do not always follow a linear map.
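For context, the kernel below is a hedged sketch of the classic "flush and read" sequence that prior Rowhammer work relied on, and which this methodology shows does not create worst-case conditions. The addresses here are arbitrary heap locations, so the snippet is a harmless illustration rather than an attack, and it is x86_64-specific.

```rust
// Classic "read + clflush" hammer loop used in prior Rowhammer work (the talk
// argues this is NOT worst-case). Harmless as written: `a` and `b` are just
// two slots of an ordinary heap buffer, not carefully chosen DRAM rows.
#[cfg(target_arch = "x86_64")]
fn hammer(a: *const u8, b: *const u8, iters: usize) {
    use core::arch::x86_64::_mm_clflush;
    for _ in 0..iters {
        unsafe {
            let _ = core::ptr::read_volatile(a); // activate the row holding `a`
            let _ = core::ptr::read_volatile(b); // activate the row holding `b`
            _mm_clflush(a); // evict so the next read goes to DRAM again
            _mm_clflush(b);
        }
    }
}

#[cfg(target_arch = "x86_64")]
fn main() {
    let buf = vec![0u8; 1 << 20];
    hammer(&buf[0], &buf[buf.len() - 1], 1_000_000);
    println!("done hammering (illustration only)");
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}
```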

Joint work with Lucian Cojocar (VU Amsterdam), Jeremie Kim, Minesh Patel, Onur Mutlu (ETH Zurich), Lily Tsai (MIT), and Alec Wolman (MSR)

Efficient Optimization for Very Large Combinatorial Problems in Computer Vision and Machine Learning

Paul Swoboda MPI-INF - D2
02 Oct 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
In computer vision and machine learning, combinatorial optimization problems are widespread, typically NP-hard, and tend to pose unique challenges due to their very large scale and problem structure. Established techniques from the mathematical optimization community cannot cope with the encountered problem sizes and do not exploit special problem characteristics. In this talk I will present several new solution paradigms for solving large scale combinatorial problems in computer vision efficiently and to high accuracy. I will discuss how these principles can be applied to classical problems of combinatorial optimization that have found wide use in computer vision, machine learning and computer graphics, namely inference in Markov Random Fields, the quadratic assignment problem and graph decomposition. Lastly, I will present empirical results demonstrating the strong practical performance of the presented techniques.

Toward Cognitive Security

Claude Castelluccia Inria
02 Oct 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Online services, devices or secret services are constantly collecting data and meta-data from users. This data collection is mostly used to target users or customise their services. However, as illustrated by the Cambridge Analytica case, data and technologies are increasingly used to manipulate, influence or shape people's opinions online, i.e. to "hack" our brains. In this context, it is urgent to develop the field of "Cognitive security" in order to better comprehend these attacks and provide counter-measures. This talk will introduce the concept of "Cognitive security". We will explore the different types of cognitive attacks and discuss possible research directions.

Human-Centered Design and Data Science for Good

Maria Rauschenberger Universitat Pompeu Fabra
30 Sep 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112 / Meeting ID: 9312
SWS Colloquium
How can we make better applications for social impact issues? For example, the combination of Human-Centered Design (HCD) and Data Science (DS) can be the answer to avoid biases in the collection of data with online-experiments and the analysis of small data. This presentation shows how we combine HCD and DS to design applications and analyze the collected data for Good.  We will focus mainly on the project: "Early screening of dyslexia using a language-independent content game and machine learning". With our two designed games (MusVis and DGames), we collected data sets (313 and 137 participants) in different languages (mainly Spanish and German) and evaluated them with machine learning classifiers. For MusVis, we mainly use content that refers to one single acoustic or visual indicator, while DGames content refers to generic content related to various indicators. Our results open the possibility of low-cost and early screening of dyslexia through the Web. In this talk, we will further address the techniques used from HCD and DS to reach these results. 

Accelerating Network Applications with Stateful TCP Offloading

YoungGyoun Moon KAIST
24 Sep 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
The performance of modern key-value servers or layer-7 load balancers often heavily depends on the efficiency of the underlying TCP stack. Despite numerous optimizations such as kernel-bypassing and zero-copying, performance improvement for TCP applications is fundamentally limited due to the protocol conformance overhead for compatible TCP operations.

In this talk, I will introduce AccelTCP, a hardware-assisted TCP stack architecture that harnesses programmable network interface cards (NICs) as a TCP protocol accelerator. AccelTCP can offload complex TCP operations such as connection setup and teardown completely to the NIC, which frees a significant amount of host CPU cycles for application processing. In addition, for layer-7 proxies, it supports running connection splicing on the NIC so that the NIC relays all packets of the spliced connections with zero DMA overhead. We showcase the effectiveness of AccelTCP with two real-world applications: (1) Redis, a popular in-memory key-value store, and (2) HAProxy, a widely-used layer-7 load balancer. Our evaluation shows that AccelTCP improves their performance by 2.3x and 11.9x, respectively.

Synthesis from within: implementing automated synthesis inside an SMT solver

Cesare Tinelli University of Iowa
16 Sep 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
Recent research in automated software synthesis from specifications or observations has leveraged the power of SMT solvers in order to explore the space of synthesis conjectures efficiently. In most of this work, synthesis techniques are built around a backend SMT solver which is used as a black-box reasoning engine. In this talk, I will describe a successful multiyear research effort by the developers of the SMT solver CVC4 that instead incorporates synthesis capabilities directly within the solver, and then discuss the advances in performance and scope made possible by this approach.

Computational Fabrication: 3D Printing and Beyond

Vahid Babaei MPI-INF - D4
04 Sep 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
The objective of my talk is to introduce the audience to the exciting field of computational fabrication. The recent, wide availability of 3D printers has triggered considerable interest in academia and industry. Computer scientists could engage with hands-on 3D printing and very soon realize the immense but untapped potential of the manufacturing industry for computational methods. In this talk, I will explain the principles of 3D printing (also known as additive manufacturing) both from hardware and software viewpoints. I will then show examples of recent research addressing computational problems in both 3D printing and general manufacturing. I will also discuss my main research interest, i.e. computational fabrication of visual appearance. The appearance of objects is among their most important and most complicated properties, one that influences or in numerous cases defines their function. I will show that additive manufacturing provides unprecedented opportunities to create products with novel and useful appearance properties.

A type theory for amortized resource analysis

Vineet Rajani Max Planck Institute for Software Systems
27 Aug 2019, 2:00 pm - 3:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Student Defense Talks - Thesis Proposal
Amortized analysis is a standard algorithmic technique for estimating upper bounds on the average costs of functions, specifically operations on data structures. This thesis intends to develop λ-amor, a type-theory for amortized analysis of higher-order functional programs. A typical amortized analysis works by storing a ghost resource called potential with a data structure's internal state. Accordingly, the central idea in λ-amor is a type-theoretic construct to associate potential with an arbitrary type. Additionally, λ-amor relies on standard concepts from substructural and modal type systems: indexed monads, affine types and indexed exponential types. We show that λ-amor is not only sound (in a very elementary logical relations model), but also very expressive: It can be used to analyze both eager and lazy data structures, and it can embed existing resource analysis frameworks. In fact, λ-amor is complete for the cost analysis of lazy PCF programs. Further, the basic principles behind λ-amor can be adapted (by dropping affineness and adding mutable state) to obtain an expressive type system for a completely unrelated application, namely, information flow control.
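For readers who have not seen the potential method (standard background, not λ-amor's formulation), the bookkeeping it type-checks is the usual one: the amortized cost of an operation is its actual cost plus the change in potential, and the sum telescopes into a bound on total cost:

```latex
\hat{c}_i = c_i + \Phi(d_i) - \Phi(d_{i-1}),
\qquad
\sum_{i=1}^{n} c_i \;=\; \sum_{i=1}^{n} \hat{c}_i \;-\; \Phi(d_n) + \Phi(d_0)
\;\le\; \sum_{i=1}^{n} \hat{c}_i
\quad \text{when } \Phi(d_n) \ge \Phi(d_0).
```

λ-amor's central construct makes this Φ a first-class type former, so potential can be attached to values of arbitrary type and tracked by the affine type system.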

The proposal talk will cover the broad setting and the motivation of the work and a significant subset of λ-amor, but due to time constraints, it will not cover all of λ-amor or the adaptation to information flow control. Implementation of the two type theories is not in the scope of the thesis.

Modeling and Individualizing Learning in Computer-Based Environments

Tanja Käser Stanford University
21 Aug 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112 / Meeting ID: 6312
SWS Colloquium
Learning technologies are becoming increasingly important in today's education. This includes game-based learning and simulations, which produce high volume output, and MOOCs (massive open online courses), which reach a broad and diverse audience at scale. The users of such systems often come from very different backgrounds, for example in terms of age, prior knowledge, and learning speed. Adaptation to the specific needs of the individual user is therefore essential. In this talk, I will present two of my contributions on modeling and predicting student learning in computer-based environments with the goal of enabling individualization. The first contribution introduces a new model and algorithm for representing and predicting student knowledge. The new approach is efficient and has been demonstrated to outperform previous work regarding prediction accuracy. The second contribution introduces models that take into account not only the accuracy of the user but also the user's inquiry strategies, improving the prediction of future learning. Furthermore, students can be clustered into groups with different strategies and targeted interventions can be designed based on these strategies. Finally, I will also describe lines of future research.

Computer Science for Numerics

Martin Ziegler KAIST
19 Jul 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Since the introduction of the IEEE 754 floating point standard in 1985, numerical methods have become ubiquitous --- and increasingly sophisticated. As the code complexity of numerical libraries grows, so does the need for rigorous Software Engineering methodology: such methodology is well established in Computer Science for the digital processing of discrete data, but it is lacking in the continuous realm. We apply, adapt, and extend the classical concepts --- specification, algorithmics, analysis, complexity, verification --- from discrete bit strings, integers, graphs etc. to real numbers, converging sequences, smooth/integrable functions, bounded operators, and compact subsets: A new paradigm formalizes mathematical structures as continuous Abstract Data Types with rigorous Turing-computable semantics but without the hassle of actual Turing machines.
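One concrete instance of such a continuous abstract data type (standard computable analysis, not an example specific to the talk): a real number is specified not by a finite bit string but by a fast-converging sequence of rational approximations, and operations must map representations to representations:

```latex
x \in \mathbb{R} \text{ is represented by } (q_n)_{n\in\mathbb{N}} \subseteq \mathbb{Q}
\text{ with } |x - q_n| \le 2^{-n};
\qquad
\text{addition may output } s_n = q_{n+1} + r_{n+1},
\text{ since } |(x+y) - s_n| \le 2^{-n}.
```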

Correct Compilation of Relaxed Memory Concurrency

Soham Chakraborty Max Planck Institute for Software Systems
16 Jul 2019, 1:00 pm - 2:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 005 / Meeting ID: 6312
SWS Student Defense Talks - Thesis Defense
Shared memory concurrency is the pervasive programming model for multicore architectures such as X86, Power, and ARM. Depending on the memory organization, each architecture follows a somewhat different shared memory model. All these models, however, have one common feature: they allow certain outcomes for concurrent programs that cannot be explained by interleaving execution. In addition to the complexity due to architectures, compilers like GCC and LLVM perform various program transformations, which also affect the outcomes of concurrent programs.

To be able to program these systems correctly and effectively, it is important to define a formal language-level concurrency model. For efficiency, it is important that the model is weak enough to allow various compiler optimizations on shared memory accesses as well as efficient mappings to the architectures. For programmability, the model should be strong enough to disallow bogus "out-of-thin-air" executions and provide strong guarantees for well synchronized programs. Because of these conflicting requirements, defining such a formal model is very difficult. This is why, despite years of research, major programming languages such as C/C++ and Java do not yet have completely adequate formal models defining their concurrency semantics.

In this thesis, we address this challenge and develop a formal concurrency model that is very good both in terms of compilation efficiency and of programmability. Unlike most previous approaches, which were defined either operationally or axiomatically on single executions, our formal model is based on event structures, which represent multiple program executions and thus give us more structure to define the semantics of concurrency.

In more detail, our formalization has two variants: the weaker version, WEAKEST, and the stronger version, WEAKESTMO. The WEAKEST model simulates the promising semantics proposed by Kang et al., while WEAKESTMO is incomparable to the promising semantics. Moreover, WEAKESTMO discards certain questionable behaviors allowed by the promising semantics. We show that the proposed WEAKESTMO model resolves the out-of-thin-air problem, provides standard data-race-freedom (DRF) guarantees, allows the desirable optimizations, and can be mapped to architectures like X86, PowerPC, and ARMv7. Our models are also flexible enough to leverage existing results from the literature. In addition, in order to ensure the correctness of compilation by a major compiler, we developed a translation validator targeting LLVM’s "opt" transformations of concurrent C/C++ programs. Using the validator, we identified a few subtle compilation bugs, which were reported and have been fixed. We also observe that LLVM concurrency semantics differs from that of C11; there are transformations which are justified in C11 but not in LLVM and vice versa. Considering the subtle aspects of LLVM concurrency, we formalized a fragment of LLVM’s concurrency semantics and integrated it into our WEAKESTMO model.
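The classic load-buffering litmus test below (a standard example, not code from the thesis) is the shape of program behind the out-of-thin-air problem: under fully relaxed atomics, a naive axiomatic semantics permits both threads to read a value such as 42 that appears nowhere in the program, which is exactly the kind of outcome a language-level model must rule out while still allowing hardware-level reorderings.

```rust
// Load-buffering litmus test (standard out-of-thin-air example, not from the
// thesis). Each thread reads one variable and forwards the value into the
// other. A sound language-level model must forbid the "out of thin air"
// outcome r1 == r2 == 42, since 42 is never written by the program itself.
use std::sync::atomic::{AtomicI32, Ordering::Relaxed};
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);
static Y: AtomicI32 = AtomicI32::new(0);

fn main() {
    let t1 = thread::spawn(|| {
        let r1 = X.load(Relaxed); // read X ...
        Y.store(r1, Relaxed);     // ... and forward it into Y
        r1
    });
    let t2 = thread::spawn(|| {
        let r2 = Y.load(Relaxed); // read Y ...
        X.store(r2, Relaxed);     // ... and forward it into X
        r2
    });
    let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
    println!("r1 = {}, r2 = {}", r1, r2); // should only ever print 0, 0
}
```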

Design Problems: Trustworthy Smart Devices and 3D Printed Lace

Mary Baker HP Labs in Palo Alto
15 Jul 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Distinguished Lecture Series
A growing number of domestic spaces incorporate products that collect data from cameras, microphones and other sensors, leading to privacy concerns. In this talk I report on two user studies performed to learn about perceptions of privacy and trust for sensor-enabled, connected devices such as smart home assistants. The study results suggest that users are more likely to trust devices with materially representative privacy status indicators. This means that the indicators themselves are part of what determines what sensing can take place. I will describe how we have applied the study results to the design of current devices and what the implications are for the physical design of future smart devices.

Time permitting, I will also talk about my other current passion -- design for additive manufacturing – and what researchers can do to ensure we reach the vastly exciting potential of this method of production. I will bring exotic 3D printed parts to help demonstrate my points.

Automated Program Repair

Abhik Roychoudhury National University of Singapore
08 Jul 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Distinguished Lecture Series
Automated program repair is an emerging and exciting field of research, which allows for automated rectification of errors and vulnerabilities. The uses of automated program repair are myriad, such as (a) improving programmer productivity, (b) automated fixing of security vulnerabilities as they are detected, (c) self-healing software for autonomous devices such as drones, as well as (d) use of repair in introductory programming education by grading and providing hints for programming assignments. One of the key technical challenges in achieving automated program repair is the lack of formal specifications of intended program behavior. In this talk, we will conceptualize the use of symbolic execution approaches and tools for extracting such specifications. This is done by analyzing a buggy program against selected tests, or against reference implementations. Such specification inference capability can be combined with program synthesis techniques to automatically repair programs. The capability of specification inference also represents a novel use of symbolic execution beyond verification and the navigation of large search spaces. Automated program repair via symbolic execution goes beyond search-based approaches, which attempt to lift patches from elsewhere in the program. Such an approach can construct "imaginative" patches, serves as a test-bed for the grand challenge of automated programming, and contributes to the vision of trustworthy self-healing software. Towards the end of the talk, we will put the research on automated repair in light of the overall practice of software security, by sharing some experiences gained at the Singapore Cyber-security Consortium.

The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security

Mario Fritz CISPA
03 Jul 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Today, vast amounts of visual information are collected and often also shared online. Such images and videos can contain various types of privacy-sensitive information that can nowadays be extracted automatically at a large scale, posing a steadily growing threat to users' privacy. I'll give an overview of our efforts towards understanding and controlling privacy in visual information as well as working towards our overall vision of a Visual Privacy Advisor. More generally speaking, we have seen a quick adoption of machine learning technology in a broad range of application scenarios. With such broad deployment, these approaches become part of the attack surface of modern IT infrastructures and therefore new privacy and security risks emerge. Hence, we research attack vectors and defenses of such intelligent systems built on AI and machine learning technology. In particular, I will talk about our latest work on membership inference and model stealing.

Fake News During the 2016 U.S. Presidential Elections: Prevalence, Agenda, and Stickiness.

Ceren Budak University of Michigan
10 Jun 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 005
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
The spread of fake news was one of the most discussed characteristics of the 2016 U.S. Presidential Election. The concerns regarding fake news have garnered significant attention in both media and policy circles, with some journalists even going as far as claiming that results of the 2016 election were a consequence of the spread of fake news. Yet, little is known about the prevalence and focus of such content, how its prevalence changed over time, and how this prevalence related to important election dynamics. In this talk, I will address these questions by examining social media, news media, and interview data. These datasets allow examining the interplay between news media production and consumption, social media behavior, and the information the electorate retained about the presidential candidates leading up to the election.

Bridging the Performance Gap in Digital Geometry Processing

Rhaleb Zayer MPI-INF - D4
05 Jun 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
As the computing landscape is being reshaped by the dramatic shift towards ubiquitous parallelism, and by the sheer scale of data, extracting performance from existing applications gives rise to formidable challenges. In digital geometry processing, the problem gets amplified by data irregularity (e.g. meshes) and the predominantly serial nature of traditional algorithmic solutions. As a result, the gap between the high performance promise of modern hardware and the actual performance seems to grow wider.

In this talk, I will discuss the impact of data structures and problem abstraction on performance. In particular, I will outline how high performance can be gained through a lean data representation which allows channeling parallelism through linear algebra kernels regardless of the underlying granularity. I will illustrate the impact of problem abstraction on challenging and far reaching scenarios including Voronoi diagrams (VD)/centroidal Voronoi tessellations (CVT) on surface meshes, subdivision surfaces, as well as matrix assembly in finite element analysis.
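A minimal sketch of that idea (my illustration, not the talk's actual kernels): once mesh connectivity is stored as a sparse vertex-face incidence matrix in CSR form, a per-vertex query such as "how many faces touch each vertex" becomes a sparse matrix-vector product, a kernel that parallelizes well regardless of mesh irregularity.

```rust
// Illustration only (not the talk's implementation): a triangle mesh's
// vertex-face incidence stored in CSR form, so that counting the faces
// incident to each vertex becomes a sparse matrix-vector product y = A * 1
// instead of an irregular pointer-based traversal.
struct Csr {
    row_ptr: Vec<usize>, // row i occupies col_idx[row_ptr[i]..row_ptr[i+1]]
    col_idx: Vec<usize>,
    vals: Vec<f64>,
}

impl Csr {
    fn spmv(&self, x: &[f64]) -> Vec<f64> {
        let n = self.row_ptr.len() - 1;
        (0..n)
            .map(|i| {
                (self.row_ptr[i]..self.row_ptr[i + 1])
                    .map(|k| self.vals[k] * x[self.col_idx[k]])
                    .sum::<f64>()
            })
            .collect()
    }
}

/// Build the |V| x |F| vertex-face incidence matrix from triangle indices.
fn incidence(num_vertices: usize, triangles: &[[usize; 3]]) -> Csr {
    let mut rows: Vec<Vec<usize>> = vec![Vec::new(); num_vertices];
    for (f, tri) in triangles.iter().enumerate() {
        for &v in tri {
            rows[v].push(f);
        }
    }
    let mut row_ptr = vec![0];
    let mut col_idx = Vec::new();
    for r in &rows {
        col_idx.extend_from_slice(r);
        row_ptr.push(col_idx.len());
    }
    let vals = vec![1.0; col_idx.len()];
    Csr { row_ptr, col_idx, vals }
}

fn main() {
    // Two triangles sharing an edge: vertices 0-1-2 and 1-2-3.
    let a = incidence(4, &[[0, 1, 2], [1, 2, 3]]);
    let ones = vec![1.0; 2];
    println!("{:?}", a.spmv(&ones)); // [1.0, 2.0, 2.0, 1.0] faces per vertex
}
```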

Automated Test Generation: A Journey from Symbolic Execution to Smart Fuzzing and Beyond

Koushik Sen UC Berkeley
04 Jun 2019, 10:30 am - 11:45 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Distinguished Lecture Series
In the last two decades, automation has had a significant impact on software testing and analysis. Automated testing techniques, such as symbolic execution, concolic testing, and feedback-directed fuzzing, have found numerous critical faults, security vulnerabilities, and performance bottlenecks in mature and well-tested software systems. The key strength of automated techniques is their ability to quickly search state spaces by performing repetitive and expensive computational tasks at a rate far beyond the human attention span and computation speed. In this talk, I will give a brief overview of our past and recent research contributions in automated test generation using symbolic execution, program analysis, constraint solving, and fuzzing. I will also describe a new technique, called constraint-directed fuzzing, where given a pre-condition on a program as a logical formula, we can efficiently generate millions of test inputs satisfying the pre-condition.

High Performance Operating Systems in the Data Center

Tom Anderson University of Washington
31 May 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
The ongoing shift of enterprise computing to the cloud provides an opportunity to rethink operating systems for this new setting. I will discuss two specific technologies, kernel bypass for high performance networking and low latency non-volatile storage, and their implications for operating system design. In each case, delivering the performance of the underlying hardware requires novel approaches to the division of labor between hardware, the operating system kernel, and the application library.

Systematic Approach to Managing Software Defined Networks

Theophilus Benson Brown University
23 May 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Software-defined Networks and programmable data-planes represent a paradigm shift in networking that enables novel applications. Despite growing interest and adoption, SDNs remain plagued with availability and performance problems. In this talk, I discuss recent and ongoing work by my group to analyze these paradigms and create systematic abstractions that provide control over performance and availability. First, I will discuss Tardis, a system that improves fault tolerance by leveraging the novel programmability provided by SDNs to identify and transform the failure-inducing event(s). Second, I will discuss a pair of projects, Hermes and SCC, that revisit traditional storage principles and apply them to network updates. Through this work, I will demonstrate how the centralization and programmability offered by SDNs enable us to more systematically reason about traditional networking issues such as availability and performance.

On the Predictability of Heterogeneous SoC Multicore Platforms

Dr Giovani Gracioli Technical University Munich
20 May 2019, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
Multiprocessor Systems-on-Chip (MPSoC) integrating hard processing cores with programmable logic (PL) are becoming increasingly common. Although these platforms were originally designed for high-performance computing applications, their rich feature set can be exploited to efficiently implement mixed-criticality domains serving both critical hard real-time tasks and soft real-time tasks.

In this talk, we show how one can tailor these MPSoCs to support a mixed-criticality system, where cores are strictly isolated to avoid contention on shared resources such as the Last-Level Cache (LLC) and main memory. We present and discuss a set of software and hardware techniques to improve predictability on a modern MPSoC platform. We evaluate our techniques using an image processing application and show the maximum supported processing frequency.


Transparent Scaling of Deep Learning Systems through Dataflow Graph Analysis

Jinyang Li New York University
17 May 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Distinguished Lecture Series
As deep learning research pushes towards using larger and more sophisticated models, system infrastructure must use many GPUs efficiently. Analyzing the dataflow graph that represents the DNN computation is a promising avenue for optimization. By specializing execution for a given dataflow graph, we can accelerate DNN computation in ways that are transparent to programmers. In this talk, I show the benefits of dataflow graph analysis by discussing two recent systems that we've built to support large model training and low-latency inference. To train very large DNN models, Tofu automatically re-writes a dataflow graph of tensor operators into an equivalent parallel graph in which each original operator can be executed in parallel across multiple GPUs.  To achieve low-latency inference, Batchmaker discovers identical sub-graph computation among different requests to enable batched execution of requests arriving at different times. 
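
As a loose illustration of the kind of rewrite Tofu performs, the sketch below partitions a single matrix-multiplication operator along its row dimension so that the shards could run on different devices. NumPy stands in for per-GPU execution, and the partitioning choice is an assumption made for illustration rather than Tofu's actual rewriting rules.

```python
# Toy illustration of partitioning one tensor operator (matmul) across "devices"
# by splitting an input along its row dimension. This shows only the data
# partitioning; placement and communication planning are not modeled here.
import numpy as np

def parallel_matmul(A, B, num_devices=4):
    shards = np.array_split(A, num_devices, axis=0)   # each shard maps to one device
    partial = [shard @ B for shard in shards]         # independent, parallelizable work
    return np.concatenate(partial, axis=0)            # concatenation restores the result

A = np.random.rand(8, 5)
B = np.random.rand(5, 3)
assert np.allclose(parallel_matmul(A, B), A @ B)
```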

Humans and Machines: From Data Elicitation to Helper-AI

Goran Radanovic Harvard University
16 May 2019, 2:00 pm - 3:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Recent AI advances have been driven by high-quality input data, often labeled by human annotators. A fundamental challenge in eliciting high-quality information from humans is that there is often no way to directly verify the quality of the information they provide. Consider, for example, product reviews and marketing surveys where data is inherently subjective, environmental community sensing where data is highly localized, and geopolitical forecasting where the ground truth is revealed in the distant future. In these settings, data elicitation has to rely on peer-consistency mechanisms, which incentivize high-quality reporting by examining the consistency between the reports of different data providers. In this talk, I will discuss some of the recent advances in peer-consistency designs. Furthermore, I will outline some thoughts on an agenda around the design of human-AI collaborative systems.

Programming Abstractions for Verifiable Software

Damien Zufferey Max Planck Institute for Software Systems
15 May 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
In this talk, I will show how we can harness the synergy between programming languages and verification methods to help programmers build reliable software. First, we will look at fault-tolerant distributed algorithms. These algorithms are central to any high-availability application, but they are notoriously difficult to implement due to asynchronous communication and faults. A fault-tolerant consensus algorithm that can be described in ~50 lines of pseudocode can easily turn into a few thousand lines of actual code. To remedy this, I will introduce PSync, a domain-specific language for fault-tolerant distributed algorithms. The key insight is the use of communication-closure (logical boundaries in a program that messages should not cross) to structure the code. Communication-closure gives a syntactic scope to the communication, provides some form of logical time, and gives the illusion of synchrony. These elements greatly simplify the programming and verification of fault-tolerant algorithms. In the second part of the talk, we will discuss a new project exploring how advances in rapid prototyping (3D printers) may impact how we develop software for robots. These advances may soon enable adding computational elements as part of the internal structure of objects. The goal of this project is to rethink the software/hardware boundary and integrate the two together. I will present a system we are developing where components integrate geometry (hardware) and behavior (software). The system allows for bottom-up composition and top-down decomposition. The bottom-up composition connects components together to achieve more complex behaviors. The top-down decomposition projects a global specification onto the individual components and performs verification at the level of individual components.

Conclave: Secure Multi-Party Computation on Big Data

Nikolaj Volgushev Boston University
15 May 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
Secure Multi-Party Computation (MPC) allows mutually distrusting parties to run joint computations without revealing private data. Current MPC algorithms scale poorly with data size, which makes MPC on "big data" prohibitively slow and inhibits its practical use. Many relational analytics queries can maintain MPC’s end-to-end security guarantee without using cryptographic MPC techniques for all operations. Conclave is a query compiler that accelerates such queries by transforming them into a combination of data-parallel, local cleartext processing and small MPC steps. When parties trust others with specific subsets of the data, Conclave applies new hybrid MPC-cleartext protocols to run additional steps outside of MPC and improve scalability further. Our Conclave prototype generates code for cleartext processing in Python and Spark, and for secure MPC using the Sharemind and Obliv-C frameworks. Conclave scales to data sets between three and six orders of magnitude larger than state-of-the-art MPC frameworks support on their own. Thanks to its hybrid protocols and additional optimizations, Conclave also substantially outperforms SMCQL, the most similar existing system.
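
To give a flavor of the hybrid cleartext/MPC split that Conclave automates, here is a hand-written sketch of a per-category sum over two parties' data: each party pre-aggregates its own rows in cleartext so that only small summaries would enter the expensive MPC step. The secure_sum placeholder and the data are illustrative assumptions; the real system compiles queries to Spark, Sharemind, and Obliv-C.

```python
# Toy sketch of the hybrid idea: each party pre-aggregates its own rows in
# cleartext, so only small per-party summaries enter the (expensive) MPC step.
# secure_sum() is a placeholder for a real MPC protocol, not Conclave's backend.
from collections import Counter

def local_preaggregate(rows):
    # rows: list of (category, amount) owned by one party; cleartext is fine here.
    totals = Counter()
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)

def secure_sum(per_party_summaries):
    # Placeholder: under MPC, no party would learn another party's per-category
    # totals, only the joint result computed here in the clear for illustration.
    joint = Counter()
    for summary in per_party_summaries:
        joint.update(summary)
    return dict(joint)

party_a = [("books", 10), ("music", 5), ("books", 7)]
party_b = [("books", 2), ("games", 9)]
print(secure_sum([local_preaggregate(party_a), local_preaggregate(party_b)]))
```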

A constructive proof of dependent choice in classical arithmetic via memoization

Étienne Miquey Inria, Nantes
09 May 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
In 2012, Herbelin developed a calculus (dPAω) in which constructive proofs for the axioms of countable and dependent choices can be derived via the memoization of choice functions. However, the property of normalization (and therefore that of soundness) was only conjectured. The difficulty in the proof of normalization is due to the simultaneous presence of dependent types (for the constructive part of the choice), of control operators (for classical logic), of coinductive objects (to encode functions of type ℕ→A as streams (a₀,a₁,…)), and of lazy evaluation with sharing (for these coinductive objects). Building on previous work, we introduce a variant of dPAω presented as a sequent calculus. On the one hand, we take advantage of a variant of Krivine classical realizability that we developed to prove the normalization of classical call-by-need. On the other hand, we benefit from dLtp, a classical sequent calculus with dependent types in which type safety is ensured using delimited continuations together with a syntactic restriction. By combining the techniques developed in these papers, we define a realizability interpretation à la Krivine of our calculus that allows us to prove normalization and soundness. This talk will go over the whole process, starting from Herbelin's calculus dPAω up to the introduction of its sequent calculus counterpart dLPAω, which we prove to be sound.

Combinatorial Constructions for Effective Testing

Filip Nikšic Max Planck Institute for Software Systems
03 May 2019, 3:30 pm - 4:45 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Student Defense Talks - Thesis Defense
Large-scale distributed systems consist of a number of components, take a number of parameter values as input, and behave differently based on a number of non-deterministic events. All these features—components, parameter values, and events—interact in complicated ways, and unanticipated interactions may lead to bugs. Empirically, many bugs in these systems are caused by interactions of only a small number of features. In certain cases, it may be possible to test all interactions of k features for a small constant k by executing a family of tests that is exponentially or even doubly-exponentially smaller than the family of all tests. Thus, in such cases we can effectively uncover all bugs that require up to k-wise interactions of features.

In this thesis we study two occurrences of this phenomenon. First, many bugs in distributed systems are caused by network partition faults. In most cases these bugs occur due to two or three key nodes, such as leaders or replicas, not being able to communicate, or because the leading node finds itself in a block of the partition without quorum. Second, bugs may occur due to unexpected schedules (interleavings) of concurrent events—concurrent exchange of messages and concurrent access to shared resources. Again, many bugs depend only on the relative ordering of a small number of events. We call the smallest number of events whose ordering causes a bug the depth of the bug. We show that in both testing scenarios we can effectively uncover bugs involving a small number of nodes or bugs of small depth by executing small families of tests.

We phrase both testing scenarios in terms of an abstract framework of tests, testing goals, and goal coverage. Sets of tests that cover all testing goals are called covering families. We give a general construction that shows that whenever a random test covers a fixed goal with sufficiently high probability, a small randomly chosen set of tests is a covering family with high probability. We then introduce concrete coverage notions relating to network partition faults and bugs of small depth. In case of network partition faults, we show that for the introduced coverage notions we can find a lower bound on the probability that a random test covers a given goal. Our general construction then yields a randomized testing procedure that achieves full coverage—and hence finds bugs—quickly.
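
The general construction alluded to above follows a standard probabilistic argument; here is a sketch of the bound, assuming each goal is covered by a single random test with probability at least p and that tests are drawn independently (the thesis's precise statement and constants may differ).

```latex
% Sketch: if one random test covers any fixed goal with probability >= p,
% then N independent random tests miss that goal with probability at most
\Pr[\text{goal uncovered}] \;\le\; (1-p)^N \;\le\; e^{-pN}.
% By a union bound over all g goals, every goal is covered with probability
% at least 1 - \delta whenever
g \, e^{-pN} \le \delta
\quad\Longleftrightarrow\quad
N \;\ge\; \frac{1}{p}\,\ln\frac{g}{\delta}.
```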

In case of coverage notions related to bugs of small depth, if the events in the program form a non-trivial partial order, our general construction may give a suboptimal bound. Thus, we study other ways of constructing covering families. We show that if the events in a concurrent program are partially ordered as a tree, we can explicitly construct a covering family of small size: for balanced trees, our construction is polylogarithmic in the number of events. For the case when the partial order of events does not have a "nice" structure, and the events and their relation to previous events are revealed while the program is running, we give an online construction of covering families. Based on the construction, we develop a randomized scheduler called PCTCP that uniformly samples schedules from a covering family and has a rigorous guarantee of finding bugs of small depth. We experiment with an implementation of PCTCP on two real-world distributed systems—Zookeeper and Cassandra—and show that it can effectively find bugs.

Sharing-Aware Resource Management for Performance and Protection

Sandhya Dwarkadas Department of Computer Science, University of Rochester
02 May 2019, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Distinguished Lecture Series
Recognizing that applications (whether in mobile, desktop, or server environments) are rarely executed in isolation today, I will discuss some practical challenges in making best use of available hardware and our approach to addressing these challenges. I will describe two independent and complementary control mechanisms using low-overhead hardware performance counters that we have developed: a sharing- and resource-aware mapper (SAM) to effect task placement with the goal of localizing shared data communication and minimizing resource contention based on the offered load; and an application parallelism manager (MAPPER) that controls the offered load with the goal of improving system parallel efficiency. If time permits, I will also outline our work on streamlining instruction memory management and address translation to eliminate redundancy and improve efficiency, especially in mobile environments. Our results emphasize the need for low-overhead monitoring of application behavior under changing environmental conditions in order to adapt to environment and application behavior changes.

Edge Computing in the Extreme and its Applications

Suman Banerjee University of Wisconsin-Madison
30 Apr 2019, 1:00 pm - 2:30 pm
Saarbrücken building E1 5, room 105
simultaneous videocast to Kaiserslautern building G26, room 111 / Meeting ID: 6312
SWS Colloquium
The notion of edge computing introduces new computing functions away from centralized locations and closer to the network edge, thus facilitating new applications and services. This enhanced computing paradigm provides application developers with new opportunities that are not otherwise available. In this talk, I will discuss why placing computation functions at the extreme edge of our network infrastructure, i.e., in wireless Access Points and home set-top boxes, is particularly beneficial for a large class of emerging applications. I will discuss a specific approach, called ParaDrop, to implement such edge computing functionalities, and use examples from different domains -- smarter homes, sustainability, and intelligent transportation -- to illustrate the new opportunities around this concept.

Querying Regular Languages over Sliding Windows

Moses Ganardi University of Siegen
29 Apr 2019, 2:30 pm - 3:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
A sliding window algorithm for a language L receives a stream of symbols and has to decide at each time step whether the suffix of length n belongs to L or not. The window size n is either a fixed number (in the fixed-size model) or can be controlled online by an adversary (in the variable-size model). In this talk we give a survey on recent results for deterministic and randomized sliding window algorithms for regular languages.
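
To make the problem setting concrete, here is the trivial fixed-size baseline that stores the entire window and re-runs a DFA over it at every step; the algorithms surveyed in the talk aim to use far less space than this O(n) approach. The example DFA and window size are illustrative.

```python
# Naive fixed-size sliding window membership for a regular language:
# store the last n symbols and re-run a DFA over them on every step.
from collections import deque

# Example DFA over {a, b} for "the window contains an even number of b's".
DELTA = {("even", "a"): "even", ("even", "b"): "odd",
         ("odd", "a"): "odd", ("odd", "b"): "even"}
START, ACCEPTING = "even", {"even"}

def sliding_window_membership(stream, n):
    window = deque(maxlen=n)          # keeps only the last n symbols
    for symbol in stream:
        window.append(symbol)
        state = START
        for s in window:              # re-run the DFA on the whole window
            state = DELTA[(state, s)]
        yield state in ACCEPTING

print(list(sliding_window_membership("abbab", n=3)))
```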

The complexity of reachability in vector addition systems

Sylvain Schmitz École Normale Supérieure Paris-Saclay
05 Apr 2019, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
The last few years have seen a surge of decision problems with an astronomical, non primitive-recursive complexity, in logic, verification, and games. While the existence of such uncontrived problems has been known since the early 1980s, the field has matured with techniques for proving complexity lower and upper bounds and the definition of suitable complexity classes. This framework has been especially successful for analysing so-called 'well-structured systems', i.e., transition systems endowed with a well-quasi-order, which underlie most of these astronomical problems, but it turns out to be applicable to other decision problems that resisted analysis, including two famous problems: reachability in vector addition systems and bisimilarity of pushdown automata. In this talk, I will explain how some of these techniques apply to reachability in vector addition systems, yielding tight Ackermannian upper bounds for the decomposition algorithm initially invented by Mayr, Kosaraju, and Lambert; this will be based on joint work with Jérôme Leroux.

Worst-Case Execution Time Guarantees for Runtime-Reconfigurable Architectures

Marvin Damschen Karlsruhe Institute of Technology
04 Apr 2019, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029 / Meeting ID: 6312
SWS Colloquium
Real-time embedded systems need to be analyzable for execution time guarantees. Despite significant scientific advances, however, timing analysis lags years behind current microarchitectures with out-of-order scheduling pipelines, several hardware threads, and multiple (shared) cache layers. To satisfy the increasing demand for predictable performance, analyzable performance features are required. We introduce runtime-reconfigurable instruction set processors as one way to escape the scarcity of analyzable performance features while preserving the flexibility of the system. To this end, we first present a reconfiguration controller for guaranteed reconfiguration delays of accelerators onto an FPGA. We propose a novel timing analysis approach to obtain worst-case execution time (WCET) guarantees for applications that utilize runtime-reconfigurable custom instructions (CIs), which each utilize one or more accelerators. Given the constrained reconfigurable area of an FPGA, we solve the problem of selecting CIs for each computational kernel of an application to optimize its worst-case execution time. Finally, we show that runtime reconfiguration provides the unique feature of optimized static WCET guarantees and optimization of the average-case execution time (maintaining statically-given WCET guarantees) by repurposing reconfigurable area for different selections of CIs at runtime.
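
One way to picture the CI selection problem is as a knapsack over FPGA area: each candidate custom instruction costs area and buys some WCET reduction. The sketch below is a deliberately simplified view under that assumption; the formulation in the talk also accounts for reconfiguration delays and per-kernel interactions, and the candidate CIs and numbers here are made up.

```python
# Simplified view of custom-instruction (CI) selection as a 0/1 knapsack:
# each candidate CI has an FPGA area cost and an estimated WCET reduction,
# and we pick a subset maximizing the reduction within the area budget.
def select_cis(candidates, area_budget):
    # candidates: list of (name, area, wcet_reduction)
    best = {0: (0, [])}                      # area used -> (reduction, chosen CIs)
    for name, area, gain in candidates:
        for used, (red, chosen) in sorted(best.items(), reverse=True):
            new_used = used + area
            if new_used <= area_budget and red + gain > best.get(new_used, (-1, []))[0]:
                best[new_used] = (red + gain, chosen + [name])
    return max(best.values())

candidates = [("fir_filter", 3, 40), ("dct", 5, 65), ("crc32", 2, 20)]
print(select_cis(candidates, area_budget=7))   # -> (85, ['dct', 'crc32'])
```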

The Server-to-Server Landscape: Insights, Opportunities, and Challenges

Balakrishnan Chandrasekaran MPI-INF - D3
03 Apr 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
A significant fraction of the Internet traffic today is delivered via content delivery networks (CDNs). Typically, the CDN hauls data from content providers to its back-end servers, moves this data (via an overlay) to its front-end servers, and serves the data from there to the end users. This server-to-server (i.e., back-end to front-end) landscape provides a few unique perspectives as well as opportunities for solving some long-standing networking problems. In this talk, I will briefly discuss these perspectives and opportunities, and present one key measurement effort that leverages this landscape—quantifying the effect of path choices in the Internet’s core on end-user’s experiences.

Analyzing Sample Correlations for Monte Carlo Rendering

Gurprit Singh MPI-INF - D4
13 Mar 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Point patterns and stochastic structures lie at the heart of Monte Carlo based numerical integration schemes. Physically based rendering algorithms have largely benefited from these Monte Carlo based schemes that inherently solve very high dimensional light transport integrals. However, due to the underlying stochastic nature of the samples, the resultant images are corrupted with noise (unstructured aliasing or variance). This also results in bad convergence rates that prohibit using these techniques in more interactive environments (e.g. games, virtual reality). With the advent of smart rendering techniques and powerful computing units (CPUs/GPUs), it is now possible to perform physically based rendering at interactive rates. However, much is left to understand regarding the underlying sampling structures and patterns which are the primary cause of error in rendering.  In this talk, we first revisit the most recent state-of-the-art frameworks that are developed to better understand the impact of samples’ structure on the error and its convergence during Monte Carlo integration. Towards the end, we briefly present our deep learning based approach to generate these samples with correlations.
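
A tiny experiment can illustrate why sample correlations matter for error: for a smooth 1D integrand, jittered (stratified) samples usually give a much lower-variance estimate than independent uniform samples. The integrand, sample counts, and jittering scheme below are illustrative choices, not the analysis frameworks discussed in the talk.

```python
# Estimating the integral of f(x) = x^2 on [0, 1] (true value 1/3) with
# independent uniform samples vs. jittered (stratified) samples. Correlated,
# well-spread samples typically yield lower variance for smooth integrands.
import numpy as np

def mc_estimate(samples):
    return np.mean(samples ** 2)

rng = np.random.default_rng(0)
n, trials = 64, 2000
uniform_est = [mc_estimate(rng.random(n)) for _ in range(trials)]
jittered_est = [mc_estimate((np.arange(n) + rng.random(n)) / n) for _ in range(trials)]

print("uniform  variance:", np.var(uniform_est))
print("jittered variance:", np.var(jittered_est))   # expected to be much smaller
```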

Feedback-Control for Self-Adaptive Predictable Computing

Martina Maggio Lund University
13 Mar 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
Cloud computing gives the illusion of infinite computational capacity and allows for on-demand resource provisioning. As a result, over the last few years, the cloud computing model has experienced widespread industrial adoption and companies like Netflix offloaded their entire infrastructure to the cloud. However, with even the largest datacenter being of a finite size, cloud infrastructures have experienced overload due to overbooking or transient failures. In essence, this is an excellent opportunity for the design of control solutions, that tackle the problem of mitigating overload peaks, using feedback from the infrastructure. These solutions can then exploit control-theoretical principles and take advantage of the knowledge and the analysis capabilities of control tools to provide formal guarantees on the predictability of the infrastructure behavior. This talk introduces recent research advances on feedback control in the cloud computing domain, together with my research agenda for enhancing predictability and formal guarantees for cloud computing.

Automated Resource Management in Large-Scale Networked Systems

Sangeetha Abdu Jyothi University of Illinois
11 Mar 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
Internet applications rely on large-scale networked environments such as the cloud for their backend support. In these multi-tenanted environments, various stakeholders have diverse goals. The objective of the infrastructure provider is to increase revenue by utilizing the resources efficiently. Applications, on the other hand, want to meet their performance requirements at minimal cost. However, estimating the exact amount of resources required to meet the application needs is a difficult task, even for expert users. Easy workarounds employed for tackling this problem, such as resource over-provisioning, negatively impact the goals of the provider, applications, or both. In this talk, I will discuss the design of application-aware self-optimizing systems through automated resource management that helps meet the varied goals of the provider and applications in large-scale networked environments. The key steps in closed-loop resource management include learning of application resource needs, efficient scheduling of resources, and adaptation to variations in real time. I will describe how I apply this high-level approach in two distinct environments using (a) Morpheus in enterprise clusters, and (b) Patronus in cellular provider networks with geo-distributed micro data centers. I will also touch upon my related work in application-specific context at the intersection of network scheduling and deep learning. I will conclude with my vision for self-optimizing systems including fully automated clouds and an elastic geo-distributed platform for thousands of micro data centers.

Predictable Execution of Real-Time Applications on Many-Core Platforms

Matthias Becker KTH Royal Institute of Technology
08 Mar 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
SWS Colloquium
Nowadays, innovation in many industrial areas is software driven, where existing software functions become more complex and new software functions are constantly introduced. The rapid increase in functionality comes along with a steep increase in software complexity. To cope with this transition, current trends shift away from today’s distributed architectures towards integrated architectures. Here, previously distributed functionality is consolidated on fewer, more powerful, computers. Such a trend can for example be observed in the automotive or avionics domain. This can ease the integration process, reduce the hardware complexity, and ultimately save costs. One promising hardware platform for these powerful embedded computers is the many-core processor. A many-core processor hosts a vast number of compute cores that are partitioned into clusters connected by a Network-on-Chip. However, ensuring that real-time requirements are satisfied in the presence of contention in shared resources, such as memories, remains an open issue. In addition, industrial applications are often subject to timing constraints on the data propagation through a chain of semantically related tasks. Such requirements pose challenges to system designers, as they can only be verified after system synthesis (i.e., very late in the design process). In this talk, we present methods that transform timing constraints on the data propagation delay into precedence constraints between individual task instances. An execution framework for the cluster of the many-core is proposed that allows access to cluster-external memory while it avoids contention on shared resources by design. Spatial and temporal isolation between different clusters is provided by a partitioning and configuration of the Network-on-Chip that further reduces the worst-case access times to external memory.

Mitigating data leaks in real world systems

Aastha Mehta Max Planck Institute for Software Systems
07 Mar 2019, 4:30 pm - 5:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Proposal
Unintended data disclosures are a major concern for many online services, such as healthcare systems, government departments, and web services. Data may leak over explicit output channels of the systems, for instance due to accidental bugs and misconfigurations in the system. Data may also leak over various side channels, for instance, in a cloud environment where a tenant shares the Cloud provider’s infrastructure with other mutually distrusting tenants.

In this thesis, we address the problem of unintended data disclosures in web services due to both types of causes, i.e. explicit leaks and side channel leaks. Specifically, we propose a system to mitigate explicit leaks due to accidental bugs in database-backed services; and a system to mitigate network side channel leaks in the tenants of an infrastructure-as-a-service (IaaS) Cloud.

In this talk, I will first present a high level overview of the design, implementation, and evaluation of Qapla, which is a system to ensure policy compliance in database-backed services.

Then I will present our ongoing work on the design, implementation, and evaluation of Pacer, which is a system to mitigate network side channels in Cloud tenants. Pacer mitigates network side channels using traffic shaping. Pacer provides a generic abstraction of a traffic shaping tunnel, which encapsulates the tenant's network traffic, and shapes it to make it independent of the tenant's secrets. We present a prototype with Pacer's tunnel endpoints integrated in the Cloud hypervisor and the client OS. Our preliminary evaluation shows that Pacer can enforce traffic shaping securely, while incurring modest overheads on bandwidth, client latencies, and server throughput.
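
The shaping idea behind Pacer can be sketched at user level as a loop that transmits fixed-size packets on a fixed schedule, padding with dummy packets when no real payload is queued, so that observable traffic is independent of secrets. The send() stub, packet size, and interval below are hypothetical; Pacer itself implements the tunnel inside the hypervisor and the client OS.

```python
# Sketch of secret-independent traffic shaping: packets leave on a fixed
# schedule and with a fixed size, regardless of whether real payload exists.
import time
from queue import Queue, Empty

PACKET_SIZE = 1500          # bytes on the wire, fixed
INTERVAL = 0.01             # seconds between transmissions, fixed

def send(payload: bytes):
    pass                    # placeholder for the real network transmit path

def shaping_loop(outbound: Queue, duration: float):
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            payload = outbound.get_nowait()
        except Empty:
            payload = b""                        # no real data: send a dummy packet
        send(payload.ljust(PACKET_SIZE, b"\0"))  # pad to a fixed size
        time.sleep(INTERVAL)                     # fixed timing, independent of secrets
```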

Privacy, Transparency and Trust in the User-Centric Internet

Oana Goga Université Grenoble Alpes
07 Mar 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
The rise of user-centric Internet systems such as Facebook or Twitter brought security and privacy threats that became out of control in recent years. To make such systems more dependable, my research focuses on three key aspects: (1) privacy: ensure users understand and can control the information that is disclosed about them; (2) transparency: ensure users understand how their data is being used and how it affects the services they receive; and (3) trust: ensure users can evaluate the trustworthiness of content consumed from these systems. 

In this talk, I will share my research efforts in understanding and tackling security and privacy threats in social media targeted advertising. Despite a number of recent controversies regarding privacy violations, lack of transparency, or vulnerability to discrimination or propaganda by dishonest actors, users still have little understanding of what data targeted advertising platforms have about them and why they are shown the ads they see. To address such concerns, Facebook recently introduced the "Why am I seeing this?" button that provides users with an explanation of why they were shown a particular ad. I first investigate the level of transparency provided by this mechanism by empirically measuring whether it satisfies a number of key properties and what the consequences of the current design choices are. To provide a better understanding of the Facebook advertising ecosystem, we developed a tool called AdAnalyst that collects the ads users receive and provides aggregate statistics. I will then share our findings from analyzing data from over 600 real-world AdAnalyst users, in particular on who is advertising on Facebook and how these advertisers are targeting users and customizing ads via the platform.

Towards Literate Artificial Intelligence

Mrinmaya Sachan Carnegie Mellon University
05 Mar 2019, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
Over the past decade, the field of artificial intelligence (AI) has seen striking developments. Yet, today’s AI systems sorely lack the essence of human intelligence, i.e., our ability to (a) understand language and grasp its meaning, (b) assimilate common-sense background knowledge of the world, and (c) draw inferences and perform reasoning. Before we even begin to build AI systems that possess the aforementioned human abilities, we must ask an even more fundamental question: How would we even evaluate an AI system on the aforementioned abilities? In this talk, I will argue that we can evaluate AI systems in the same way as we evaluate our children: by giving them standardized tests. Standardized tests are administered to students to measure the knowledge and skills they have gained. Thus, it is natural to use these tests to measure the intelligence of our AI systems. Then, I will describe Parsing to Programs (P2P), a framework that combines ideas from semantic parsing and probabilistic programming for situated question answering. We used P2P to build systems that can solve pre-university level Euclidean geometry and Newtonian physics examinations. P2P performs at least as well as the average student on questions from textbooks, geometry questions from previous SAT exams, and mechanics questions from Advanced Placement (AP) exams. I will conclude by describing implications of this research and some ideas for future work.

A Client-centric Approach to Transactional Datastores

Natacha Crooks University of Texas at Austin
28 Feb 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
Modern applications must collect and store massive amounts of data. Cloud storage offers these applications simplicity: the abstraction of a failure-free, perfectly scalable black-box. While appealing, offloading data to the cloud is not without challenges. Cloud storage systems often favour weaker levels of isolation and consistency. These weaker guarantees introduce behaviours that, without care, can break application logic. Offloading data to an untrusted third party like the cloud also raises questions of security and privacy.

This talk summarises my efforts to improve the performance, the semantics and the security of transactional cloud storage systems. It centers around a simple idea: defining consistency guarantees from the perspective of the applications that observe these guarantees, rather than from the perspective of the systems that implement them. I will discuss how this new perspective brings forth several benefits. First, it offers simpler and cleaner definitions of weak isolation and consistency guarantees. Second, it enables more scalable implementations of existing guarantees like causal consistency. Finally, I will discuss its applications to security: our client-centric perspective allows us to add obliviousness guarantees to transactional cloud storage systems.


New Abstractions for High-Performance Datacenter Applications

Malte Schwarzkopf MIT CSAIL
18 Feb 2019, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Colloquium
Developing high-performance datacenter applications is complex and time-consuming today, as developers must understand and correctly implement subtle interactions between different backend systems. I describe a new approach that redesigns core datacenter systems around new abstractions: the right abstractions substantially reduce complexity while maintaining the same performance. This saves expensive developer time, uses datacenter servers more efficiently, and can enable new, previously impossible systems and applications. I illustrate the impact of such redesigns with Noria, which recasts web application backends (i.e., databases and caches) as a streaming dataflow computation based on a new abstraction of partial state. Noria's partially-stateful dataflow brings classic databases' familiar query flexibility to scalable dataflow systems, simplifying applications and improving the backend's efficiency. For example, Noria increases the request load handled by a single server by 5-70x compared to state-of-the-art backends. Additional new abstractions from my research increase the efficiency of other datacenter systems (e.g., cluster schedulers), or enable new kinds of systems that, for example, may help protect user data against exposure through application bugs.

Scalable positioning of commodity mobile devices using audio signals

Viktor Erdélyi Max Planck Institute for Software Systems
14 Feb 2019, 9:00 am - 10:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
This thesis explores the problem of computing a position map for co-located mobile devices. The positioning should happen in a scalable manner without requiring specialized hardware and without requiring specialized infrastructure (except basic Wi-Fi or cellular access). At events like meetings, talks, or conferences, a position map can aid spontaneous communication among users based on their relative position in two ways. First, it enables users to choose message recipients based on their relative position, which also enables the position-based distribution of documents. Second, it enables senders to attach their position to messages, which can facilitate interaction between speaker and audience in a lecture hall and enables the collection of feedback based on users’ location.

In this thesis, we present Sonoloc, a mobile app and system that, by relying on acoustic signals, allows a set of commodity smart devices to determine their relative positions. Sonoloc can position any number of devices within acoustic range with a constant number of acoustic signals emitted by a subset of devices. Our experimental evaluation with up to 115 devices in real rooms shows that – despite substantial background noise – the system can locate devices with an accuracy of tens of centimeters using no more than 15 acoustic signals.
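
Sonoloc's protocol involves estimating inter-device distances from acoustic signal arrival times; once pairwise distance estimates exist, relative positions can be recovered up to rotation and reflection with classical multidimensional scaling, as in the sketch below. The example coordinates are made up, and this shows only the final geometric step, not Sonoloc's signal processing or scheduling of emitters.

```python
# Recovering relative 2D positions from pairwise distance estimates with
# classical multidimensional scaling (MDS); estimating the distances from
# acoustic signals is the hard part and is not shown here.
import numpy as np

def classical_mds(D, dim=2):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:dim]        # largest eigenvalues first
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))

# Example: four devices in a room; positions come back up to rotation/reflection.
true_positions = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.5], [0.0, 1.5]])
D = np.linalg.norm(true_positions[:, None] - true_positions[None, :], axis=-1)
print(classical_mds(D))
```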

Dynamic Symbolic Execution for Software Analysis

Cristian Cadar Imperial College London
07 Feb 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
SWS Distinguished Lecture Series
Symbolic execution is a program analysis technique that can automatically explore and analyse paths through a program. While symbolic execution was initially introduced in the seventies, it has only received significant attention during the last decade, due to tremendous advances in constraint solving technology and effective blending of symbolic and concrete execution into what is often called dynamic symbolic execution. Dynamic symbolic execution is now a key ingredient in many computer science areas, such as software engineering, computer security, and software systems, to name just a few. In this talk, I will discuss recent advances and ongoing challenges in the area of dynamic symbolic execution, drawing upon our experience developing several symbolic execution tools for many different scenarios, such as high-coverage test input generation, bug and security vulnerability detection, patch testing and bounded verification, among many others.
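
A minimal sketch of the core loop of dynamic symbolic execution, assuming the z3 Python bindings: run the program on a concrete input, record the branch condition symbolically, then negate the path condition and ask the solver for an input that drives execution down the other path. Real engines instrument the program to collect these conditions automatically; the toy program and manual encoding here are illustrative.

```python
# Minimal concolic step: take the path exercised by a concrete input, negate
# its path condition, and solve for an input covering the other path.
# (pip install z3-solver)
from z3 import Int, Solver, Not, sat

def program(x):
    return "big" if x * 2 > 100 else "small"     # toy program under test

x = Int("x")
branch = x * 2 > 100                             # symbolic branch condition

concrete_input = 7
print(concrete_input, program(concrete_input))   # concrete run takes the "small" path

path_condition = Not(branch)                     # condition satisfied by that run
solver = Solver()
solver.add(Not(path_condition))                  # flip the branch to reach a new path
if solver.check() == sat:
    new_input = solver.model()[x].as_long()
    print(new_input, program(new_input))         # an input for the "big" path
```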

Machine Teaching

Adish Singla Max Planck Institute for Software Systems
06 Feb 2019, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Joint Lecture Series
Much of the research in machine learning has focused on designing algorithms for discovering knowledge from data and learning a target model. What if there is a "teacher" who knows the target already, and wants to steer the "learner" towards this target as quickly as possible? For instance, in an educational setting, the teacher (e.g., a tutoring system) has an educational goal that she wants to communicate to a student via a set of demonstrations and lessons; in adversarial attacks known as training-set poisoning, the teacher (e.g., a hacking algorithm) manipulates the behavior of a machine learning system by maliciously modifying the training data. This lecture will provide an overview of machine teaching and cover the following three aspects: (i) how machine teaching differs from machine learning, (ii) the problem space of machine teaching across different dimensions, and (iii) our recent work on developing teaching algorithms for human learners.

Techniques to Protect Confidentiality and Integrity of Persistent and In-Memory Data

Anjo Vahldiek-Oberwagner Max Planck Institute for Software Systems
05 Feb 2019, 5:30 pm - 6:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
Today computers store and analyze valuable and sensitive data. As a result, we need to protect this data against confidentiality and integrity violations that can result in the illicit release, loss, or modification of a user’s and an organization’s sensitive data such as personal media content or client records. Existing techniques for protecting confidentiality and integrity either lack efficiency or are vulnerable to malicious attacks. In this thesis we suggest two techniques, Guardat and ERIM, to efficiently and robustly protect persistent and in-memory data. To protect the confidentiality and integrity of persistent data, clients specify per-file policies to Guardat declaratively, concisely, and separately from code. Guardat enforces policies by mediating I/O in the storage layer. In contrast to prior techniques, we protect against accidental or malicious circumvention of higher software layers. We present the design and prototype implementation, and demonstrate that Guardat efficiently enforces example policies in a web server. To protect the confidentiality and integrity of in-memory data, ERIM isolates sensitive data using Intel Memory Protection Keys (MPK), a recent x86 extension to partition the address space. However, MPK does not protect against malicious attacks by itself. We prevent malicious attacks by combining MPK with call gates to trusted entry points and ahead-of-time binary inspection. In contrast to existing techniques, ERIM efficiently protects frequently-used session keys of web servers, an in-memory reference monitor’s private state, and managed runtimes from native libraries. These use cases result in high switch rates on the order of 10⁵–10⁶ switches/s. Our experiments demonstrate less than 1% runtime overhead per 100,000 switches/s, thus outperforming existing techniques.

Discrimination in Algorithmic Decision Making: From Principles to Measures and Mechanisms

Bilal Zafar Max Planck Institute for Software Systems
04 Feb 2019, 6:00 pm - 7:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
The rise of algorithmic decision making in a variety of applications has also raised concerns about its potential for discrimination against certain social groups. However, incorporating nondiscrimination goals into the design of algorithmic decision making systems (or, classifiers) has proven to be quite challenging. These challenges arise mainly due to the computational complexities involved in the process, and the inadequacy of existing measures to computationally capture discrimination in various situations. The goal of this thesis is to tackle these problems.

First, with the aim of incorporating existing measures of discrimination (namely, disparate treatment and disparate impact) into the design of well-known classifiers, we introduce a mechanism based on decision boundary covariance that can be included in the formulation of any convex boundary-based classifier in the form of convex constraints. Second, we propose alternative measures of discrimination. Our first proposed measure, disparate mistreatment, is useful in situations where unbiased ground truth training data is available. The other two measures, preferred treatment and preferred impact, are useful in situations where the feature and class distributions of different social groups are significantly different, and can additionally help reduce the cost of nondiscrimination (as compared to the existing measures). We also design mechanisms to incorporate these new measures into the design of convex boundary-based classifiers.
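As a rough sketch of the decision boundary covariance idea for a linear classifier (not the thesis implementation), one can minimize the logistic loss subject to a bound c on the covariance between the sensitive attribute and the signed distance to the decision boundary. The synthetic data, the constraint threshold, and the use of SciPy's SLSQP solver below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def logistic_loss(w, X, y):
    """Standard logistic loss for labels y in {-1, +1}."""
    return np.mean(np.log(1.0 + np.exp(-y * (X @ w))))

def boundary_covariance(w, X, z):
    """Covariance between sensitive attribute z and signed distance to the boundary."""
    return np.mean((z - z.mean()) * (X @ w))

def fit_fair_classifier(X, y, z, c=0.01):
    """Minimize logistic loss s.t. |Cov(z, w^T x)| <= c (two one-sided constraints)."""
    cons = [
        {"type": "ineq", "fun": lambda w: c - boundary_covariance(w, X, z)},
        {"type": "ineq", "fun": lambda w: c + boundary_covariance(w, X, z)},
    ]
    w0 = np.zeros(X.shape[1])
    res = minimize(logistic_loss, w0, args=(X, y), constraints=cons, method="SLSQP")
    return res.x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    z = (X[:, 0] > 0).astype(float)                  # sensitive attribute correlated with x0
    y = np.sign(X[:, 0] + 0.3 * rng.normal(size=200))
    w = fit_fair_classifier(X, y, z)
    print("weights:", w, "covariance:", boundary_covariance(w, X, z))
```

Shrinking c trades accuracy for a smaller correlation between the sensitive attribute and the classifier's decisions, which is the tension the thesis measures are designed to control.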

How to Win a First-Order Safety Game

Helmut Seidl TUM
01 Feb 2019, 10:30 am - 11:30 am
Kaiserslautern building G26, room 111
SWS Distinguished Lecture Series
First-order (FO) transition systems have recently attracted attention for the verification of parametric systems such as network protocols, software-defined networks, or multi-agent workflows. Desirable properties of these systems, such as functional correctness or non-interference, have conveniently been formulated as safety properties. Here, we go one step further: our goal is not only to verify safety, but also to develop techniques for automatically synthesizing strategies that enforce it. For that reason, we generalize FO transition systems to FO safety games. We prove that the existence of a winning strategy for the safety player in finite games is equivalent to second-order quantifier elimination. For monadic games, we provide a complete classification into decidable and undecidable cases. For games with non-monadic predicates, we concentrate on universal invariants only and identify a non-trivial sub-class where safety is decidable. For the general case, we provide meaningful abstraction and refinement techniques for realizing a CEGAR-based synthesis loop. Joint work with Christian Müller (TUM) and Bernd Finkbeiner (Universität des Saarlandes).
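The first-order setting is beyond a short snippet, but the finite, explicit-state analogue conveys what a winning strategy for the safety player means: iteratively discard states from which visiting a bad state cannot be avoided. The toy game below (states, moves, and ownership are invented) computes the safety player's winning region by the standard fixpoint; it is only a propositional illustration, not the FO technique of the talk.

```python
def safety_winning_region(states, edges, bad, safety_owned):
    """Finite safety game: shrink the candidate winning region until it is closed.
    A safety-player state stays if SOME successor stays in the region;
    an environment state stays only if ALL successors stay in the region."""
    win = set(states) - set(bad)
    changed = True
    while changed:
        changed = False
        for s in list(win):
            succs = [t for (u, t) in edges if u == s]
            ok = (any(t in win for t in succs) if s in safety_owned
                  else all(t in win for t in succs))
            if not ok:
                win.remove(s)
                changed = True
    return win

if __name__ == "__main__":
    states = {"init", "safe", "risky", "crash"}
    edges = {("init", "safe"), ("init", "risky"),
             ("safe", "safe"), ("risky", "safe"), ("risky", "crash")}
    bad = {"crash"}
    # "init" and "safe" belong to the safety player, "risky" to the environment.
    print(safety_winning_region(states, edges, bad, safety_owned={"init", "safe"}))
    # -> {'init', 'safe'}: from "init" the safety player moves to "safe" forever.
```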

Automated Complexity Analysis of Rewrite Systems

Florian Frohn RWTH Aachen
22 Jan 2019, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
SWS Colloquium
Many approaches to analyze the complexity of programs automatically are based on transformations into integer or term rewrite systems. However, state-of-the-art tools that analyze the worst-case complexity of rewrite systems are restricted to the inference of upper bounds. In this talk, the first techniques for the inference of lower bounds on the worst-case complexity of integer and term rewrite systems are introduced. While upper bounds can prove the absence of performance-critical bugs, lower bounds can be used to find them.

For term rewriting, the power of the presented technique raises the question of whether the existence of a non-constant lower bound is decidable. The corresponding decidability results are therefore also discussed in this talk.

Finally, to see the practical value of complexity analysis techniques for rewrite systems, we will take a brief look at the transformation from Java programs to integer rewrite systems that is implemented in the tool AProVE.
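To make the notion of a worst-case lower bound concrete, here is an invented toy example in the spirit of term rewriting: naive list reversal via an append rule needs quadratically many rule applications in the length of the input, so a lower-bound analysis should report Omega(n^2). The mini step-counting interpreter below is purely illustrative and unrelated to AProVE's input format.

```python
# Toy witness for a worst-case lower bound: naive list reversal with the rules
#   rev(nil)           -> nil
#   rev(cons(x, xs))   -> app(rev(xs), cons(x, nil))
#   app(nil, ys)       -> ys
#   app(cons(x,xs),ys) -> cons(x, app(xs, ys))
# Counting one "rewrite step" per rule application exhibits quadratic behaviour.

def app(xs, ys, steps):
    if not xs:                      # app(nil, ys) -> ys
        return ys, steps + 1
    head, tail = xs[0], xs[1:]      # app(cons(x,xs), ys) -> cons(x, app(xs, ys))
    rest, steps = app(tail, ys, steps + 1)
    return [head] + rest, steps

def rev(xs, steps=0):
    if not xs:                      # rev(nil) -> nil
        return [], steps + 1
    head, tail = xs[0], xs[1:]      # rev(cons(x,xs)) -> app(rev(xs), cons(x, nil))
    rest, steps = rev(tail, steps + 1)
    return app(rest, [head], steps)

if __name__ == "__main__":
    for n in (4, 8, 16):
        _, steps = rev(list(range(n)))
        print(f"n={n:2d}  rewrite steps={steps}")   # grows roughly like n^2 / 2
```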