Recent events

Time for Text Mining and Information Retrieval

Jannik Strötgen
MPI-INF - D5
Joint Lecture Series
11 Jan 2017, 12:15 pm - 2:15 pm
Saarbrücken building E1 5, room 002
Temporal information is an important aspect of any information space and has characteristics that make it highly valuable in diverse text mining and information retrieval scenarios. For quite a long time, only the metadata of text documents (e.g., publication dates) was considered, but with recent improvements in natural language processing (NLP), temporal expressions occurring in the content of documents (e.g., "January 2016", "next week") can be extracted and interpreted efficiently and with high quality. This is particularly valuable, as in many types of documents temporal expressions not only occur frequently but also play a crucial role, e.g., to anchor events in time.

In this talk, after explaining the basics and challenges of the NLP task of "temporal tagging", I present our approach to enriching documents with temporal information. I then showcase several application scenarios in which we exploit automatically extracted temporal annotations for search and exploration tasks. These range from temporal information retrieval for news articles, via Wikipedia-based event extraction, to the exploration of fictitious happenings in literary texts in the context of digital humanities research.
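
The two example expressions above hint at the core difficulty of temporal tagging: explicit expressions like "January 2016" can be matched and normalized directly, while relative ones like "next week" must be anchored to a reference date such as the document's publication date. A minimal illustrative sketch in Python (a toy, not the speaker's tagger, which relies on much richer rule sets and linguistic context):

import re
from datetime import date, timedelta

MONTHS = {m: i + 1 for i, m in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"])}

def tag_temporal(text, ref=date(2017, 1, 11)):
    """Return (expression, normalized value) pairs found in text."""
    spans = []
    # Explicit expressions such as "January 2016" -> "2016-01"
    for m in re.finditer(r"\b([A-Z][a-z]+) (\d{4})\b", text):
        month = MONTHS.get(m.group(1).lower())
        if month:
            spans.append((m.group(0), "%s-%02d" % (m.group(2), month)))
    # Relative expressions such as "next week", anchored to ref
    if re.search(r"\bnext week\b", text):
        monday = ref + timedelta(days=7 - ref.weekday())
        spans.append(("next week", monday.isoformat()))
    return spans

print(tag_temporal("The workshop in January 2016 repeats next week."))
# [('January 2016', '2016-01'), ('next week', '2017-01-16')]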

Proving Performance Properties of Higher-order Functions with Memoization

Ravi Madhavan
EPFL
SWS Colloquium
20 Dec 2016, 11:30 am - 1:30 pm
Kaiserslautern building G26, room 112
simultaneous videocast to Saarbrücken building E1 5, room 029
Static verification of performance properties of programs is an important problem that has attracted a great deal of research. However, most existing tools infer best-effort upper bounds and hope that they match users' expectations. In this talk, I will present a system for specifying and verifying bounds on resources, such as the number of evaluation steps and heap-allocated objects, for functional Scala programs that may rely on lazy evaluation and memoization. In our system, users can specify the desired resource bound as a template with numerical holes in the contracts of functions, e.g., as "steps <= ? * size(list) + ?", along with other functional properties necessary for establishing the bounds. The system automatically infers values for the holes that make the templates hold for all executions of the functions. For example, the property that a function converting a propositional formula f into negation-normal form (NNF) takes time linear in the size of f can be expressed in the post-condition of the function using the predicate "steps <= ? * size(f) + ?", where size is a user-defined function counting the number of nodes in the syntax tree of the formula. Using our tool, we have verified asymptotically precise bounds for several algorithms and data structures that rely on complex sharing of higher-order functions and memoization. Our benchmarks include balanced search trees like red-black trees and AVL trees, Okasaki's constant-time queues, deques, lazy data structures based on numerical representations such as lazy binomial heaps, cyclic streams, and dynamic programming algorithms. Some of the benchmarks have posed serious challenges to automatic as well as manual reasoning. The system is a part of the Leon verifier and can be tried online at "http://leondev.epfl.ch" ("Resource bounds" section).

References: (a) Symbolic Resource Bound Inference for Functional Programs. Ravichandhran Madhavan and Viktor Kuncak. Computer Aided Verification (CAV) 2014. (b) Contract-based Resource Verification for Higher-order Functions with Memoization. Ravichandhran Madhavan, Sumith Kulal, and Viktor Kuncak. To appear in POPL 2017.
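
Leon verifies such bounds statically. As a plain illustration of what a template like "steps <= ? * size(f) + ?" asserts, the following Python sketch (not Leon's Scala syntax) counts evaluation steps of an NNF conversion dynamically and checks one candidate instantiation of the two holes; Leon instead infers the constants and proves the bound for all inputs:

from dataclasses import dataclass

@dataclass
class Var: name: str
@dataclass
class Not: f: object
@dataclass
class And: l: object; r: object
@dataclass
class Or:  l: object; r: object

def size(f):
    """Number of nodes in the syntax tree of the formula."""
    if isinstance(f, Var): return 1
    if isinstance(f, Not): return 1 + size(f.f)
    return 1 + size(f.l) + size(f.r)

steps = 0
def nnf(f, neg=False):
    """Push negations to the leaves; count one step per call."""
    global steps
    steps += 1
    if isinstance(f, Var): return Not(f) if neg else f
    if isinstance(f, Not): return nnf(f.f, not neg)
    l, r = nnf(f.l, neg), nnf(f.r, neg)
    conn = And if (neg == isinstance(f, Or)) else Or   # De Morgan
    return conn(l, r)

f = Not(And(Var("p"), Not(Or(Var("q"), Var("r")))))
nnf(f)
assert steps <= 2 * size(f) + 1   # candidate values for the two holes
print(steps, size(f))             # 7 7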

Network Inference: Graph Reconstruction and Verification

Hang Zhou
MPI-INF - D1
Joint Lecture Series
07 Dec 2016, 12:15 pm - 2:15 pm
Saarbrücken building E1 5, room 002
How efficiently can we find an unknown graph using shortest-path queries between its vertices? This is a natural theoretical question from the standpoint of recovering hidden information. It is related to discovering the topology of Internet networks, which is a crucial step for building accurate network models and designing efficient algorithms for Internet applications.

In this talk, I will introduce the problems of graph reconstruction and verification via oracles. I will investigate randomized algorithms based on a Voronoi cell decomposition. I will also analyze greedy algorithms and prove that they are near-optimal.

The talk is based on joint work with Claire Mathieu and Sampath Kannan.
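
To fix the problem statement, here is the naive baseline sketched in Python (using networkx for the hidden graph): query the oracle on all O(n^2) vertex pairs and keep the pairs at distance exactly 1, which recovers every edge of an unweighted graph. The algorithms in the talk, based on Voronoi cell decompositions and greedy strategies, aim to get well below this quadratic query count:

import networkx as nx

def reconstruct_naive(vertices, dist):
    """dist(u, v): the shortest-path-query oracle for the hidden graph."""
    g = nx.Graph()
    g.add_nodes_from(vertices)
    queries = 0
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            queries += 1
            if dist(u, v) == 1:   # in an unweighted graph: distance 1 <=> edge
                g.add_edge(u, v)
    return g, queries

hidden = nx.cycle_graph(8)        # the unknown graph, reachable only via queries
oracle = lambda u, v: nx.shortest_path_length(hidden, u, v)
recovered, q = reconstruct_naive(list(hidden.nodes), oracle)
assert {frozenset(e) for e in recovered.edges} == {frozenset(e) for e in hidden.edges}
print("recovered with", q, "queries")   # 28 = C(8, 2) queries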

Sustaining the Energy Transition: A Role for Computer Science and Complex Networks

Marco Aiello
Rijksuniversiteit Groningen
SWS Colloquium
03 Nov 2016, 10:30 am - 12:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
The energy sector is in the midst of an exciting transition. Driven by new generation technologies and by infrastructure digitalization, the traditional way of transmitting, distributing, and using energy is changing: a centralized, hierarchical system is transforming into a multi-directional, open infrastructure. While the vision of Intelligent Energy Networks is appealing and desirable---especially from a sustainability perspective---a number of major challenges remain to be tackled. The loss of centralized control, the intermittent nature of renewable energy sources, and the scale of future digital energy systems are novel situations for power system infrastructures and consumers that pose reliability and availability threats.

In this talk, I show examples of how Computer Science techniques are playing, and will continue to play, an important role in future energy systems. I focus on electricity as the energy vector and on techniques from Service-Oriented Computing and AI Planning. I also present Complex Network theory as a design tool for energy distribution systems. To make things concrete, I review almost ten years of personal research, including making office buildings energy efficient, making homes smarter, and futuristic models for the evolution of power distribution grids to accommodate multi-directional energy flows with distributed generation and local control.

A Promising Semantics for Relaxed-Memory Concurrency

Viktor Vafeiadis
Max Planck Institute for Software Systems
Joint Lecture Series
02 Nov 2016, 12:15 pm - 2:15 pm
Saarbrücken building E1 5, room 002
Despite decades of research in shared-memory concurrency and in the semantics of programming languages, there is still no adequate answer to the following fundamental question: what is the right semantics for concurrent shared-memory programs written in higher-level languages? The difficulty in answering this question is that any solution has to balance the conflicting desiderata of programmers, compilers, and hardware. Previous attempts either allow spurious "out-of-thin-air" program behaviours or incur a non-negligible implementation cost. Nevertheless, we show that the question can be answered elegantly with the novel notion of promises: a thread may promise to execute a write in the future, thus enabling other threads to read from that write out of order. This idea alone suffices to explain most of the relaxed consistency behaviours observed as a result of compiler and hardware optimisations, while ruling out "out-of-thin-air" behaviours.
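
The classic load-buffering litmus test makes the notion of promises concrete. Written schematically in Python below (CPython will not actually exhibit the weak behaviour; the snippet only fixes the program under discussion), the question is which final values of a and b a semantics should allow:

import threading

x = y = 0
a = b = 0

def t1():
    global y, a
    a = x   # read x
    y = 1   # independent write: t1 may "promise" this write early

def t2():
    global x, b
    b = y   # may then read t1's promised write out of order
    x = 1

threads = [threading.Thread(target=t1), threading.Thread(target=t2)]
for t in threads: t.start()
for t in threads: t.join()
print(a, b)
# Under the promising semantics, a == b == 1 is allowed: each thread can
# promise its write, since the write is certifiable regardless of what is
# read. Replacing the independent writes by "y = a" and "x = b" turns this
# into the out-of-thin-air test, where a == b == 1 must be forbidden;
# there, neither thread can certify its promise without first reading the
# value, so the behaviour is ruled out.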

Heap-based reasoning about asynchronous concurrency

Johannes Kloos
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
27 Oct 2016, 6:00 pm - 8:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 105
Asynchronous concurrency is a widespread way of writing programs that deal with many short tasks. It is the programming model behind event-driven concurrency, as exemplified by GUI applications (where tasks correspond to event handlers), web applications based around JavaScript, and the implementation of web browsers, as well as server-side software and operating systems. It provides a middle ground between sequential programming and multi-threading, giving the benefits of concurrency while being easier to program than multi-threading. While there are languages and libraries that make the construction of asynchronous programs relatively simple, there is little support for asynchronous program analysis. Existing work is mostly focused on model checking or on performing specific targeted analyses. The model checking techniques in particular have turned out to be inefficient and completely ignore the heap. In this thesis proposal, I will address the question of how we can reason about asynchronous programs and how I want to use this to optimize them. I will address three main questions:
1. How can we reason about asynchronous programs without ignoring the heap?
2. How can we use such reasoning techniques to optimize programs involving asynchronous behavior?
3. How can we scale reasoning and optimization to apply to real-world software?
In the proposal, I will describe answers to all three questions. The unifying idea behind all of these results is the use of an appropriate model of global state (usually the heap) and a promise-based model of asynchronous concurrency.

To address the first question, I will describe prior work (ALST, ECOOP 2015) in which we extended the OCaml type system to reason about asynchronous concurrency. To address the second and third questions, I will describe an ongoing project (JSDefer) on optimizing web pages by loading JavaScript asynchronously, and a planned project on an automated optimization step that safely changes synchronous I/O operations in OCaml programs to asynchronous ones.
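
The flavour of the planned synchronous-to-asynchronous rewriting can be shown language-neutrally. The sketch below uses Python's asyncio (the actual projects target OCaml's promise-based libraries and JavaScript; the file names and timings are made up): independent blocking reads are turned into promises that are all started before any is awaited, so the I/O overlaps, and the safety question is exactly when this reordering preserves the program's behaviour:

import asyncio

# Synchronous version: each read must finish before the next one starts.
def fetch_all_sync(read, names):
    return [read(n) for n in names]

# Asynchronous rewrite: start every read as a task (a promise), then
# await them all; three 0.1s reads now finish in ~0.1s, not ~0.3s.
async def fetch_all_async(read_async, names):
    tasks = [asyncio.create_task(read_async(n)) for n in names]
    return [await t for t in tasks]

async def fake_read(name):
    """Stand-in for a real asynchronous I/O operation."""
    await asyncio.sleep(0.1)
    return "contents of " + name

print(asyncio.run(fetch_all_async(fake_read, ["a.txt", "b.txt", "c.txt"])))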

Polynomial Identity Testing

Markus Bläser
Fachrichtung Informatik - Saarbrücken
Joint Lecture Series
05 Oct 2016, 12:15 pm - 3:15 pm
Saarbrücken building E1 5, room 002
Does randomness help to speed up computation? This is a fundamental question in computer science. Inspired by early successes like randomized primality tests, a significant number of researchers believed for a long time that randomness indeed helps speed up computations. In 1997, Impagliazzo and Wigderson proved that the existence of certain hard-to-compute functions implies that randomness can only give a speedup by a polynomial factor. Shortly after this, Agrawal, Kayal, and Saxena were able to give a deterministic polynomial-time primality test.

We still do not know whether for every problem having a randomized polynomial-time algorithm there is also a deterministic polynomial-time one. In this talk, we will look at the flagship problem that has a randomized polynomial-time algorithm but for which we do not know whether a deterministic one exists: the so-called polynomial identity testing problem. We are given an arithmetic circuit computing some polynomial, and we want to decide whether this polynomial is identically zero. While there is an easy randomized algorithm, we do not know whether a deterministic polynomial-time algorithm exists, and it can be shown that the existence of such an algorithm implies lower bounds on circuit size, a partial converse to the result of Impagliazzo and Wigderson. In this talk, we will introduce the polynomial identity testing problem and its variants and give an overview of deterministic algorithms for some special cases.
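
The easy randomized algorithm referred to above is an application of the Schwartz-Zippel lemma: a nonzero polynomial of total degree d, evaluated at a point drawn uniformly from S^n, vanishes with probability at most d/|S|, so evaluating the circuit at a few random points suffices. A self-contained sketch over the integers:

import random

def evaluate(gate, point):
    """Evaluate a circuit given as nested ("var", i) / ("const", c) /
    ("+", (g1, g2)) / ("*", (g1, g2)) tuples at the given point."""
    kind, args = gate
    if kind == "var":   return point[args]
    if kind == "const": return args
    left, right = (evaluate(g, point) for g in args)
    return left + right if kind == "+" else left * right

def is_identically_zero(circuit, nvars, degree, trials=20):
    sample = 100 * degree + 1              # |S| much larger than the degree
    for _ in range(trials):
        point = [random.randrange(sample) for _ in range(nvars)]
        if evaluate(circuit, point) != 0:
            return False                   # definitely not the zero polynomial
    return True                            # zero, up to error (degree/sample)**trials

x, y = ("var", 0), ("var", 1)
plus  = lambda a, b: ("+", (a, b))
times = lambda a, b: ("*", (a, b))
const = lambda c: ("const", c)

# (x + y)^2 - x^2 - 2xy - y^2 is identically zero, though the circuit is not.
f = plus(times(plus(x, y), plus(x, y)),
         plus(plus(times(const(-1), times(x, x)),
                   times(const(-2), times(x, y))),
              times(const(-1), times(y, y))))
print(is_identically_zero(f, nvars=2, degree=2))   # True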

Multi-Authority ABE: Constructions and Applications

Beverly Li
Hunan University
SWS Colloquium
23 Sep 2016, 11:00 am - 2:00 pm
Saarbrücken building E1 5, room 029
Attribute-based Encryption (ABE) is a form of asymmetric cryptography that allows encryption over labels named "attributes". In an ABE scheme, an "authority" generates public parameters and secrets and assigns attributes (and associated secrets) to users. Data can be encrypted using formulas over attributes; users can decrypt if they have attribute secrets that satisfy the encryption formula.

In this talk, I will discuss an extension to ABE that allows encryption over attributes provided by multiple authorities. Such a scheme enables secure data sharing between otherwise distrusting organizations. I will discuss example scenarios where multi-authority ABE is useful and describe a new construction of a multi-authority ABE scheme named DMA.

In DMA, a data owner is a first-class principal: users in the system get attributes in cooperation with the data owner and various authorities. Compared to previous work, DMA does not require a global identity for users, nor does it require the multiple authorities to trust a single central authority. DMA is also immune to collusion attacks mounted by users and authorities.
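
The access structure (though not the cryptography) can be sketched in a few lines of Python. Everything below is a hypothetical stand-in: the set-inclusion test replaces the cryptographic machinery of a real ABE scheme, and the policy is a conjunction standing in for a general formula over attributes; the point is only the shape of multi-authority attribute issuance:

class Authority:
    """Issues secrets for the attributes it governs; there are many such."""
    def __init__(self, attrs): self.attrs = set(attrs)
    def issue(self, user, attr):
        assert attr in self.attrs
        user.secrets.add(attr)          # stand-in for a real key share

class User:
    def __init__(self): self.secrets = set()
    def decrypt(self, ct):
        # Succeeds iff the user's attribute secrets satisfy the policy.
        return ct["payload"] if ct["policy"] <= self.secrets else None

hospital = Authority({"doctor", "nurse"})
insurer  = Authority({"auditor"})
alice = User()
hospital.issue(alice, "doctor")         # attributes come from multiple
insurer.issue(alice, "auditor")         # authorities, with no central root
ct = {"policy": {"doctor", "auditor"}, "payload": "record #42"}
print(alice.decrypt(ct))                # alice satisfies the policy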

Useful but ugly games

Ruediger Ehlers
University of Bremen
SWS Colloquium
13 Sep 2016, 10:30 am - 1:30 pm
Kaiserslautern building G26, room 112
simultaneous videocast to Saarbrücken building E1 5, room 029
The controller synthesis problem for CPS is often reduced to solving omega-regular games, with an optional optimization criterion. The criteria commonly used in the literature on omega-regular games are, however, frequently unsuitable for obtaining high-quality CPS controllers, as they are unable to capture many, if not most, real-world optimization objectives. We survey a few such cases and show that such problems can be overcome with more sophisticated optimization criteria. The synthesis problem for these gives rise to ugly games, i.e., games that have complicated definitions but relatively simple solutions.

Towards Privacy-Compliant Mobile Computing

Paarijaat Aditya
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
06 Sep 2016, 5:30 pm - 7:30 pm
Saarbrücken building E1 5, room 005
simultaneous videocast to Kaiserslautern building G26, room 112
Sophisticated mobile computing, sensing, and recording devices like smartphones, life-logging cameras, and Google Glass are carried by their users virtually around the clock. While these devices enable transformative new applications and services, they also introduce entirely new threats to users' privacy, because they can capture a complete record of the user's location, online and offline activities, and social encounters, including an audiovisual record. Such a record of users' personal information is highly sensitive and is subject to numerous privacy risks. In this thesis, we address two specific contexts in which this problem arises: 1) risks to users' personal information introduced by a popular class of apps called mobile social apps, and 2) privacy risks due to ubiquitous digital capture, where bystanders may inadvertently (and/or against their wishes) be captured in photos and videos recorded by other nearby users. Both solutions aim at putting the user back in control of what personal information is being collected and shared, while enabling innovative new applications.

Dynamics of Crowdlearning and Value of Knowledge

Utkarsh Upadhyay
Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
02 Sep 2016, 11:00 am - 1:00 pm
Kaiserslautern building G26, room 517
Learning from the crowd has become increasingly popular on the Web and in social media. There is a wide variety of crowdlearning sites in which, on the one hand, users learn from the knowledge that other users contribute to the site, and, on the other hand, knowledge is reviewed and curated by those same users using assessment measures such as upvotes or likes.

In this talk, I present a probabilistic modeling framework of crowdlearning which uncovers the evolution of a user's expertise over time by leveraging other users' assessments of her contributions. The model allows for both off-site and on-site learning and captures forgetting of knowledge. We then develop a scalable estimation method to fit the model parameters from millions of recorded learning and contributing events.

We show the effectiveness of our model by tracing the activity of ~25 thousand users on Stack Overflow over a 4.5-year period. We find that answers with high knowledge value are rare. Newbies and experts tend to acquire less knowledge than users in the middle range. Prolific learners tend also to be proficient contributors who post answers with high knowledge value.
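
As a toy analogue of the learning-and-forgetting dynamic described above (not the talk's actual probabilistic framework), one can trace an expertise level that jumps at each learning or contributing event and decays exponentially in between:

import math

def expertise_trace(events, lam=0.01):
    """events: (time, gain) pairs sorted by time; lam: forgetting rate."""
    level, t_prev, trace = 0.0, 0.0, []
    for t, gain in events:
        level *= math.exp(-lam * (t - t_prev))   # forgetting since last event
        level += gain                            # learning event
        t_prev = t
        trace.append((t, round(level, 3)))
    return trace

# A user who learns steadily and then pauses for 100 time units:
print(expertise_trace([(0, 1.0), (10, 1.0), (20, 1.0), (120, 1.0)]))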

Domain Specific Languages for Verified Software

Damien Zufferey
MIT
SWS Colloquium
15 Aug 2016, 10:30 am - 2:00 pm
Kaiserslautern building G26, room 111
In this talk, I will show how we can harness the synergy between programming languages and verification methods to help programmers build reliable software, prove complex properties about it, and scale verification to industrial projects. First, I will describe P, a domain-specific language for writing asynchronous event-driven code. P isolates the control structure, or protocol, from the data processing. This allows us not only to generate efficient code but also to test it using model checking techniques. P was used to implement and verify the core of the USB device driver stack that ships with Microsoft Windows 8 and later versions. The language abstractions and verification helped build a driver that is both reliable and fast. Then I will introduce PSync, a domain-specific language for fault-tolerant distributed algorithms that simplifies their implementation, enables automated formal verification, and can be executed efficiently. Fault-tolerant algorithms are notoriously difficult to implement correctly due to asynchronous communication and faults. PSync provides a high-level abstraction by viewing an asynchronous faulty system as a synchronous one with an adversarial environment that simulates faults. We have implemented several important fault-tolerant distributed algorithms in PSync, and we compare the implementation of consensus algorithms in PSync against implementations in other languages.
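
The separation that P enforces between control structure and data processing can be illustrated with an ordinary event-driven state machine. The Python sketch below is schematic (it is not P syntax): the protocol lives in an explicit state-times-event table, the data processing lives in the handlers, and "unhandled event in this state" errors are exactly the kind of defect that model checking hunts for:

class Driver:
    def __init__(self):
        self.state = "stopped"
        self.buffer = []
        # Control structure (the protocol), separate from data processing:
        self.table = {
            ("stopped", "start"): self.on_start,
            ("running", "data"):  self.on_data,
            ("running", "stop"):  self.on_stop,
        }

    def send(self, event, payload=None):
        handler = self.table.get((self.state, event))
        if handler is None:
            raise RuntimeError("unhandled %s in state %s" % (event, self.state))
        handler(payload)

    def on_start(self, _): self.state = "running"
    def on_data(self, d):  self.buffer.append(d)    # the data processing
    def on_stop(self, _):  self.state = "stopped"

d = Driver()
d.send("start"); d.send("data", 7); d.send("stop")
print(d.state, d.buffer)    # stopped [7]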

Cache-Persistence-Aware Response-Time Analysis for Fixed-Priority Preemptive Systems

Geoffrey Nelissen
CISTER, Porto
SWS Colloquium
10 Aug 2016, 10:30 am - 2:00 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
The gap between processor and main-memory operating speeds motivated the introduction of intermediate cache memories to accelerate the average access time to the instructions and data accessed by programs running on the processor. The introduction of cache memories in modern computing platforms causes important variations in the execution time of each task, depending on whether the instructions and data it requires are already loaded in the cache or not. Many works have focused on analyzing the impact of preemptions on the worst-case execution time (WCET) and worst-case response time (WCRT) of tasks in preemptive systems. Indeed, a preempted task may suffer additional cache misses if its memory blocks are evicted from the cache during the execution of preempting tasks. These evictions cause extra accesses to main memory, which result in additional delays in the task's execution. This extra cost is usually referred to as cache-related preemption delay (CRPD).

Several approaches use information about tasks' memory access patterns to bound preemption costs and incorporate them into the WCRT analysis. These approaches all result in pessimistic WCRT bounds because they do not consider the variation in memory demand across successive instances of the same task. They usually assume that the task's useful cache content is completely erased between two of its executions. However, in actual systems, successive instances of a task may reuse most of the data and instructions that were already loaded into the cache during previous executions. In this talk, we will discuss the concept of persistent cache blocks from a task WCRT perspective and present how it can be used to reduce the pessimism of the WCRT analysis for fixed-priority preemptive systems. We will then introduce techniques that exploit this notion of cache persistence to pre-configure systems so as to improve their runtime behavior.
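
For context, the standard CRPD-aware response-time analysis for fixed-priority preemptive tasks solves the recurrence R_i = C_i + sum over higher-priority tasks j of ceil(R_i / T_j) * (C_j + gamma_ij) by fixed-point iteration; the persistence-aware analysis of the talk tightens the gamma terms. A textbook-style sketch with gamma supplied as input (the numbers are made up):

from math import ceil

def wcrt(i, tasks, gamma):
    """tasks: (C, T) pairs sorted by decreasing priority, deadline = period;
    gamma[j]: CRPD charged per preemption by task j (the pessimistic or
    persistence-aware bound goes here)."""
    C_i, T_i = tasks[i]
    r = C_i
    while True:
        r_new = C_i + sum(ceil(r / T) * (C + gamma[j])
                          for j, (C, T) in enumerate(tasks[:i]))
        if r_new > T_i:
            return None          # deadline miss: task set unschedulable
        if r_new == r:
            return r             # fixed point reached
        r = r_new

tasks = [(1, 4), (2, 8), (3, 20)]          # (C, T), highest priority first
print([wcrt(i, tasks, gamma=[0.5, 0.5, 0.5]) for i in range(3)])
# [1, 3.5, 14] -- a smaller gamma (less pessimism) shrinks these bounds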

An overview of MSR-I

Chandu Thekkath
Microsoft Research India
SWS Distinguished Lecture Series
05 Aug 2016, 10:30 am - 2:00 pm
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 111
This talk will briefly cover the overall research agenda of the MSR lab in Bangalore. We work in many broad areas of CS, including algorithms, cryptography, systems, machine learning, and ICT4D, among others. The talk will cover three ongoing projects to give you a sense of the breadth of our work: Trusted Cloud, Green Spaces, and 99DOTS. The goal of the Trusted Cloud project is to explore the challenges of keeping client data stored in the Cloud secure without trusting the Cloud operator; it involves research in computer security, programming languages and verification, and hardware.

The Green Spaces project attempts to understand the implications of using TV spectrum to provide ubiquitous Internet access in countries like India or Brazil where, unlike in the US, there is plenty of unused spectrum that can be tapped. This project involves both CS research questions and national-level policy issues on spectrum allocation.

The 99DOTS project addresses the problem that arises when patients do not adhere to medications as prescribed by their doctors. Such non-adherence has severe health consequences for large populations of patients in all parts of the world. 99DOTS proposes a novel, very cost-effective solution to ensure medication adherence and is used by the Indian government for tuberculosis treatment in all its treatment centers in the country.

Learning-Based Synthesis

Daniel Neider
UCLA
SWS Colloquium
25 Jul 2016, 10:30 am - 1:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
Synthesis, the automatic construction of objects related to hardware and software, is one of the great challenges of computer science. Although synthesis problems are impossible to solve in general, learning-based approaches, in which the synthesis of an object is based on learning from examples, have recently been used to build elegant and extremely effective solutions for a large number of difficult problems. Examples include automatically fixing bugs, translating programs from one language into another, program verification, and the generation of high-level code from given specifications.

This talk gives an introduction to learning-based synthesis. First, we develop a generic view on learning-based synthesis, called abstract learning frameworks for synthesis, which introduces a common terminology to compare and contrast the learning-based synthesis techniques found in the literature. Then we present a learning-based program verifier, which can prove the correctness of numeric programs (nearly) automatically, and show how this technique can be modeled as an abstract learning framework for synthesis. Throughout the talk, we present various examples that highlight the power of the learning-based approach to synthesis.
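
At its core, a learning-based synthesizer is a loop between a learner, which proposes a candidate consistent with the examples seen so far, and a teacher (typically a verifier), which either accepts the candidate or returns a counterexample. A deliberately tiny Python sketch, where the object to synthesize is just a linear function and the specification is an executable oracle:

def teacher(candidate, spec, domain=range(-50, 50)):
    """Verify the candidate on the whole domain; return a counterexample."""
    for x in domain:
        if candidate(x) != spec(x):
            return x
    return None

def learner(examples):
    """Enumerate linear candidates a*x + b consistent with all examples."""
    for a in range(-10, 11):
        for b in range(-10, 11):
            if all(a * x + b == y for x, y in examples):
                return lambda x, a=a, b=b: a * x + b
    raise RuntimeError("no candidate left in the hypothesis space")

spec = lambda x: 2 * x + 3       # the (hidden) specification
examples = []
while True:
    cand = learner(examples)
    cex = teacher(cand, spec)
    if cex is None:
        break                     # candidate verified: synthesis done
    examples.append((cex, spec(cex)))
print("synthesized 2*x + 3 after", len(examples), "counterexamples")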

Algorithmic fairness: a mathematical perspective

Suresh Venkatasubramanian
University of Utah
SWS Colloquium
22 Jul 2016, 2:00 pm - 5:30 pm
Saarbrücken building E1 5, room 029
Machine learning has taken over our world in more ways than we realize. You might get book recommendations, or an efficient route to your destination, or even a winning strategy for a game of Go. But you might also be admitted to college, granted a loan, or hired for a job based on algorithmically enhanced decision-making. We believe machines are neutral arbiters: cold, calculating entities that always make the right decision, that can see patterns that our human minds can't or won't. But are they? Or is decision-making-by-algorithm a way to amplify, extend, and make inscrutable the biases and discrimination that are prevalent in society? To answer these questions, we need to go back - all the way to the original ideas of justice and fairness in society. We also need to go forward - towards a mathematical framework for talking about justice and fairness in machine learning.

Framing Dependencies Introduced by Underground Commoditization

Damon McCoy
NYU
SWS Colloquium
21 Jul 2016, 1:30 pm - 5:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
Internet crime has become increasingly dependent on the underground economy: a loose federation of specialists selling capabilities, services, and resources explicitly tailored to the abuse ecosystem. Through these emerging markets, modern criminal entrepreneurs piece together dozens of à la carte components into entirely new criminal endeavors. In this talk, I'll discuss parts of this ecosystem and show that criminal reliance on this black market introduces fragile dependencies that, if disrupted, undermine entire operations that, as a composite, appear intractable to protect against.