Recent events

Geometric Complexity Theory: An ambitious approach towards P versus NP

Christian Ikenmeyer
MPI-INF - D1
Joint Lecture Series
06 Sep 2017, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Computational complexity theory is concerned with the study of the inherent complexity of computational problems. Its flagship conjecture is the famous P versus NP conjecture, which is one of the seven Millennium Problems of the Clay Mathematics Institute.

To this day, several thousand computational problems have been classified as NP-complete, i.e., they have a polynomial-time algorithm if and only if P equals NP. The practical importance of the P vs NP conjecture is at least twofold: first, many NP-complete problems are of high practical relevance and have to be solved every day in commercial and scientific applications. Second, if NP turned out to be P, then this would break all of today's cryptographic ciphers.

Geometric complexity theory is an ambitious program initiated in 2001 by Ketan Mulmuley and Milind Sohoni towards solving the P vs NP problem using methods from algebraic geometry and representation theory. During the last few years, there has been a significant amount of research activity related to this program. In this talk, I explain some basic ideas and recent developments.

Combinatorial Constructions for Effective Testing

Filip Niksic
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
14 Aug 2017, 3:00 pm - 4:00 pm
Kaiserslautern building G26, room 113
simultaneous videocast to Saarbrücken building E1 5, room 029
Testing is the predominant approach to finding bugs and gaining assurance in the correctness of concurrent and distributed systems with complex state spaces. In this thesis, we use combinatorial methods to develop new techniques for systematic and randomized testing. We answer several theoretical questions about the size of test sets that achieve full coverage and propose a practical tool for testing actor-based programs.

We start with a general construction, obtained using the probabilistic method from combinatorics, which shows that whenever a random test covers a fixed testing goal with sufficiently high probability, there is a small set of random tests that achieves full coverage with high probability. We use this construction in the context of testing distributed systems under network partition faults: our construction explains why random testing tools in this context achieve good coverage--and hence find bugs--quickly.
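
To see why such small test sets exist, here is a back-of-the-envelope union-bound argument in the spirit of the probabilistic method (my illustration; the thesis states the precise construction). Suppose there are m testing goals and a single random test covers any fixed goal with probability at least p. Drawing k tests independently,

    \Pr[\text{some goal left uncovered}] \;\le\; m\,(1-p)^k \;\le\; m\,e^{-pk},

so k = \lceil \ln(m/\delta)/p \rceil random tests cover all m goals with probability at least 1 - \delta; the size of the test set grows only logarithmically with the number of goals.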

In the context of testing concurrent programs, we introduce the notion of hitting families of schedules. It suffices to test schedules from a hitting family if the goal is to expose bugs of "small depth"--bugs that depend only on the relative ordering of a small number of specific events in the program. We study the size of optimal hitting families and give explicit constructions of small hitting families when the events in the program are partially ordered as a tree. In particular, for balanced trees, our constructions are polylogarithmic in the number of events.
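
As a toy instance of this notion (my sketch, assuming mutually unordered events rather than the tree-ordered case from the thesis): for depth-2 bugs over n independent events, two schedules already form a hitting family, since every ordered pair of events occurs either in a fixed schedule or in its reverse.

    from itertools import permutations

    def occurs_in_order(tup, schedule):
        """True if the events in tup appear in this order in the schedule."""
        pos = {e: i for i, e in enumerate(schedule)}
        return all(pos[a] < pos[b] for a, b in zip(tup, tup[1:]))

    def is_hitting(family, events, depth):
        """A family is d-hitting if every ordering of every d events
        appears as a subsequence of some schedule in the family."""
        return all(any(occurs_in_order(t, s) for s in family)
                   for t in permutations(events, depth))

    events = list(range(8))                    # 8 independent events
    family = [events, list(reversed(events))]  # one order plus its reverse
    assert is_hitting(family, events, depth=2)

For larger depths and for tree-shaped partial orders, exhibiting small families is exactly the combinatorial challenge the thesis addresses.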

Finally, we propose a testing tool for finding bugs of small depth in actor-based programs. The tool is based on an existing randomized scheduler for shared-memory multithreaded programs. Adapting the scheduler to actor-based programs poses an interesting underlying theoretical problem.

Techniques to enforce security policies on untrusted applications

Anjo Vahldiek-Oberwagner
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
07 Aug 2017, 5:00 pm - 6:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 113
As our dependence on ever-present computer systems increases, so does the potential harm when software or hardware deviates from user expectations: users lose data or find it illicitly leaked. Existing reference monitors fail to (1) protect the confidentiality and integrity of persistent data and (2) efficiently and robustly mediate untrusted applications. In this proposal, we present two reference monitors, Guardat and ERIM, targeting these shortcomings, and demonstrate their design, implementation, and evaluation.

The policies protecting persistent data and the mechanisms for their enforcement are typically spread over many software components and configuration files, increasing the risk of policy violation due to bugs, vulnerabilities, and misconfiguration. In Guardat, users, developers, and administrators specify file protection policies declaratively, concisely, and separately from code, and Guardat enforces these policies by mediating I/O in the storage layer. Policy enforcement relies only on the integrity of the Guardat controller and any external policy dependencies. We show experimentally that Guardat's overhead is low.

While Guardat enforces policies at the storage layer, it cannot enforce policies over the in-memory state of untrusted applications. In contrast to existing techniques, ERIM efficiently mediates an application's execution by isolating a reference monitor in the same address space. By using Intel Memory Protection Keys in combination with static binary rewriting, ERIM isolates the monitor's state from strong, malicious adversaries. We propose binary rewriting rules to harden existing executable files and detail use cases in which prior art relied on less robust protection at similar performance.

Towards Use-Ability in Private Data Analytics

Reinhard Munz
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
04 Aug 2017, 1:30 pm - 2:30 pm
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 105
Private data analytics systems should ideally provide the analytic accuracy required by analysts together with the privacy specified for the individuals whose data is analyzed. Devising a general system that works for a broad range of datasets and analytic scenarios has proven to be difficult. Despite the advent of differentially private systems with proven formal privacy guarantees, industry still uses their inferior ad-hoc predecessors, which are able to provide better analytic accuracy. Differentially private mechanisms often need to add large amounts of noise to statistical results, which impairs their use-ability. In my thesis, I follow two approaches to improve the use-ability of private data analytics systems in general and differentially private systems in particular. First, I revisit ad-hoc mechanisms and explore the possibilities of systems that do not provide differential privacy or provide only a weak version thereof. In my second approach, I use the insights gained before to propose UniTraX, the first differentially private analytics system that allows analysts to analyze part of a protected dataset without affecting other parts and without having to give up on guaranteed accuracy.
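
For context, the noise mentioned above typically comes from the Laplace mechanism, a standard differential-privacy building block (sketched here generically; this is not UniTraX itself): a count is released with noise scaled to 1/epsilon, which can drown out the signal when the queried population is small.

    import random

    def laplace(scale):
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

    def private_count(records, predicate, epsilon):
        # A counting query has sensitivity 1 (one individual changes the
        # count by at most 1), so noise of scale 1/epsilon yields epsilon-DP.
        exact = sum(1 for r in records if predicate(r))
        return exact + laplace(1.0 / epsilon)

    ages = [23, 35, 41, 29, 52, 38]
    print(private_count(ages, lambda a: a > 30, epsilon=0.5))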

Fairer and more accurate, but for whom?

Alexandra Chouldechova
CMU
SWS Colloquium
25 Jul 2017, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 005
simultaneous videocast to Kaiserslautern building G26, room 111
Complex statistical models are increasingly being used or considered for use in high-stakes decision-making pipelines in domains such as financial services, health care, criminal justice, and human services. These models are often investigated as possible improvements over more classical tools such as regression models or human judgement. While the modeling approach may be new, the practice of using some form of risk assessment to inform decisions is not. When determining whether a new model should be adopted, it is essential to be able to compare the proposed model to the existing approach across a range of task-relevant accuracy and fairness metrics. In this talk, I will describe a subgroup analysis approach for characterizing how models differ in terms of fairness-related quantities such as racial or gender disparities. I will also talk about an ongoing collaboration with the Allegheny County Department of Health and Human Services on developing and implementing a risk assessment tool for use in child welfare referral screening.
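
As one example of such a quantity (a standard metric, quoted here for orientation rather than as the talk's own definition), the disparity in false positive rates between two groups a and b is

    \Delta_{\mathrm{FPR}} \;=\; \Pr[\hat{Y} = 1 \mid Y = 0,\, A = a] \;-\; \Pr[\hat{Y} = 1 \mid Y = 0,\, A = b],

where \hat{Y} is the model's prediction, Y the true outcome, and A the group attribute; a subgroup analysis compares such quantities between the proposed model and the existing approach.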

https://arxiv.org/abs/1707.00046

http://www.post-gazette.com/local/region/2017/04/09/Allegheny-County-using-algorithm-to-assist-in-child-welfare-screening/stories/201701290002

Decision Making and The Value of Explanation

Kathy Strandburg
NYU Law
SWS Colloquium
07 Jul 2017, 10:00 am - 11:30 am
Saarbrücken building E1 5, room 005
Much of the policy and legal debate about algorithmic decision-making has focused on issues of accuracy and bias. Equally important, however, is the question of whether algorithmic decisions are understandable by human observers: whether the relationship between algorithmic inputs and outputs can be explained. Explanation has long been deemed a crucial aspect of accountability, particularly in legal contexts. By requiring that powerful actors explain the bases of their decisions -- the logic goes -- we reduce the risks of error, abuse, and arbitrariness, thus producing more socially desirable decisions. Decision-making processes employing machine learning algorithms complicate this equation. Such approaches promise to refine and improve the accuracy and efficiency of decision-making, but the logic and rationale behind each decision often remain opaque to human understanding. Indeed, at a technical level it is not clear that all algorithms can be made explainable, and at a normative level it is an open question when and if the costs of making algorithms explainable outweigh the benefits. This presentation will begin to map out some of the issues that must be addressed in determining in which contexts, and under which constraints, machine learning approaches to governmental decision-making are appropriate.

Your photos expose your social life - Social relation recognition in 5 social domains

Qianru Sun
MPI-INF - D4
Joint Lecture Series
05 Jul 2017, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Social relations are the foundation of human social life. Developing techniques to analyze such relations in visual data, such as photos, bears great potential for building machines that understand people better at a social level. Social domain-based theory from social psychology is a great starting point for systematically approaching social relation recognition: the theory provides coverage of all aspects of social relations and, equally, is concrete and predictive about the visual attributes and behaviors that define the relations in each social domain. We propose the first photo dataset built on this holistic conceptualization of social life, composed of a hierarchical label space of social domains and social relations, and contribute the first models to recognize such domains and relations, finding superior performance for attribute-based features. Beyond the encouraging performance, we find interpretable features that are in accordance with predictions from the social psychology literature. This interleaving of visual recognition and social psychology theory has the potential to complement theoretical work in the area with empirical and data-driven models of social life.

Finding Fake News

Giovanni Luca Ciampaglia
Indiana University Network Science Institute
SWS Colloquium
26 Jun 2017, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
Two-thirds of all American adults access the news through social media. But social networks and social media recommendations lead to information bubbles, and personalization and recommendations, by maximizing the click-through rate, lead to ideological polarization. Consequently, rumors, false news, conspiracy theories, and now even fake news sites are an increasingly worrisome phenomenon. While media organizations (Snopes.com, PolitiFact, FactCheck.org, et al.) have stepped up their efforts to verify news, political scientists tell us that fact-checking efforts may be ineffective or even counterproductive. To address some of these challenges, researchers at Indiana University are working on an open platform for the automatic tracking of both online fake news and fact-checking on social media. The goal of the platform, named Hoaxy, is to reconstruct the diffusion networks induced by hoaxes and their corrections as they are shared online and spread from person to person.

Discrimination in Online Advertising

Till Speicher
Max Planck Institute for Software Systems
SWS Student Defense Talks - Qualifying Exam
14 Jun 2017, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 029
Online targeted advertising has been the subject of many complaints for allowing advertisers to discriminate against users belonging to sensitive groups, which has led, for instance, Facebook to disallow the use of sensitive features such as ethnic affinity in certain types of ads. In this talk, I argue that such measures are not sufficient and that the problem of discrimination in targeted advertising is much more pernicious. I show how a malicious advertiser can create discriminatory ads without using sensitive features. I argue that discrimination measures should be based on the targeted population and not on the attributes used for targeting, and I propose ways to mitigate discriminatory targeting.

Comprehensive and Practical Policy Compliance in Data Retrieval Systems

Eslam Elnikety
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
14 Jun 2017, 1:00 pm - 2:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 113
Data use policies govern how data retrieval systems process data items coming from many different sources, each subject to its own integrity and confidentiality requirements. Ensuring compliance with these requirements despite bugs, misconfigurations, and operator errors in a large, complex, and fast-evolving system is a major challenge.

In this thesis, we present comprehensive and practical compliance systems to address this challenge. To be comprehensive, compliance systems must be able to enforce policies specific to individual data items or to a particular client's data items, the service provider's own policies, and policies that capture legal requirements. To be practical, compliance systems need to meet stringent requirements: runtime overhead must be low; existing applications must run with few modifications; and bugs, misconfigurations, or compromises in application code, as well as actions by unprivileged operators, must not be able to violate policies.

We present the design and implementation of two comprehensive and practical compliance systems: Thoth and Shai. At a high level, data use policies are stated in a declarative language separate from application code, and a small reference monitor ensures compliance with these policies. Thoth and Shai differ in their enforcement techniques. Thoth tracks data flows through the system at runtime by intercepting I/O at process boundaries and enforces the associated policies. Shai, on the other hand, combines static flow analysis and lightweight runtime monitoring (sandboxes and capabilities) to ensure the compliance of data flows. We demonstrate the practicality of these systems using a prototype search engine based on the popular Apache Lucene.
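
To give a flavor of this architecture (a deliberately minimal sketch with invented policy syntax and names; neither Thoth's nor Shai's actual language), a reference monitor mediating reads might look like this:

    # Declarative per-item policies, kept separate from application code.
    POLICIES = {
        "email_index":  {"readers": {"search_core"}},              # provider policy
        "user_42_docs": {"readers": {"search_core", "user_42"}},   # per-client policy
    }

    def monitored_read(item, principal):
        """The reference monitor: every read passes through this check."""
        policy = POLICIES.get(item)
        if policy is None or principal not in policy["readers"]:
            raise PermissionError(f"{principal} may not read {item}")
        return read_from_store(item)

    def read_from_store(item):            # stand-in for the real data store
        return f"<contents of {item}>"

    print(monitored_read("user_42_docs", "search_core"))  # allowed
    # monitored_read("user_42_docs", "crawler") raises PermissionError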

Towards an Approximating Compiler for Numerical Computations

Eva Darulova
Max Planck Institute for Software Systems
Joint Lecture Series
07 Jun 2017, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Computing resources are fundamentally limited, and sometimes an exact solution may not even exist. Thus, when implementing real-world systems, approximations are inevitable, as are the errors they introduce. The magnitude of these errors is problem-dependent, but higher accuracy generally comes at a cost in terms of memory, energy, or runtime, effectively creating an accuracy-efficiency tradeoff. To take advantage of this tradeoff, we need to ensure that the computed results are sufficiently accurate; otherwise we risk disastrously incorrect results or system failures. Unfortunately, the current way of programming with approximations is mostly manual and is consequently costly, error-prone, and often produces suboptimal results.

In this talk, I will present our vision of, and efforts so far towards, an approximating compiler for numerical computations. Such a compiler would take as input exact high-level code with an accuracy specification and automatically synthesize an approximated implementation that is as efficient as possible while verifiably computing sufficiently accurate results.
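
A tiny numpy experiment illustrates the underlying tradeoff (my example, not the compiler): the same summation carried out in single precision uses half the memory but accumulates visibly more rounding error than in double precision.

    import numpy as np

    xs = np.linspace(0.0, 1.0, 1_000_000)

    s32 = np.sum(xs.astype(np.float32), dtype=np.float32)  # cheaper, less accurate
    s64 = np.sum(xs.astype(np.float64))                    # costlier, more accurate

    print(abs(float(s32) - s64))  # rounding error of the single-precision run

An approximating compiler would explore such choices automatically and pick the cheapest implementation that still meets the user's accuracy specification.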

Quantifying and Reducing Polarization on Social Media

Kiran Garimella
Aalto University
SWS Colloquium
10 May 2017, 9:45 am - 11:15 am
Saarbrücken building E1 5, room 005
Social media has brought a revolution in how people get exposed to information and how they consume news. Beyond the undoubtedly large number of advantages and capabilities brought by social media platforms, a point of criticism has been the creation of filter bubbles or echo chambers, caused by social homophily as well as by algorithmic personalization and recommendation in content delivery. In this talk, I will present the methods we developed to (i) detect and quantify the existence of polarization on social media, (ii) monitor the evolution of polarization over time, and finally (iii) overcome the effects caused by increased polarization. We build on top of existing studies and ideas from social science, combined with principles from graph theory, to design algorithms that are language-independent, domain-agnostic, and scalable to large numbers of users.
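
A crude way to picture the quantification step (a toy proxy for illustration only, not the random-walk-based measures from this line of work): in a conversation graph where each user takes one of two sides, a scarcity of edges crossing the two sides signals an echo-chamber structure.

    def cross_edge_fraction(edges, side):
        """Fraction of edges between the two sides: values near 0 suggest
        two echo chambers, values near 0.5 a well-mixed debate."""
        crossing = sum(1 for u, v in edges if side[u] != side[v])
        return crossing / len(edges)

    edges = [("a", "b"), ("b", "c"), ("a", "c"),   # side-1 clique
             ("x", "y"), ("y", "z"), ("x", "z"),   # side-2 clique
             ("c", "x")]                           # a single bridge
    side = {"a": 1, "b": 1, "c": 1, "x": 2, "y": 2, "z": 2}
    print(cross_edge_fraction(edges, side))        # 1/7: highly polarized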

An Effectful Way to Eliminate Addiction to Dependence

Pierre-Marie Pédrot
University of Ljubljana
SWS Colloquium
08 May 2017, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 111
We define a syntactic monadic translation of type theory, called the weaning translation, that allows for a large range of effects in dependent type theory, such as exceptions, non-termination, non-determinism, or writing operations. Through the lens of a call-by-push-value decomposition, we explain why the traditional approach fails with type dependency and justify our approach. Crucially, the construction requires that the universe of algebras of the monad itself forms an algebra. The weaning translation applies to a version of the Calculus of Inductive Constructions (CIC) with a restricted version of dependent elimination, dubbed Baclofen Type Theory, which we conjecture is the proper generic way to mix effects and dependence. This provides the first effectful version of CIC, which can be implemented as a Coq plugin.

Intelligent Control Systems

Sebastian Trimpe
Max Planck Institute for Intelligent Systems
SWS Colloquium
04 May 2017, 11:00 am - 12:00 pm
Kaiserslautern building G26, room 113
simultaneous videocast to Saarbrücken building E1 5, room 029
Thanks to modern computer and data technology, we can today collect, store, process, and share more data than ever before. This data revolution opens fundamentally new ways to think about the classical concept of feedback control as a basis for building future (artificial) intelligent systems that interact with the physical world. In this talk, I will provide an overview of our recent research on intelligent control systems, which leverages machine learning and modern communication networks for control. I will present algorithms that enable systems to (i) autonomously learn from data, (ii) interconnect in cooperative networks, and (iii) use their resources efficiently. Throughout the talk, the developed algorithms and theory are highlighted with experiments on humanoid robots and a self-balancing dynamic sculpture.

Digital Knowledge: From Facts to Rules and Back

Daria Stepanova
MPI-INF - D5
Joint Lecture Series
03 May 2017, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
Knowledge Graphs (KGs) are huge collections of primarily encyclopedic facts that are automatically extracted from the Web. Prominent examples of KGs include YAGO, DBpedia, and the Google Knowledge Graph. We all use KGs when posing simple queries like "capital of Saarland" to Google. Internally, such queries are translated into machine-readable representations, which are then issued against the KG stored at the backend of the search engine. Instead of syntactically relevant Web pages, the actual answer to the above query, "Saarbrücken", is then output to the user as a result.
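
Conceptually, such a query is a lookup over subject-predicate-object triples (a minimal sketch of the idea, not an actual KG backend):

    # A knowledge graph as a set of (subject, predicate, object) triples.
    KG = {
        ("Saarland", "hasCapital", "Saarbrücken"),
        ("Saarland", "locatedIn", "Germany"),
        ("Saarbrücken", "locatedIn", "Saarland"),
    }

    def query(subject, predicate):
        return [o for s, p, o in KG if s == subject and p == predicate]

    print(query("Saarland", "hasCapital"))  # ['Saarbrücken']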

However, since KGs are automatically constructed, they are often inaccurate and incomplete. In this talk, I will investigate how deductive and inductive reasoning services can be used to address these crucially important issues. More specifically, I will first present an approach for repairing inconsistencies in hybrid logical systems that can be built on top of KGs. Second, I will describe a method for inductively learning rules with exceptions from KGs and show how these rules are applied to derive missing facts.
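
A learned rule with an exception might, for instance, say that people live where they were born unless the KG records a move (an invented toy rule; the talk is about learning such rules automatically):

    def derive_lives_in(born_in, moved):
        """Apply  livesIn(x, y) <- bornIn(x, y), not moved(x)
        to derive facts that are missing from the KG."""
        return {(x, y) for x, y in born_in if x not in moved}

    born_in = {("alice", "Saarbrücken"), ("bob", "Köln")}
    moved = {"bob"}                          # the exception
    print(derive_lives_in(born_in, moved))   # {('alice', 'Saarbrücken')}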

On Rationality of Nonnegative Matrix Factorization

James Worrell
University of Oxford
SWS Distinguished Lecture Series
02 May 2017, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 002
simultaneous videocast to Kaiserslautern building G26, room 112
The nonnegative rank of a nonnegative m x n matrix M is the smallest number d such that M can be written as the product M = WH of a nonnegative m x d matrix W and a nonnegative d x n matrix H. The notions of nonnegative rank and nonnegative matrix factorization have a wide variety of applications, including bioinformatics, computer vision, communication complexity, document clustering, and recommender systems. A longstanding open problem is whether, when M is a rational matrix, the factors W and H in a rank decomposition M = WH can always be chosen to be rational. In this talk, we resolve this problem negatively and discuss consequences of this result for the computational complexity of computing the nonnegative rank.

This is joint work with Dmitry Chistikov, Stefan Kiefer, Ines Marusic, and Mahsa Shirmohammadi.
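
For a concrete feel of nonnegative factorization (a standard multiplicative-update heuristic, orthogonal to the rationality question the talk resolves):

    import numpy as np

    def nmf(M, d, iters=500, eps=1e-9):
        """Lee-Seung multiplicative updates: W and H stay nonnegative
        while the Frobenius error ||M - WH|| decreases."""
        rng = np.random.default_rng(0)
        W = rng.random((M.shape[0], d))
        H = rng.random((d, M.shape[1]))
        for _ in range(iters):
            H *= (W.T @ M) / (W.T @ W @ H + eps)
            W *= (M @ H.T) / (W @ H @ H.T + eps)
        return W, H

    M = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
    W, H = nmf(M, d=2)
    print(np.linalg.norm(M - W @ H))  # small residual

The talk's negative result shows that even for a rational M, exact factors W and H achieving the nonnegative rank may be forced to contain irrational entries.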

Securing enclaves with formal verification

Andrew Baumann
Microsoft Research, Redmond
SWS Colloquium
26 Apr 2017, 4:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 112
Moore's Law may be slowing, but, perhaps as a result, other measures of processor complexity are only accelerating. In recent years, Intel's architects have turned to an alphabet soup of instruction set extensions such as MPX, SGX, MPK, and CET as a way to sell CPUs through new security features. SGX, in particular, promises powerful security: user-mode "enclaves" protected against both physical attacks and privileged software adversaries. To achieve this, SGX's designers implemented an isolation mechanism approaching the complexity of an OS microkernel in the ISA, using an inscrutable mix of silicon and microcode. However, while CPU-based security mechanisms are harder to attack, they are also harder to deploy and update, and a number of significant flaws have already been identified. Worse, the availability of new SGX features depends on the slow deployment of new CPUs.

In this talk, I'll describe an alternative approach to providing SGX-like enclaves: decoupling the underlying hardware mechanisms, such as memory encryption, address-space isolation, and attestation, from a privileged software monitor that implements enclaves. The monitor's trustworthiness is guaranteed by a machine-checked proof of both functional correctness and high-level security properties. The ultimate goal is to achieve security equivalent to or better than SGX's while decoupling enclave features from the underlying hardware.

Comprehensive deep linking for mobile apps

Oriana Riva
Microsoft Research, Redmond
SWS Colloquium
10 Apr 2017, 10:30 am
Kaiserslautern building G26, room 112
simultaneous videocast to Saarbrücken building E1 5, room 005
Web deep links are instrumental to many fundamental user experiences, such as navigating to one web page from another, bookmarking a page, or sharing it with others. Such experiences are not possible with individual pages inside mobile apps, since historically mobile apps did not have links equivalent to web deep links. Mobile deep links, introduced in recent years, still lack many important properties of web deep links. Unlike web links, mobile deep links must be explicitly built into apps by developers, cover only a small number of predefined pages, and are defined statically to navigate to a page for a given link, but not to dynamically generate a link for a given page. Another problem with state-of-the-art deep links is that, once exposed, they are hard to discover, which limits their usage in both first- and third-party experiences.

In this talk, I'll give an overview of two new deep-linking mechanisms that address these problems. First, we implemented an application library that transparently tracks the data and UI-event dependencies of app pages and encodes this information in links to those pages; when a link is invoked, the information is used to recreate the target page quickly and accurately. Second, using static and dynamic analysis, we prototyped a tool that can automatically discover the links that an app exposes; in addition, it can discover many links that are not explicitly exposed. The goal is to obtain links to every page in an app automatically and precisely.
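
The first mechanism can be pictured as serializing a page's input dependencies into the link itself (a simplified sketch with invented field names, not the library's actual format):

    from urllib.parse import urlencode, parse_qsl

    def make_deep_link(app, page, state):
        # Encode the inputs needed to rebuild the page into the link.
        return f"{app}://{page}?{urlencode(state)}"

    def open_deep_link(link):
        page, _, query = link.partition("://")[2].partition("?")
        return page, dict(parse_qsl(query))  # enough to recreate the page

    link = make_deep_link("shopapp", "product", {"id": "42", "tab": "reviews"})
    print(open_deep_link(link))  # ('product', {'id': '42', 'tab': 'reviews'})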

Graph Decomposition Problems in Image Analysis

Björn Andres
MPI-INF - D2
Joint Lecture Series
05 Apr 2017, 12:15 pm - 1:15 pm
Saarbrücken building E1 5, room 002
A large part of image analysis is about breaking things into pieces. Decompositions of a graph are a mathematical abstraction of the possible outcomes. Breaking things (like an image itself) into pieces in a controlled way is hard; optimization problems whose feasible solutions define decompositions of a graph are a mathematical abstraction of this task. One example is the correlation clustering problem, whose feasible solutions relate one-to-one to the decompositions of a graph and whose objective function puts a cost or reward on neighboring nodes ending up in distinct components. This talk shows work by the Combinatorial Image Analysis research group that applies this problem, and proposed generalizations of it, to diverse image analysis tasks. It sketches the algorithms introduced by the group for finding feasible solutions for large instances in practice, solutions that are often superior in the metrics of application-specific benchmarks. Finally, it sketches the algorithm introduced by the group for finding lower bounds and points to new findings and open problems of polyhedral geometry in this context.
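
For orientation, the standard integer-programming formulation of correlation clustering (in its minimum-cost-multicut form) over a graph G = (V, E) with edge costs c_e reads:

    \min_{x \in \{0,1\}^{E}} \; \sum_{e \in E} c_e\, x_e
    \quad \text{subject to} \quad
    x_e \;\le\; \sum_{f \in C \setminus \{e\}} x_f
    \quad \text{for every cycle } C \text{ of } G \text{ and every } e \in C,

where x_e = 1 means that the endpoints of e land in distinct components; the cycle inequalities guarantee that the set of cut edges is exactly the set of edges crossing the components of some decomposition.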

Local Reasoning for Concurrency, Distribution and Web Programming

Azalea Raad
Imperial College
SWS Colloquium
03 Apr 2017, 10:00 am - 11:00 am
Kaiserslautern building G26, room 111
simultaneous videocast to Saarbrücken building E1 5, room 029
In this talk, I will present my research on developing local reasoning techniques in both concurrent and sequential settings.

On the concurrency side, I'll present my work on the program logic CoLoSL (Concurrent Local Subjective Logic) and its application to various fine-grained concurrent algorithms. A key difficulty in verifying concurrent programs is reasoning compositionally about each thread in isolation. CoLoSL is the first program logic to introduce general composition and framing of interference relations (describing how shared resources may be manipulated by each thread), in the spirit of resource composition and framing in separation logic. This in turn enables local reasoning and allows for more concise specifications and proofs.
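
For context, the classical separation-logic rules whose compositional spirit CoLoSL lifts to interference relations are the frame and parallel-composition rules (standard rules, quoted for orientation):

    \frac{\{P\}\; c \;\{Q\}}
         {\{P * R\}\; c \;\{Q * R\}} \;(\textsc{Frame})
    \qquad
    \frac{\{P_1\}\; c_1 \;\{Q_1\} \qquad \{P_2\}\; c_2 \;\{Q_2\}}
         {\{P_1 * P_2\}\; c_1 \parallel c_2 \;\{Q_1 * Q_2\}} \;(\textsc{Par})

where * composes disjoint resources, so each thread can be verified against only the resources it owns.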

I will then present preliminary work on extending CoLoSL to reason about distributed database applications running under the snapshot isolation (SI) consistency model. SI is a commonly used consistency model for transaction processing, implemented by most distributed databases. Existing work focuses on the semantics of SI; verification techniques for client-side applications remain unexplored. To fill this gap, I am looking into extending CoLoSL towards a program logic for client-side reasoning under SI.

On the sequential side, I'll briefly discuss my work on the specification and verification of web programs. My research in this area includes a compositional specification of the DOM (Document Object Model) library in separation logic; the integration of this DOM specification with the JaVerT (JavaScript Verification Toolchain) framework for automated DOM/JavaScript client verification; as well as ongoing work towards extending JaVerT to reason about higher-order JavaScript programs.