
Distinguished Lecture Series
Nick Benton
Microsoft Research Cambridge
The foundations both of first-order imperative languages and of higher-order functional ones have been fairly well understood since the 1970s. But good models and reasoning principles for the mix of higher-order features and dynamically allocated mutable storage have proved hard to find, although most programming languages, from machine code to C# and ML, feature just such a combination. A number of exciting logical and semantic developments, including separation, relational program logics and step-indexing, have now started to let us model, specify and verify programs in these languages. At the same time, advances in mechanized reasoning (both automatic theorem provers and proof assistants) are making it more feasible to apply our theories to realistic programs and systems. In this talk, I'll survey some of these theoretical ideas and show how we have applied them in machine-formalized proofs of compiler correctness, addressing the question of just what it means for a piece of machine code to implement correctly a phrase of a typed high-level language.
02 February 2009, 3:00 PM
Lydia E. Kavraki
Rice University
The field of computing is increasingly expected to solve complex geometric problems arising in the physical world. Such problems can be found in applications ranging from robotics planning for industrial automation to molecular modeling for studying biological processes. This talk will first describe the development of a set of algorithmic tools for robot motion planning which are often grouped under the name sampling-based algorithms. Emphasis will be placed on recent results for systems with increased physical realism and complex dynamics. The talk will then discuss how the experience gained through sampling-based methods in robotics has led to algorithms for characterizing the flexibility of biomolecules for drug discovery. A new trend in Computer Science is presented in this talk. It concerns the development of algorithmic frameworks for addressing complex high-dimensional geometric problems arising, at different scales, in the physical world. The challenges of physical computing will be highlighted as well as the opportunities to impact molecular biology and medicine.
15 December 2008, 2:00 PM
Robert Harper
Carnegie Mellon University
Program verification is difficult, and remains a task for specialists. One way to bring verification into wider use is through the design of rich type systems that guarantee useful safety properties of all programs written in that language. This shifts the burden of verification of these properties from the programmer to the language designer: one theorem about a language induces an instance of that theorem for every program written in that language. Over the last few decades very effective methods have been developed for precisely defining programming languages, the most widely used being type systems and operational semantics. Many advances have been made in language design, and many methods have been developed for proving properties of them. But even small prototype languages are rather unwieldy objects of study, involving many cases and detailed arguments that are difficult to check thoroughly by hand. Consequently, much effort is being concentrated mechanizing the metatheory of programming languages, using a variety of approaches. At Carnegie Mellon we make daily use of the Twelf implementation of the LF logical framework as an essential tool for language design. It has been applied to dozens of small examples, and, recently, to a full-scale language, Standard ML. In this talk I will describe our use of Twelf for mechanizing metatheory, and our ongoing effort to produce a fully machine-checked proof of type safety for Standard ML.
25 November 2008, 2:00 PM
Michael Ernst
MPI-SWS, MIT, and U. of Washington
This talk presents ClearView, a system that automatically creates patches for zero-day exploits: previously unknown security vulnerabilities in COTS software. The patched program survives otherwise fatal attacks, and it provides uninterrupted service both during and after attacks. ClearView first observes normal executions to learn the program's intended behavior. ClearView correlates violations of this behavior with attacks, by using an attack detector and run-time checking of the inferred behavior. ClearView converts the behavior differences into patches that may repair the behavior violation and eliminate the exploited vulnerability. Finally, ClearView dynamically evaluates each patch, distributing the most successful one. The ClearView implementation protects Windows x86 binaries. DARPA hired an external Red Team to evaluate ClearView by attacking a protected system. The Red Team had access to our design and implementation, and spent several months devising attacks that cause the Firefox browser to execute arbitrary code. ClearView prevented all of the attacks from executing malicious code. In 70% of cases, ClearView generated a patch that rendered the attack harmless while preserving application functionality.
29 October 2008, 4:00 PM
Gerard Berry
Esterel Technologies
The synchronous programming model has become a major model in industrial embedded systems design (avionics, railways, SoCs, etc.) because it reconciles concurrency and determinism. For hardware systems, synchrony naturally fits the RTL paradigm and makes it fully rigorous. For software systems,synchrony naturally fits with two classical engineering models informally based on cyclic computations: data-flow block diagrams and control-flow state machines. The initial synchronous languages were dedicated either to data-flow designs (Lustre, Signal, etc.) or to control-oriented designs (Esterel, SyncCharts, etc.). The more recent languages completely unify both point of views: Esterel v7 keeps all the temporal primitives of the initial Esterel language and adds poweful datapath definition mechanisms, while Scade 6 provides its user with an graphical merge of data-flow diagrams and hierarchical state machines. We present the principles of these recent unifications, which are non-trivial: while it is quit easy to merge the drawing styles, it is much harder to do so by respecting the semantic properties of the languages. We show how the new languages simplify application design, and how they also provide formal verification systems with vital static information. We finally discuss new proposals to extend the original synchronous model into mixed synchronous / asynchronous that try to keep non-determinism under control.
26 August 2008, 4:00 PM
Patrick Cousot
ENS, Paris
Static software analysis has known brilliant successes in the small, by proving complex program properties of programs of a few dozen or hundreds of lines, either by systematic exploration of the state space or by interactive deductive methods. To scale up is a definite problem. Very few static analyzers are able to scale up to millions of lines without sacrificing soundness and/or precision. Unsound static analysis may be useful for bug finding but is less useless in safety critical applications where the absence of bugs, at least of some categories of common bugs, should be formally verified. After recalling the basic principles of abstract interpretation including the notions of abstraction, approximation, soundness, completeness, false alarm, etc., we introduce the domain-specific static analyzer ASTRÉE (www.astree.ens.fr) for proving the absence of runtime errors in safety critical real time embedded synchronous software in the large. The talk emphasizes soundness (no runtime error is ever omitted), parametrization (the ability to refine abstractions by options and analysis directives), extensibility (the easy incorporation of new abstractions to refine the approximation), precision (few or no false alarms for programs in the considered application domain) and scalability (the analyzer scales to millions of lines). In conclusion, present-day software engineering methodology, which is based on the control of the design, coding and testing processes should evolve in the near future, to incorporate a systematic control of final software product thanks to domain-specific analyzers that scale up
18 July 2008, 2:00 PM
Rajeev Alur
University of Pennsylvania
Our ability to effectively harness the computational power of multi-processor and multi-core architectures is predicated upon advances in programming languages and tools for developing concurrent software. Recent years have seen intensive research in methods for verifying concurrent software resulting in powerful tools for finding concurrency-related bugs. Almost all of such tools are based on the assumption that the instructions of the same thread are executed according to the program order. This model is called the interleaving model in the verification community, and the sequential consistency model in the computer architecture literature. While this is a commonly accepted language-level memory model, modern multi-processors relax sequential consistency in different ways for performance reasons resulting in weaker models. The goal of our research is to develop tools for analyzing system-level concurrent software along with such details of the underlying architecture. A first step in our research has resulted in a tool called CheckFence. CheckFence analyzes C code for concurrent data types with respect to an axiomatic specification of a memory model. Using a satisfiability solver, for a given client test program, CheckFence searches for discrepancy between operation-level sequential consistency semantics for the data type and concurrent executions feasible with respect to the specified model. We have analyzed a number of benchmarks successfully using CheckFence. Our analysis has revealed a number of potential bugs, and the memory ordering fences needed to fix the bugs. We conclude by discussing research opportunities and challenges for analysis tools that can bridge the gap between the programmers' desire for simplicity of concurrency abstractions and architects' ability to expose hardware parallelism. This talk is based on joint work with Sebastian Burckhardt (Microsoft Research) and Milo Martin (Penn).
18 June 2008, 4:00 PM
Maurice Herlihy
Brown University
While transactional memory promises to ease the task of programming emerging multicore architectures, questions remain concerning how well it scales to long transactions and many cores. In this talk, we identify identify two substantial limitations in the way current proposals handle synchronization and recovery. Synchronization is typically based on read/write conflicts: two transactions conflict if they access the same object (or location) and one access is a write. Recovery is (with some exceptions) typically all-or-nothing: a transaction either commits, and installs its changes, or aborts, and discards its changes. We argue that read-write synchronization and all-or-nothing recovery are not well-suited to environments with long-lived transactions, substantial contention, or both. We describe ongoing research on how Transactional Memory can be extended to alleviate these obstacles to scalability. We describe how to exploit semantic knowledge to enhance concurrency, and how a checkpoint/continuation style of programming can support fine-grained recovery, and a novel application of Bloom filters to detect and avoid deadlocks. Joint work with Eric Koskinen.
16 May 2008, 4:00 PM
Byron Cook
Microsoft Research
Recent research advances now allow us to automatically prove termination and other liveness properties of programs. In cases where the desired property does not hold for all inputs, tools can be used to synthesize a precondition on the inputs under which the property does hold. In this talk I will describe these recent advances and discuss our efforts to apply termination analysis to the problem of proving that device drivers do not hang the Windows operating system.
05 May 2008, 11:00 AM
Thomas Reps
University of Wisconsin and GrammaTech, Inc.
What You See Is Not What You eXecute: computers do not execute source-code programs; they execute machine-code programs that are generated from source code. Not only can the WYSINWYX phenomenon create a mismatch between what a programmer intends and what is actually executed by the processor, it can cause analyses that are performed on source code -- the approach followed by most program-analysis tools -- to fail to detect bugs and security vulnerabilities. To address the WYSINWYX problem, we have developed algorithms to recover information from stripped executables about the memory-access operations that the program performs. These algorithms are used in the CodeSurfer/x86 tool to construct intermediate representations that are used for browsing, inspecting, and analyzing stripped x86 executables. Joint work with G. Balakrishnan (NEC), J. Lim (UW), and T. Teitelbaum (Cornell and GrammaTech, Inc.).
28 April 2008, 4:00 PM
Rupak Majumdar
University of California, L.A.
An asynchronous or event-driven program is one that contains procedure calls which are not directly executed from the callsite, but stored and ``dispatched" in a non-deterministic order by an external scheduler at a later point. Asynchronous programs are at the core of many server programs, embedded systems, and popular programming models for the web (Javascript and AJAX). Asynchronous programs are hard to analyze statically as both the program stack and the number of outstanding asynchronous requests may be unbounded. We describe an algorithm for precise static analysis for asynchronous programs. Our algorithm is a generalization of the interprocedural dataflow analysis framework for sequential programs. Though the problem is theoretically hard, we find that in practice our technique can efficiently analyze programs by exploiting standard optimizations of interprocedural dataflow analyses. (Joint work with Ranjit Jhala and Pierre Ganty.)
11 April 2008, 4:00 PM
Thomas Ball
Microsoft
In this talk, I'll present work by Madan Musuvathi and Shaz Qadeer of my group on radically improving how we test multi-threaded concurrent programs. Using ideas from direct model checking of executables, they have created an automated tool called CHESS that systematically explores the thread schedules of a concurrent program. CHESS incorporates several novel algorithms including iterative context bounding, which prioritizes the search to schedules with fewer context switches first, and fair stateless model checking, which guarantees that the tool will correctly handle programs that depend on fair scheduling to terminate and will find all livelocks in finite state programs. I will demonstrate a version of CHESS I have created for .NET programs and talk about our vision for making debugging of concurrent programs a first-class activity supported by all levels of the software stack. For more information about CHESS, see http://research.microsoft.com/projects/chess/.
04 April 2008, 4:00 PM
Patrice Godefroid
Microsoft Research
About 25 years ago, a new verification paradigm named "model checking" was introduced whereby checking whether a program satisfies a property is done by systematically exploring the program's state space. Since then, model checking has been much discussed in research circles and is viewed as very successful by most academic standards (high citation counts, 2008 Turing Award, etc.). Yet, model checking applied to software is still in its infancy. A first generation of model checkers for finite-state software designs, like SPIN and SMV, where engineered in the 90's. The last decade saw a second generation of software model checkers, like VeriSoft and SLAM, directly applicable to programming languages, such as C, and effective for specific application domains, namely communication protocols and device drivers, respectively. I will argue that a third generation of general-purpose software model checkers is currently emerging. Their foundation is systematic testing. They combine program analysis, testing, model checking and theorem proving. And their "killer app" is security, which makes the improbable corner cases typically found by model checking suddently relevant when they can be triggered by an attacker. This transition is happening at Microsoft where, for the first time, software model checking (albeit still in a weak form) is starting to be deployed on a larger scale for a wide range of data-driven applications. I will talk about these latest developments.
25 March 2008, 4:00 PM
Greg Morrisett
Harvard University
Simultaneous Videocast to MPI, building E1.4, room 019 I want a type system that will let me capture anything from simple type-safety properties, to contracts, and even go all the way to full correctness. Furthermore, the language should have a uniform way to treat models, specifications, types, code, and proofs so that abstraction is feasible. It turns out that Coq (more or less) has such a type system. Unfortunately, it only works on purely functional programs, where effects of any kind are forbidden, including non-termination, mutable state, continuations, I/O, etc. For the past year or two, we've been figuring out how to add these features to Coq. There are some really tricky semantic issues that we had to solve, and a number of implementation issues that we are still working on. We've now written some (mildly) interesting code that leverages these features, including thing like hash-tables and parser combinators. The interfaces capture (partial) correctness, and are designed, in the style of separation logic, with frame properties that make reasoning compositional. Joint work with Aleks Nanevski, Lars Birkedal, Rasmus Petersen, Paul Govereau, Avi Shinnar, and Ryan Wisnesky.
20 March 2008, 4:00 PM
Johannes Gehrke
Cornell University
Computer games and virtual worlds present the next frontier in digital entertainment and social interaction. An important aspect of computer games is the artificial intelligence (AI) of non-player characters. To create interesting AI in games today, we can create complex, dynamic behavior for a very small number of characters, but neither the game engines nor the style of AI programming enables intelligent behavior that scales to a very large number of non-player characters. I will talk about modeling game AI as a data management problem, providing a scalable framework for games with a huge number of non-player characters. I will also discuss applications of this idea to crowd simulations. I will conclude with scalability challenges for Massively Multiplayer Online Games and collaborative environments. This talk describes joint work with Alan Demers, Christoph Koch, and Walker White.
18 January 2008, 2:00 PM
Paul Francis
Cornell University
The large and constantly growing Internet routing table size is a longstanding problem that leads to increased convergence time, increased boot time, and costly equipment upgrades. The problem exists for both VPN and global routing tables, and there is concern that IPv4 address space exhaustion over the next few years may lead to an increasingly fragmented address space, poor aggregation, and therefore a increase in the rate of routing table size. To address these issues, the IETF is working hard on new protocols that will shrink routing tables. In this talk, we present a way to shrink routing tables, easily by an order of magnitude or more, without any new protocols. The idea behind our approach, called Virtual Aggregation, is to partition the address space into large Virtual Prefixes, each of which is delegated to a tunneled virtual network composed of a fraction of ISP routers. Virtual Aggregation can be used independently by a single ISP, or cooperatively among a group of ISPs. This talk describes how Virtual Aggregation can be configured and deployed, and gives performance results based on measurements made at a Tier-I ISP.
12 December 2007, 5:00 PM
Moshe Vardi
Rice University
One of the surprising developments in the area of program verification is how several ideas introduced by logicians in the first part of the 20th century ended up yielding at the start of the 21st century an industrial-standard property-specification language called PSL. This development was enabled by the equally unlikely transformation of the mathematical machinery of automata on infinite words, introduced in the early 1960s for second-order arithmetics, into effective algorithms for model-checking tools. This talk attempts to trace the tangled threads of this development.
20 November 2007, 2:00 PM
Andrew D. Gordon
Microsoft Research
Based on joint work with Karthikeyan Bhargavan, Microsoft Research Iman Narasamdya, University of Manchester We consider the problem of managing server farms largely automatically, in software. Automated management is gaining in importance with the widespread adoption of virtualization technologies, which allow multiple virtual machines per physical host. We consider the case where each server is service-oriented, in the sense that the services it provides, and the external services it depends upon, are explicitly described in metadata. We describe the design, implementation, and formal semantics of a library of combinators whose types record and respect server metadata. Our implementation consists of a typed functional script built with our combinators, in control of a Virtual Machine Monitor hosting a set of virtual machines. Our combinators support a range of operations including creation of virtual machines, their interconnection using typed endpoints, and the creation of intermediaries for tasks such as load balancing. Our combinators also allow provision and reconfiguration in response to events such as machine failures or spikes in demand. We describe a series of programming examples run on our implementation, based on existing server code for order processing, a typical data centre workload. To obtain a formal semantics for any script using our combinators, we provide an alternative implementation of our interface using a small concurrency library. Hence, the behaviour of the script plus our libraries can be interpreted within a statically typed process calculus. Assuming that server metadata is correct, a benefit of typing is that various server configuration errors are detected statically, rather than sometime during the execution of the script.
09 November 2007, 2:00 PM
Edward A. Lee
UC Berkeley
The most widely used concurrent software techniques, which are based on threads, monitors (or approximations to monitors), and semaphores, yield incomprehensible and untestable software. Bugs due to race conditions, timing unpredictability, and potential deadlocks can go undetected for a very long time. Unexpected interactions between even loosely coupled software components can destabilize systems. Yet increased parallelism in general-purpose computing (particularly multicore systems), multi-threaded languages such as Java and C#, increased networking in embedded computing, and a growing diversity of attached hardware requiring specialized device drivers mean that a much greater fraction of software is concurrent. Software designers are poorly equipped for this. They use threads because syntactically, threads change almost nothing. They only later discover that semantically, threads change everything. By then it is too late. Yet there is no shortage of theory; there is a mature community with sound, well-developed techniques that the mainstream largely ignores. How can we change that? In this talk, I will make a case that composition languages, which describe program structure only, can be coupled with concurrent models of computation and conventional imperative languages to form a powerful troika. Such heterogeneous combinations of languages do have a chance for acceptance, and in certain niche situations, have already achieved a measure of acceptance
29 October 2007, 2:00 PM
Jennifer Rexford
Princeton University
The Border Gateway Protocol (BGP) allows an autonomous system (AS) to apply diverse local policies for selecting routes and propagating reachability information to other domains. However, BGP permits ASes to have conflicting policies that can lead to routing instability. This talk proposes a set of guidelines for an AS to follow in setting its routing policies, without requiring coordination with other ASes. Our approach exploits the Internet's hierarchical structure and the commercial relationships between ASes to impose a partial order on the set of routes to each destination. The guidelines conform to conventional traffic-engineering practices of ISPs, and provide each AS with significant flexibility in selecting its local policies. Furthermore, the guidelines ensure route convergence even under changes in the topology and routing policies. Drawing on a formal model of BGP, we prove that following our proposed policy guidelines guarantees route convergence. We also describe how our methodology can be applied to new types of relationships between ASes, how to verify the hierarchical AS relationships, and how to realize our policy guidelines. Our approach has significant practical value since it preserves the ability of each AS to apply complex local policies without divulging its BGP configurations to others. The end of the talk briefly summarizes follow-up studies that have built on this work.
28 September 2007, 11:00 AM
Eric Brewer
UC Berkeley
Moore's Law and the wave of technologies it enabled have led to tremendous improvements in productivity and the quality of life in industrialized nations. Yet, technology has had almost no effect on the other five billion people on our planet. In this talk I argue that decreasing costs of computing and wireless networking make this the right time to spread the benefits of technology, and that the biggest missing piece is a lack of focus on the problems that matter. After covering some example applications that have shown very high impact, I present some our own results, including the use of novel low-cost telemedicine to improve the vision of real people, with over 20,000 patients examined so far. I conclude with some discussion on the role of EECS researchers in this new area.
19 September 2007, 3:00 PM
Henning Schulzrinne,
Columbia University
Simultaneous videocast: Campus of Saarland University, Building E 1.4, room 024 (MPI building) About ten years ago, the notion of ubiquitous computing first appeared, just as the first truly mobile devices became available. Ten years later, the notion of ubiquitous computing as integrating computing into the environment has not quite panned out, with the emphasis shifting to personal, mobile devices. In this talk, I will try to illustrate some of the user-focused challenges that derive from the goals of ubiquitous computing. We have also started to address some aspects of the problem with our work on service and session mobility, as well as attempts to offer core network services, such as email and web access, in partially disconnected environments.
14 November 2006, 3:00 PM
Erik Sandewall,
Department of Computer and Information Science,
Linkoeping University
In the Leonardo project we explore an alternative way of organizing the central software of computers whereby the redundant duplication of concepts and facilities should be greatly reduced. The kernel of the new system inherits traits from both operating system shells, programming language systems, discrete simulation systems, knowledge-based agent systems and others. It integrates a number of facilities that are usually considered as peripheral, such as an ontology structure and a version management facility. Facilities for communication between multiple such agent-like systems for the purpose of task-sharing and information exchange is an essential ingredient in the design. In the talk I will briefly describe the present, experimental implementation and then discuss the project from the following points of view: - Short-term and long-term opportunities for major reform of computer software architecture - Discussion of what types of software systems can with advantage be assimilated into a comprehensive architecture, with particular emphasis on human-robot dialog systems in multi-robot environments - Perspective on textual data languages, such as XML and OWL, as compared with the representation developed in Leonardo - New aspects on logics of actions and change that are obtained when working with the actual system implementation.
16 October 2006, 3:00 PM
Jean-Raymond Abrial,
Department of Computer Science,
ETH Zuerich
The concept of "design pattern" is well known in Object Oriented Technology. The main idea is to have some sort of reproducible engineering micro-design that the software designer can use in order to develop new pieces of software. In this presentation, I try to borrow such OO ideas and incorporate them within the realm of formal methods. First, I will proceed by defining (and prove) two formal patterns, where the second one happens to be a refinement of the first one. As a matter of fact, one very often encounters such patterns in the development of reactive systems, where several chains of reactions can take place between an environment and a software controller, and vice-versa. Second, the patterns are then used many times in the development of a complete reactive system where several pieces of equipment interact with a software controller. It results in a systematic construction made of six refinements. The entire system is proved completely automatically. The relationship between the formal design patterns and the formal development of the problem at hand will be shown to correspond to a certain form of refinement. A demo will be shown.
30 March 2006, 2:00 PM
Matthias Felleisen
Northeastern University, Boston
Large software systems consist of dozens and hundreds of interlocking components. Software engineers adapt components from other suppliers, create their own, and glue all of these together in one large product. In this world, it becomes critical to pinpoint flaws quickly when software fails. Then the component consumer can request a fix from the producer of the faulty component or replace the component with an equivalent component from a different producer. To get to this world, programmers must learn to use interfaces and to enrich them with contractual specifications. Programming language researchers must explore interface-oriented programming in its most radical form and must evaluate its pragmatic consequences. In this talk, I report on our first steps in this direction, presenting empirical findings, research results, research plans, and wild speculations.
27 March 2006, 2:00 PM
Anja Feldmann
TU Munich
As the Internet is a human designed system one should have full knowledge about its operations and how it is used. Yet this is not the case as it is a prime example of a complex distributed software system that is managed in a decentral manner. With regards to how the Internet is managed I will present how to extract an inter-domain topology model that captures the route diversity of the Internet. With regards to how the Internet is used and abused by its users I will present extensions to an network intrusion detection system that enable us to perform dynamic application-layer protocol analysis and show how a time machine can be used for security forensics and network trouble-shooting.
23 March 2006, 2:00 PM
Luca Cardelli
Microsoft Research
Systems Biology is a new discipline aiming to understand the behavior of biological systems as it results from the (non-trivial, "emergent") interaction of biological components. We discuss some biological networks that are characterized by simple components, but by complex interactions. The components are separately described in stochastic pi-calculus, which is a "programming language" that should scale up to description of large systems. The components are then wired together, and their interactions are studied by stochastic simulation. Subtle and unexpected behavior emerges even from simple circuits, and yet stable behavior emerges too, giving some hints about what may be critical and what may be irrelevant in the organization of biological networks.
20 March 2006, 2:00 PM
Rustan Leino
Microsoft Research
Spec# is a programming system that aims to provide programmers with a higher degree of rigor than in common languages today. The Spec# language extends the object-oriented .NET language C#, adding features like non-null types, pre- and postconditions, and object invariants. In addition to static type checking and compiler-emitted run-time checks for specifications, Spec# has a static program verifier. The program verifier translates Spec# programs into verification conditions, which are then analyzed by an automatic theorem prover. In this talk, I will give an overview of Spec#, including a demo. I will then discuss some aspects of its design in more detail.
Career opportunities:
New buildings:

Lectures:
Computing research organizations in the surrounding area: