Events 2005

Programming Ad-hoc Networks of Mobile Devices

Ulrich Kremer
Rutgers University
SWS Colloquium
14 Nov 2005, 11:00 am
Saarbrücken building E1 5, room 024
Ad-hoc networks of mobile devices such as smart phones and PDAs represent a new and exciting distributed system architecture. Building distributed applications on such an architecture poses new design challenges in programming models, languages, compilers, and runtime systems. This talk will introduce SpatialViews, a high-level language designed for programming mobile devices connected through a wireless ad-hoc network. SpatialViews allows specification of virtual networks with nodes providing desired services and residing in interesting spaces. These nodes are discovered dynamically with user-specified time constraints and quality of result (QoR). The programming model supports "best-effort" semantics, i.e., different executions of the same program may result in "correct" answers of different quality. It is the responsibility of the compiler and runtime system to produce a high-quality answer for the particular network and resource conditions encountered during program execution.

Example applications will be used to illustrate the different features of the SpatialViews language and to demonstrate the expressiveness of the language and the efficiency of the compiler-generated code. Sample applications include sensor network applications that collect and aggregate sensor data within the network, applications that use dynamic service installation and computation offloading, and augmented-reality gaming. The efficiency of the compiler-generated code is verified through simulation and physical measurements. The reported results show that SpatialViews is an expressive and effective language for ad-hoc networks. In addition, compiler optimizations can significantly improve response times and energy consumption. More information about the language, compiler, and runtime system, including a distribution of our prototype system, can be found at http://www.cs.rutgers.edu/spatialviews .

Coccinelle: A Language-Based Approach to Managing the Collateral Evolution of Linux Device Drivers

Gilles Muller
Ecole des Mines de Nantes
SWS Colloquium
14 Sep 2005, 2:30 pm
Saarbrücken building E1 5, room 24
The Linux operating system is undergoing continual evolution. Evolution in the kernel and generic driver modules often triggers the need for corresponding evolutions in specific device drivers. Such collateral evolutions are tedious, because of the large number of device drivers, and error-prone, because of the complexity of the code modifications involved. We propose an automatic tool, Coccinelle, to aid in the driver evolution process. In this talk, we examine some recent evolutions in Linux and the collateral evolutions they trigger, and assess the corresponding requirements on Coccinelle.

DREAM: a Component Framework for the Construction of Resource-Aware, Dynamically Configurable Communication Middleware

Vivien Quema
Institut National Polytechnique de Grenoble, INRIA Rhône-Alpes, France
SWS Colloquium
08 Sep 2005, 11:00 am - 12:30 pm
Saarbrücken building 46.1 - MPII, room 024
In this talk, we present the work we are conducting at INRIA Rhône-Alpes on the design of a component-based framework for the construction of autonomous systems.

Modern distributed computing systems are becoming increasingly complex. A major trend currently is to build autonomous systems, i.e., systems that reconfigure themselves upon occurrence of events such as software and hardware faults, performance degradation, etc. Building autonomous systems requires both a software technology allowing the development of administrable systems and the ability to build control loops in charge of regulating and optimizing the behavior of the managed system.

In this talk, we will mainly focus on the first requirement, i.e., providing a software technology for the development of administrable systems. We argue that better configurability can be reached through the use of component-based software frameworks. In particular, we present DREAM, a software framework for the construction of message-oriented middleware (MOMs). Several MOMs have been developed in the past ten years. The research work has primarily focused on the support of various non-functional properties like message ordering, reliability, security, scalability, etc. Less emphasis has been placed on MOM configurability. From the functional point of view, existing MOMs implement a fixed programming interface (API) that provides a fixed subset of asynchronous communication models (publish/subscribe, event/reaction, message queues, etc.). From the non-functional point of view, existing MOMs often provide the same non-functional properties for all message exchanges, which reduces their performance. To overcome these limitations, we have developed DREAM (Dynamic REflective Asynchronous Middleware), a component framework for the construction of dynamically reconfigurable communication systems. The idea is to build a middleware as an assembly of interacting components, which can be statically or dynamically configured to meet different design requirements or environment constraints. DREAM provides a component library and a set of tools to build, configure, and deploy middleware implementing various communication paradigms. DREAM defines abstractions and provides tools for controlling the use of resources (i.e., messages and activities) within the middleware. Moreover, it builds upon the Fractal component model, which provides support for hierarchical and dynamic composition. DREAM has been successfully used for building various forms of communication middleware: publish-subscribe (JMS), total order group communication protocols, probabilistic broadcast, asynchronous RPC, etc.

Implementing Declarative Overlays

Timothy Roscoe
Intel Research Berkeley
SWS Colloquium
24 Aug 2005, 3:00 pm
Saarbrücken building E1 4, room 24
Overlay networks are used today in a variety of distributed systems, ranging from file-sharing and storage systems to communication infrastructures. Overlays of various kinds have recently received considerable attention in the networked systems research community, partly due to the availability of the PlanetLab planetary-scale application platform. However, a broader historical perspective is that overlay functionality has implicitly long been a significant component of wide-area distributed systems.

Despite this, designing, building, and adapting these overlays to an intended application and the target environment is a difficult and time-consuming process.

To ease the development and the deployment of such overlay networks, my research group at Intel Berkeley, in conjunction with the University of California at Berkeley, is building P2, a system which uses a declarative logic language to express the overlay networks in a highly compact and reusable form. P2 can express a Narada-style mesh network in 13 rules, and the Chord structured overlay in only 35 rules. P2 directly parses and executes such specifications using a dataflow architecture to construct and maintain the overlay networks. I'll describe the P2 approach, explain how our implementation works, and give some experimental results showing that the performance and robustness of P2 overlays are acceptable.

Latest news about lock-free object implementations

Petr Kouznetsov
EPFL
SWS Colloquium
08 Jul 2005, 10:00 am
Saarbrücken building 46.1 - MPII, room 024
Lock-free implementations of shared data objects do not rely on any form of mutual exclusion and thereby allow processes to overcome adverse operating system effects. Wait-free implementations provide the strongest form of lock-freedom and guarantee that every process can complete its operation regardless of the execution speeds of other processes. They are, in this sense, very appealing, but turn out to be impossible or very expensive to achieve in many practical settings. Recently, researchers suggested a weaker liveness property, called obstruction-freedom, that guarantees progress only when there is no contention, which is argued to be the most common case in practice. However, the notion of contention was very widely interpreted and, as a result, the limitations of implementations ensuring only these weaker properties remained unclear.

We formally define an adequate measure of contention, which we call step contention, and present a generic obstruction-free implementation that ensures progress for step contention-free operations. Our implementation is linear in time and space with respect to the number of concurrent processes. We show that these complexities are asymptotically optimal, and hence generic obstruction-free implementations are inherently expensive.

Measurement-driven Modeling and Design of Internet-scale Systems

Krishna Gummadi
University of Washington
SWS Colloquium
30 May 2005, 5:00 pm
Saarbrücken building 46.1 - MPII, room 024
The Internet is huge, complex, and rapidly evolving. Understanding how today's Internet-scale systems work is challenging, but crucial when designing the networks and applications of tomorrow. In this talk, I will describe how I have used a combination of measurement, modeling, and analysis to understand two Internet-scale systems: (1) peer-to-peer (P2P) file-sharing systems and their workloads and (2) indirection routing systems that recover from Internet path failures.

In part because of the rise in popularity of P2P systems, multimedia workloads have become the dominant source of Internet traffic. Our measurements show that multimedia workloads are substantially different from traditional Web workloads. Based on an analysis of a 6-month long trace of the Kazaa P2P system, I will propose a new model for multimedia workloads and will use it to explain how a few simple, fundamental factors drive them.

In the second part of my talk, I will focus on understanding Internet path failures and indirection-based recovery schemes. I will first characterize the frequency and location of Internet path failures that occur in practice. Using insights drawn from our measurements, I will show how a simple, stateless, and scalable scheme called "one-hop source routing" achieves close to the maximum possible recovery attainable by any indirection routing scheme. I will also relate experiences we gained from implementing and deploying one-hop source routing on PlanetLab.
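
The general idea of recovering from a path failure through a single intermediary can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the speaker's implementation: the function name `one_hop_route`, the `path_is_up` probe, the random choice of intermediaries, and the default of four attempts are all hypothetical. The sketch keeps no per-path state between calls, which is the sense in which it is stateless.

```python
import random

def one_hop_route(src, dst, intermediaries, path_is_up, attempts=4, rng=None):
    """Return the hop list used to reach dst, or None if every attempt fails.

    path_is_up(a, b) is a hypothetical probe that reports whether the
    direct path from a to b currently works.
    """
    rng = rng or random.Random()
    # Common case: the direct path works, so no indirection is needed.
    if path_is_up(src, dst):
        return [src, dst]
    # Failure case: try a few randomly chosen intermediaries (an assumed
    # policy), without remembering anything about earlier failures.
    candidates = [n for n in intermediaries if n not in (src, dst)]
    for hop in rng.sample(candidates, min(attempts, len(candidates))):
        if path_is_up(src, hop) and path_is_up(hop, dst):
            return [src, hop, dst]
    return None
```

For example, if the direct path from "A" to "B" is down while paths through "C" still work, the call returns a three-element route whose middle element is the chosen intermediary.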

OpenDHT: A Public DHT Service

Sean Rhea
UC Berkeley
SWS Colloquium
11 Apr 2005, 4:00 pm
Saarbrücken building 46.1 - MPII, room 024
Large-scale distributed systems are hard to deploy, and distributed hash tables (DHTs) are no exception. To lower the barriers facing DHT-based applications, we have created a public DHT service called OpenDHT. Designing a DHT that can be widely shared, both among mutually-untrusting clients and among a variety of applications, poses two distinct challenges. First, there must be adequate control over storage allocation so that greedy or malicious clients do not use more than their fair share. Second, the interface to the DHT should make it easy to write simple clients, yet be sufficiently general to meet a broad spectrum of application requirements. In this talk, I will describe our solutions to these design challenges. I'll also report on our early deployment experiences with OpenDHT and describe the variety of applications already using the system.
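
The two challenges named above, bounded storage per client and a simple put/get interface, can be illustrated with a small single-process sketch. It is not OpenDHT's actual allocation algorithm or API: the class name `QuotaStore`, the fixed per-client byte quota, and the method signatures are assumptions chosen only to make the idea concrete.

```python
from collections import defaultdict

class QuotaStore:
    """Hypothetical shared key-value store with a fixed byte quota per client."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.used = defaultdict(int)     # bytes stored so far, per client
        self.values = defaultdict(list)  # key -> list of stored values

    def put(self, client, key, value):
        """Store value under key; refuse if the client would exceed its quota."""
        if self.used[client] + len(value) > self.quota_bytes:
            return False
        self.used[client] += len(value)
        self.values[key].append(value)
        return True

    def get(self, key):
        """Return all values stored under key (an empty list if there are none)."""
        return list(self.values.get(key, []))
```

In this sketch, a greedy client that has used up its quota is refused further puts, while other clients can still store values under the same keys.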

A Reboot-based Approach to High Availability

George Candea
Stanford University
SWS Colloquium
07 Apr 2005, 5:00 pm
Saarbrücken building 46.1 - MPII, room 024
Application-level software failures are a dominant cause of outages in large-scale software systems, such as e-commerce, banking, or Internet services. The exact root cause of these failures is often unknown and the only cure is to reboot. Unfortunately, rebooting can be expensive, leading to nontrivial service disruption or downtime even when clusters and failover are employed.

In this talk, I will describe the "crash-only design," a way to build reboot-friendly systems. I will also present the "microreboot," a technique for surgically recovering faulty application components without disturbing the rest. I will argue quantitatively that recovery-oriented techniques complement bug-reduction efforts and provide significant improvements in software dependability. We applied the crash-only design and microreboot technique to a satellite ground station and an Internet auction system. Without fixing any bugs, microrebooting recovered most of the same failures as process restarts, but did so more than an order of magnitude faster and with an order of magnitude savings in lost work.

Simple, cheap recovery engenders a new way of thinking about failure management. First, we can prophylactically microreboot to rejuvenate a software system by parts; this averts failures induced by software aging without ever having to bring the system down. Second, we can mask failure and recovery from end users through transparent call-level retries, turning failures into human-tolerable sub-second latency blips. Finally, having made recovery very cheap, it makes sense to microreboot at the slightest hint of failure -- if the microreboot is indeed necessary, we speed up recovery; if not, the impact is negligible. As a result, we productively employed failure detection based on statistical learning, which reduces false negatives at the cost of more frequent false positives. We also closed the monitor-diagnose-recover loop and built an autonomously recovering Internet service, exhibiting orders of magnitude higher availability than previously possible.
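
The combination of microreboots and transparent call-level retries can be sketched in a few lines. This is a toy model under stated assumptions, not the system described in the talk: the `Component` class, its `healthy` flag, and the `call_with_retry` helper with a default of two retries are all hypothetical names introduced for illustration.

```python
class Component:
    """Hypothetical restartable component whose state is reset by a microreboot."""

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler
        self.healthy = True
        self.reboots = 0

    def call(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is in a failed state")
        return self.handler(request)

    def microreboot(self):
        # Restart only this component; the rest of the system keeps running.
        self.reboots += 1
        self.healthy = True


def call_with_retry(component, request, max_retries=2):
    """Retry a failed call after microrebooting the component it targets."""
    for attempt in range(max_retries + 1):
        try:
            return component.call(request)
        except Exception:
            if attempt == max_retries:
                raise  # give up and surface the failure to the caller
            component.microreboot()
```

A caller that goes through `call_with_retry` sees only a slightly slower response when a single microreboot cures the fault; a fault that survives every retry is still reported.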