Yan Chen

Max Planck Institute for Software Systems
Paul-Ehrlich-Strasse Building G 26
67663 Kaiserslautern, Germany
Email: chenyan@mpi-sws.org
Tel: +49 (631) 9303-9614

Welcome


I am a Ph.D. student at MPI-SWS, advised by Umut Acar.

My research focuses on self-adjusting computation. Broadly, I am interested in problems arising in program analysis, dynamic/parallel computation, and applied algorithms.

Please see my CV for more information.

Publications


  1. Implicit Self-Adjusting Computation for Purely Functional Programs

    Journal of Functional Programming (JFP). 2014
    Yan Chen, Joshua Dunfield, Matthew A. Hammer, Umut A. Acar
    Computational problems that involve dynamic data, such as physics simulations and program development environments, have been an important subject of study in programming languages. Building on this work, recent advances in self-adjusting computation have developed techniques that enable programs to respond automatically and efficiently to dynamic changes in their inputs. Self-adjusting programs have been shown to be efficient for a reasonably broad range of problems but the approach still requires an explicit programming style, where the programmer must use specific monadic types and primitives to identify, create and operate on data that can change over time.

    We describe techniques for automatically translating purely functional programs into self-adjusting programs. In this implicit approach, the programmer need only annotate the (top-level) input types of the programs to be translated. Type inference finds all other types, and a type-directed translation rewrites the source program into an explicitly self-adjusting target program. The type system is related to information-flow type systems and enjoys decidable type inference via constraint solving. We prove that the translation outputs well-typed self-adjusting programs and preserves the source program's input-output behavior, guaranteeing that translated programs respond correctly to all changes to their data. Using a cost semantics, we also prove that the translation preserves the asymptotic complexity of the source program.

    This is an expanded and corrected version of our ICFP 2011 paper.
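
    To make the contrast with the explicit style concrete, the deliberately tiny Standard ML sketch below shows the kind of program the translation targets: changeable data lives in modifiable cells, reads register dependencies, and writes propagate to the registered readers. The structure and its names (Toy, modref, read, change) are illustrative only and are not the paper's actual library or translation output.

      (* a toy sketch of the explicit self-adjusting style, not the paper's library *)
      structure Toy =
      struct
        (* a changeable value: its current contents plus the readers depending on it *)
        datatype 'a modref = Mod of 'a ref * (unit -> unit) list ref

        fun create v = Mod (ref v, ref [])

        (* read: apply f to the current value and remember to re-run it on change *)
        fun read (Mod (v, readers)) f =
            (f (!v); readers := (fn () => f (!v)) :: !readers)

        (* change: write a new value and propagate it to every registered reader *)
        fun change (Mod (v, readers)) v' =
            (v := v'; List.app (fn run => run ()) (!readers))
      end

      (* usage: keep an output cell consistent with a changeable input *)
      val input  = Toy.create 3
      val output = ref 0
      val () = Toy.read input (fn x => output := x * x)   (* output is now 9 *)
      val () = Toy.change input 5                          (* output becomes 25 *)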
  2. Streaming Big Data with Self-Adjusting Computation

    Data Driven Functional Programming Workshop (DDFP'13)
    Yan Chen, Umut A. Acar
    Many big data computations involve processing data that changes incrementally or dynamically over time. Using existing techniques, such computations quickly become impractical. For example, computing the frequency of words in the first ten thousand paragraphs of a publicly available Wikipedia data set in a streaming fashion using MapReduce can take as much as a full day. In this paper, we propose an approach based on self-adjusting computation that can dramatically improve the efficiency of such computations. As an example, we can perform the aforementioned streaming computation in just a couple of minutes.
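
    As a rough illustration of the incremental pattern (written by hand here; the paper obtains it automatically through self-adjusting computation), the Standard ML sketch below folds only the newly arrived text into the existing word counts instead of recounting the whole stream. The function names are hypothetical.

      (* fold one chunk of text into an association list of word counts *)
      fun addWords (text, counts) =
          let
            fun bump (w, []) = [(w, 1)]
              | bump (w, (w', n) :: rest) =
                  if w = w' then (w', n + 1) :: rest else (w', n) :: bump (w, rest)
            val ws = String.tokens Char.isSpace text
          in
            List.foldl bump counts ws
          end

      (* from scratch: re-fold the entire stream whenever a paragraph arrives;
         incremental: fold only the new paragraph into the counts built so far *)
      val counts0 = addWords ("the quick brown fox", [])
      val counts1 = addWords ("the lazy dog", counts0)   (* touches only three words *)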
  3. Type-Directed Automatic Incrementalization

    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'12)
    Yan Chen, Joshua Dunfield, Umut A. Acar
    Application data often changes slowly or incrementally over time. Since incremental changes to input often result in only small changes in output, it is often feasible to respond to such changes asymptotically more efficiently than by re-running the whole computation. Traditionally, realizing such asymptotic efficiency improvements requires designing problem-specific algorithms known as dynamic or incremental algorithms, which are often significantly more complicated than conventional algorithms to design, analyze, implement, and use. A long-standing open problem is to develop techniques that automatically transform conventional programs so that they correctly and efficiently respond to incremental changes.

    In this paper, we describe a significant step towards solving the problem of automatic incrementalization: a programming language and a compiler that can, given a few type annotations describing what can change over time, compile a conventional program that assumes its data to be static (unchanging over time) to an incremental program. Based on recent advances in self-adjusting computation, including a theoretical proposal for translating purely functional programs to self-adjusting programs, we develop techniques for translating conventional Standard ML programs to self-adjusting programs. By extending the Standard ML language, we design a fully featured programming language with higher-order features, a module system, and a powerful type system, and implement a compiler for this language. The resulting programming language, LML, enables translating conventional programs decorated with simple type annotations into incremental programs that can respond to changes in their data correctly and efficiently.

    We evaluate the effectiveness of our approach by considering a range of benchmarks involving lists, vectors, and matrices, as well as a ray tracer. For these benchmarks, our compiler incrementalizes existing code with only trivial amounts of annotation. The resulting programs are often asymptotically more efficient, leading to orders of magnitude speedups in practice.
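
    A small worked example of why incremental response can be asymptotically cheaper than re-execution: when a single input element changes, an aggregate such as a sum can be repaired in constant time instead of being recomputed in linear time. The hand-written Standard ML sketch below shows the kind of update the compiler described above derives automatically; the names are illustrative.

      (* from scratch: summing n elements costs O(n) after every change *)
      fun sumList xs = List.foldl op+ 0 xs

      (* incremental: if one element changes, the sum can be repaired in O(1) *)
      fun repairSum (total, oldValue, newValue) = total - oldValue + newValue

      val xs     = [1, 2, 3, 4, 5]
      val total  = sumList xs                  (* 15, one full pass *)
      val total' = repairSum (total, 3, 10)    (* 22, without re-summing *)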
  4. Self-Adjusting Stack Machines

    ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'11)
    Matthew A. Hammer, Georg Neis, Yan Chen, Umut A. Acar
    Self-adjusting computation offers a language-based approach to writing programs that automatically respond to dynamically changing data. Recent work made significant progress in developing sound semantics and associated implementations of self-adjusting computation for high-level, functional languages. These techniques, however, do not address issues that arise for low-level languages, i.e., stack-based imperative languages that lack strong type systems and automatic memory management.

    In this paper, we describe techniques for self-adjusting computation which are suitable for low-level languages. Necessarily, we take a different approach than previous work: instead of starting with a high-level language with additional primitives to support self-adjusting computation, we start with a low-level intermediate language, whose semantics is given by a stack-based abstract machine. We prove that this semantics is sound: it always updates computations in a way that is consistent with full reevaluation. We give a compiler and runtime system for the intermediate language used by our abstract machine. We present an empirical evaluation that shows that our approach is efficient in practice, and performs favorably compared to prior proposals.
  5. Implicit Self-Adjusting Computation for Purely Functional Programs

    ACM SIGPLAN International Conference on Functional Programming (ICFP'11)
    Yan Chen, Joshua Dunfield, Matthew A. Hammer, Umut A. Acar
    Computational problems that involve dynamic data, such as physics simulations and program development environments, have been an important subject of study in programming languages. Building on this work, recent advances in self-adjusting computation have developed techniques that enable programs to respond automatically and efficiently to dynamic changes in their inputs. Self-adjusting programs have been shown to be efficient for a reasonably broad range of problems but the approach still requires an explicit programming style, where the programmer must use specific monadic types and primitives to identify, create and operate on data that can change over time.

    We describe techniques for automatically translating purely functional programs into self-adjusting programs. In this implicit approach, the programmer need only annotate the (top-level) input types of the programs to be translated. Type inference finds all other types, and a type-directed translation rewrites the source program into an explicitly self-adjusting target program. The type system is related to information-flow type systems and enjoys decidable type inference via constraint solving. We prove that the translation outputs well-typed self-adjusting programs and preserves the source program's input-output behavior, guaranteeing that translated programs respond correctly to all changes to their data. Using a cost semantics, we also prove that the translation preserves the asymptotic complexity of the source program.
  6. Implementing Implicit Self-Adjusting Computation

    ACM SIGPLAN Workshop on ML (ML'11)
    Yan Chen, Joshua Dunfield, Matthew A. Hammer, Umut A. Acar
    Computational problems that involve dynamic data have been an important subject of study in programming languages. Recent advances in self-adjusting computation have developed techniques that enable programs to respond automatically and efficiently to dynamic changes in their inputs. But these techniques have required an explicit programming style, where the programmer must use specific monadic types and primitives to identify, create and operate on data that can change over time. Our paper, “Implicit Self-Adjusting Computation for Purely Functional Programs” (ICFP '11), describes the theory underlying implicit self-adjusting computation, where the programmer need only annotate the (top-level) input types of the programs to be translated. A type-directed translation rewrites the (purely functional) source program into an explicitly self-adjusting target program. The subject of this talk is a prototype implementation (an extension of MLton) and experimental results that are competitive with explicitly self-adjusting handwritten code.
  7. Formal Verification for High-Assurance Behavioral Synthesis

    International Symposium on Automated Technology for Verification and Analysis (ATVA'09)
    Sandip Ray, Kecheng Hao, Yan Chen, Fei Xie, Jin Yang
    We present a framework for certifying hardware designs generated through behavioral synthesis, by using formal verification to certify the associated synthesis transformations. We show how to decompose this certification into two components, which can be handled respectively by the complementary verification techniques of theorem proving and model checking. The approach produces a certified reference flow, composed of transformations distilled from production synthesis tools but represented as transformations on graphs with an associated formal semantics. This tool-independent abstraction disentangles our framework from the inner workings of specific synthesis tools while permitting certification of hardware designs generated from a broad class of behavioral descriptions. We provide experimental results suggesting the scalability of our approach on practical designs.
  8. CEAL: A C-Based Language for Self-Adjusting Computation

    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'09)
    Matthew A. Hammer, Umut A. Acar, Yan Chen
    Self-adjusting computation offers a language-centric approach to writing programs that can automatically respond to modifications to their data (e.g., inputs). Except for several domain-specific implementations, however, all previous implementations of self-adjusting computation assume mostly functional, higher-order languages such as Standard ML. Prior to this work, it was not known if self-adjusting computation can be made to work with low-level, imperative languages such as C without placing undue burden on the programmer.

    We describe the design and implementation of CEAL: a C-based language for self-adjusting computation. The language is fully general and extends C with a small number of primitives to enable writing self-adjusting programs in a style similar to conventional C programs. We present efficient compilation techniques for translating CEAL programs into C that can be compiled with existing C compilers using primitives supplied by a run-time library for self-adjusting computation. We implement the proposed compiler and evaluate its effectiveness. Our experiments show that CEAL is effective in practice: compiled self-adjusting programs respond to small modifications to their data orders of magnitude faster than recomputing from scratch, while slowing down a from-scratch run by a moderate constant factor. Compared to previous work, we measure significant space and time improvements.
  9. Equivalence Checking for High-Level Synthesis Flow

    M.S. thesis, Portland State University. 2008
    Yan Chen
    High-level synthesis provides a promising solution for designing complicated circuits, but designers' lack of confidence in the correctness of synthesis tools prevents its wide acceptance in engineering practice. I develop an equivalence checking algorithm within a framework for certifying the high-level synthesis flow. I use a new formal structure, the clocked control data flow graph (CCDFG), to facilitate the equivalence checking process, and implement a prototype tool for verifying the equivalence between a CCDFG and the synthesized circuit at both the bit level and the word level. Experimental results demonstrate the effectiveness of the tool in quickly verifying, or finding bugs in, the high-level synthesis flow.
  10. Optimizing Automatic Abstraction Refinement for GSTE

    ACM/IEEE Design Automation Conference (DAC'08)
    Yan Chen, Fei Xie, Jin Yang
    In this paper, we present a suite of optimizations targeting automatic abstraction refinement for Generalized Symbolic Trajectory Evaluation (GSTE). We optimize both model refinement and spec refinement supported by AutoGSTE: a counterexample-guided refinement loop for GSTE. Experiments on a family of benchmark circuits have shown that our optimizations lead to major efficiency improvements in verification involving abstraction refinement.
  11. Automatic Abstraction Refinement for Generalized Symbolic Trajectory Evaluation

    International Conference on Formal Methods in Computer-Aided Design (FMCAD'07)
    Yan Chen, Yujing He, Fei Xie, Jin Yang
    In this paper, we present AutoGSTE, a comprehensive approach to automatic abstraction refinement for generalized symbolic trajectory evaluation (GSTE). This approach addresses imprecision of GSTE’s quaternary abstraction caused by under-constrained input circuit nodes, quaternary state set unions, and existentially quantified-out symbolic variables. It follows the counterexample-guided abstraction refinement framework and features an algorithm that analyzes counterexamples (symbolic error traces) generated by GSTE to identify causes of imprecision and two complementary algorithms that automate model refinement and specification refinement according to the causes identified. AutoGSTE completely eliminates false negatives due to imprecision of quaternary abstraction. Application of AutoGSTE to benchmark circuits from small to large size has demonstrated that it can quickly converge to an abstraction upon which GSTE can either verify or falsify an assertion graph efficiently.