Staged Generic Programming

As with all efforts to simplify programming so that systems of greater complexity can be more easily created, there are issues to consider.

1. Introduction

Generic programming. The promise of generic programming is the elimination of the tedious boilerplate code used to traverse complex data structures. For example, suppose that you want to search a value v for every value of a certain type satisfying some predicate (e.g. even int values). You might start by writing code to traverse v, examining its constructors and iterating over their fields. Alternatively, you might use a generic programming library such as Scrap Your Boilerplate, and write code like the following:

listify evenp v

This snippet lists all even integers within v, whether v is a list, tree, pair, or some more complex structure. Evidently, generic programming can significantly simplify certain programming tasks. However, this simplification often comes with a severe performance cost. For example, with the SYB implementation of listify the snippet above typically executes around 20 times slower than an equivalent hand-written traversal specialised to a particular type, even if v is dense in integers. If the integers are sparsely distributed, performance can be significantly worse.

Multi-stage programming. The poor performance of functions like listify is a consequence of the same genericity that makes them appealing. How might we keep the genericity but eliminate the cost?
One approach to eliminating abstraction costs is multi-stage programming. Multi-stage programs improve efficiency using information that becomes available after a function is defined but before it is invoked. For example, the author of listify cannot possibly know the eventual types of its arguments, and so ought to make listify as general as possible. However, the caller of listify typically knows the types of the arguments before the time when the function is actually called. This type information can be used to specialise listify at the call site, taking advantage of type information to optimize the function implementation. In other words, multi-stage programming transforms an inefficient generic function such as listify into an optimising code generator. Since the overheads of SYB are so large, even naive staging that does no more than eliminate typepassing polymorphism can achieve dramatic performance increases. As we shall see, application of more sophisticated staging techniques can achieve more substantial improvements – eliminating branches, projections and function calls, statically merging values, and restructuring code to avoid repeated computation. For example, here is the code generated by the staged listify when v has type (int ˙ string list) list – that is, when v is an association list whose keys are integers and whose values are lists of strings:

let rec f l match l with


| hd     tl let (x˛y)  hd  in

if evenp x then x f tl else f tl

The code corresponds closely to what one might write by hand: a simple recursive traversal1 that tests each key x in turn, consing the key onto the result if it satisfies evenp. The listify code generator has determined that the value y can be ignored, since a string list cannot contain an int.

1.1   Outline and Contributions

The central contribution of this work is the principled application of staging techniques to transform a generic programming library into a typed optimising code generator. Several of the techniques used have not previously appeared in the literature but, since they deal with the core elements of functional programming – algebraic data, higher-order functions, recursion, etc. – we anticipate that they will be applicable to a wide class of programs.

The next two sections recapitulate a naive staging of SYB, starting with a port of SYB to OCaml (Section 2), which is then staged to turn generic functions into code generators, largely eliminating generic dispatch overhead (Section 3). The remainder of the paper presents new contributions:

  • Building on the staging transformation of Section 3, we show how to further improve perfor-mance with an armoury of principled staging techniques: recursion analysis, let‚rec insertion and inlining (Section 4.1), partially-static data (Section 4.2), reification and reflection (Sec-tion 4.3), case lifting (Section 4.4), branch pruning (Section 4.5), and fixed-point elimination (Section 4.6).

Several of the techniques in Section 4 have roots in type-directed partial evaluation, normal-ization by evaluation, and other venerable approaches to program transformation. We show here that these techniques find happy expression in a multi-stage programming language with delimited control and advanced forms of polymorphism.

  • The simple listify example from the introduction guides the development, as we consider techniques that improve the generated code for various types of data. However, the techniques developed for listify also apply directly to other generic functions (Section 5).
  • An appealing property of quotation-based multistage programming is that optimizations are typically predictable – constructs that do not appear in quotations in the generated program are certain to be absent from the generated program. For this reason, a qualitative evaluation that focuses on the nature of the generated code may be more informative than a quantitative assessment of program running time. Nevertheless, we include a quantitative evaluation in Section 6 to complement the analysis of code generation, and show that the systematic staging of SYB generates code that runs 20-30× faster than the unstaged version, and that equals or outperforms handwritten code.
  • Finally, this work serves to demonstrate how staged programming languages support extending libraries with the ability to optimize their own code at call sites. The staged SYB dramatically outperforms the original, but requires no compiler optimizations, no external tools, and no additional sophistication on the part of the user of the library.

Source: Jeremy Yallop,
Creative Commons License This work is licensed under a Creative Commons Attribution 4.0 License.