Staged Generic Programming

As with all efforts to simplify programming so that systems of greater complexity can be more easily created, there are issues to consider.

5. Evaluation: Applicability

We conclude the technical development by evaluating the more carefully staged library with a variety of examples. This section shows the dramatic improvements to the generated code, which becomes simpler, shorter, and more obviously correct. The following section (6) shows the consequent positive impact on performance.

let rec gsize {DȷDATA} v = 1 ¸ sum (gmapQ gsize v)

Fig. 21a. gsize, unstaged

let gsize = gfixQ (fun self {DȷDATA} v

sta 1 <¸> fold_left (<¸>) zero (gmapQ self v))

Fig. 21b. gsize, staged

let rec gsize {D:DATA} v = 1 ¸ sum (gmapQ gsize v)

Fig. 21a. gsize, unstaged

let gsize = gfixQ (fun self {D:DATA} v

sta 1 <¸> fold_left (<¸>) zero (gmapQ self v))

Fig. 21b. gsize, staged

let gsize_list =
  let sl = ref dummy in let sp = ref dummy in
  let ss = ref dummy in let si = ref dummy in
  let sb = ref dummy in
  let () = si := fun y  1 ¸ sum [] in
  let () = sb := fun y  1 ¸ sum [] in
  let () = ss := fun y 
                   1 ¸ (sum (match y with
                            | Left x  [!sb x]
                            | Right y  [!si y])) in
  let () = sp ȷ= fun y 
                   1 ¸ (sum (let (x˛y) = y in [!sb x; !ss y])) in
  let () = sl ȷ= fun y 
                   1 ¸ (sum (match y with
                            | []  []
                            | h::t  [!sp h; !sl t])) in
  fun x  !sl x

Fig. 21c. gsize, naively staged: generated code

(instantiated at (int˙(int˛string) either) list)

let rec r l = match l with

let rec r l = match l with | [] 1

| hȷȷt 5 ¸ r t

Fig. 21d. gsize, carefully staged: generated code

(instantiated at (int˙(int˛string) either) list)

gshow. Figures 20a and 20b show unstaged and staged implementations of the gshow function, defined in terms of the SYB functions constructor and gmapQ, and an auxiliary function show_constructor. The changes needed for staging are minimal: besides the custom fixed point operator gfixQ, the implementations are identical.

Figures 20c and 20d show the code generated by the naive and carefully staged libraries when gshow is instantiated at argument type bool list.

Several of the optimisations developed in Section 4 improve the generated code. Recursion analysis and inlining have reduced the code to a single recursive function (Section 4.1), as is also the case in the examples that follow. Since gshow inspects v twice, case lifting (Section 4.4) avoids the multiple dynamic tests (match expressions). The strings built by gshow, like the lists built by listify, are partially-static (Section 4.2), enabling the static concatenation of strings in the generated code: the boolean constructor names appear in the same string literals as the infix cons constructor.

let gtypecount {XȷTYPEABLE} x = gcount (mkQ false (fun _ true))

Fig. 22a. gtypecount, unstaged

let gtypecount {XȷTYPEABLE} x = gcount (mkQ ff (fun _ tt))

Fig. 22b. gtypecount, staged

let rec crush u = function
 | []  u
 | h::t  crush (u ¸ h) t in
let tl = ref dummy in let te = ref dummy in(
let tp = ref dummy in let tb = ref dummy in
let ti = ref dummy in
let () = tb := fun y  crush (if false then 1 else 0) [] in
let () = ti := fun y  crush (if true then 1 else 0) [] in
let () = tp := fun p  crush (if false then 1 else 0)
                   (let (a˛b) = p in [!ti a; !tb b]) in
let () = te := fun y 
                crush (if false then 1 else 0)
                 (match y with Left x  [!ti x]
                             | Right y  [!tp y]) in
let () = tl := fun y 
                crush (if false then 1 else 0)
                 (match y with []  []
                                | h::t  [!te h; !tl t]) in
fun x  !tl x

Fig. 22c. gtypecount, naively staged: generated code

(instantiated at (int˛ int ˙ bool) either list)

let rec r l = match l with

| [] 0

| hȷȷt 1 ¸ r t

Fig. 22d. gtypecount, carefully staged: generated code

(instantiated at (int˛ int ˙ bool) either list)

gsize. Figures 21a and 21b show unstaged and staged implementations of the gsize function, which computes the size of a value as the successor of the sum of the sizes of its sub-values. Partially-static data changes the code slightly from the unstaged version: arithmetic with <¸> and zero replaces arithmetic with standard integers. The larger structure of the code is unchanged.

Figures 21c and 21d show the generated code for the naive and carefully staged libraries when gsize is instantiated with argument type (int˙(int˛string) either) list. Once again, various optimizations from the preceding pages significantly simplify the generated code .

Since int and string both have size 1, a combination of partially-static data (Section 4.2) and branch pruning (Section 4.5) reduces the computation of gsize at type (int˛string) either to «, and int˙((int˛string) either) to 5.

gtypecount. Figures 22a and 22b show staged and unstaged implementations of a function gtypecount in terms of another generic scheme, gcount. Since gtypecount is not defined recursively

let rec gdepth {Dȷ DATA} x = succ (maximum (gmapQ gdepth x))

Fig. 23a. gdepth, unstaged

let gdepth_ = gfixQ (fun self {DȷDATA} x

sta 1 <¸> fold_left max zero (gmapQ self x))

Fig. 23b. gdepth, staged

let dp = ref dummy in let di = ref dummy in
let tq = ref dummy in let dl = ref dummy in
let () = di := fun y  succ (maximum []) in
let () = dl := fun y  succ (maximum (match y with
                                     | []  []
                                     | hȷȷt  [!di h; !dl t])) in
let () = tq := fun y  succ (maximum (let (a˛b) )= y in
                                        [!dl a; !di b])) in
let () = dp := fun y  succ (maximum (let (a˛b) = y in
                                        [!tq a; !di b])) in
fun x  !dp x)

Fig. 23c. gdepth, naively staged: generated code

(instantiated at ((int list ˙ int) ˙ int) → int)

let rec r x = match x with
  | []  1
  | h::(t  1 ¸ max 1 (r t)
let f p = let (x˛y) = p in
          let (a˛b) = x in
            2 ¸ max 1 (r a)

Fig. 23d. gdepth, carefully staged: generated code

(instantiated at ((int list ˙ int) ˙ int) → int)

there is no need for custom fixed points, and so the two implementations are identical except for the use of partially-static booleans ff and tt in place of false and true.

Figures 22c and 22d show the output for gtypecount when it is instantiated to count the number of int values in a value of type (int˛ int ˙ bool) either list.

As with gsize, partially-static data (Section 4.2) and branch pruning (Section 4.5) have significantly simplified the body of the generated function (Figure 22d), so that it simply adds a known integer for each element in the list, having determined that a value of type (int˛ int ˙ bool) either always contains exactly one int.

gdepth. Finally, Figures 23a and 23b show unstaged and staged versions of the generic function gdepth, a generic function for computing the longest path from a value to one of its leaves. The implementations are similar except for the use of a custom fixed point gfixQ and partially static data (<¸>, zero, max) in the staged version.

The implementation of gdepth is similar to gsize, except that the results of traversing the subvalues are combined with max, not with addition; however, both addition and max are needed, and the partially static data form a tropical semiring.

Figures 23c and 23d show the code generated when gdepth is instantiated with argument type ((int list ˙ int) ˙ int).

As the figures show, the advanced staging techniques of Section 4 have significantly improved the output. Recursion analysis has determined that the generated function r for traversing int list values should be recursive, and that the code that follows should be non-recursive. Examining the code for r reveals a remaining opportunity for improvement: since r always returns at least 1, the call to max in the cons branch is unnecessary. However, the simple fixed-point analysis in Section 4.6 is not sufficiently powerful enough to detect the redundancy.