1. DIRECTOR-STRING COMBINATORS I have explored one of my "avenues for future work", specifically the use of Stoye's director string combinators [4] instead of Turner's combinators [1]. I outline the idea below, and then present new runtimes for HHI (the Haskell Haskell Interpreter) using Stoye's combinators. Director strings for combinatory logic were invented independently by Dijkstra [2] and Kennaway [3]. The main idea is to annotate the interior nodes of an applicative expression with strings of combinators, rather than introduce combinators into the expression by additional applicative structure -- i.e. new constants and application nodes. The result is (perhaps arguably) a clearer and more systematic compilation scheme from lambda to combinator expressions. In particular, the director string abstraction algorithm does not attempt to abstract a variable from an expression in which that variable does not occur. Turner's algorithm uses a seperate optimiser to achieve a similar effect. Once an expression has been annotated with director strings, it is straightforward to convert it to a Turner-combinator expression (though not necessarily the same expression that Turner's algorithm would produce). However, Stoye [4] spots a more efficient alternative. Instead of mapping *each combinator* in the director string onto one of Turner's combinators, the *whole string* is mapped to one big combinator -- a so-called D combinator. D combinators essentially merge various combinations of Turner combinators, achieving more reduction in a single step. But there is no bound on the lengths of director strings, hence there are infinitely many D combinators. Stoye's solution is to split director strings into length-bounded blocks, and to introduce a set of D' combinators capabable of gluing D and D' together, just as S', B' and C' glue Turner combinators together. By simply altering the block bound, the size of the combinator set and the granularity of the reduction step can be varied. Stoye's combinators have been implemented in HHI with negligable invention on my part. Still, I have two observations. First, the compilation rule abstracting v from e v is e if v is not in e in line 8 of Dijkstra's algorithm saves quite a few reductions during evaluation. Dijkstra goes on to simplify his algorithm, writing "line 8 in the original version is no more than saving symbols at the price of symmetry, and that is at this level of discussion a doubtful procedure." (!) My second (obvious) observation is that many leaves of a combinator expression are I constants. The identity function may seem harmless enough, but trees have many leaves. Therefore, for every D combinator, I propose three additional combinators to cover the cases where the left subtree of the application is I, the right subtree is I, and both subtrees are I. This also gives a worthwhile speed-up. Now for the comparison. (Size of combinator set in brackets.) PROGRAM TURNER (8) STOYE (1701) Fib 1.25s 1.13s Queens 4.20s 2.28s Sudoku 1.47s 0.94s PermSort 4.23s 2.13s SumPuz 3.82s 2.84s OrdList 7.10s 5.14s Simplifier 7.00s 4.68s (See the HHI darcs repo for implementation details.) Note: for small numbers of combinators (<20), Turner's algorithm is superior as it exploits a number of algebraic laws that reduce expression size. A possible extension suggested by Stoye is to introduce director-string combinators for n-ary applications, for some n > 2. For example, instead of representing \x y -> a (b y) c (d x) as D{01}{10} (D{10} (D{01} a b) c) d (where {01} means pass right and {10} pass left.) it could be written D{0001}{0100} a b c d (where {0001} means pass to 4th child and {0100} pass to 2nd child.) 2. COMBINATORS FOR JANSEN-ENCODED DATA CONSTRUCTORS To simplify functional language evaluation, Jansen proposes to encode data constructors as functions, and case expressions as function applications [5]. For example, the constructors of the algebraic data type for lists, data List a = Nil | Cons a (List a) would be encoded Nil n c = n Cons x xs n c = c x xs And the standard definition of map, map f Nil = Nil map f (Cons x xs) = Cons (f x) (map f xs) would be rewritten map f v = v Nil (\x xs -> Cons (f x) (map f xs)) which is no more than lambda calculus with constants. The Haskell Haskell Interpreter (HHI) is supposedly a simple yet efficient evaluator for such lambda expressions, based on fixed-set combinator reduction. HHI inherits the host Haskell system's support for graph reduction by implementing the fixed combinators as normal Haskell functions. Unfortunatly, Turner's algorithm [2] for compiling to combinator expressions results in long-winded formulae when applied to Jansen-encoded data constructors: Nil = K Cons = B* (B K) C (C I) Even worse, for large algebraic data types, such as Nilsson and Nilsson's While syntax, data Stm = Skip | Ass Var Aexp | Comp Stm Stm | If Bexp Stm Stm | While Bexp Stm we get Skip = (B K) (((B* K) K) K) Ass = (B (B K)) (((B* ((B* (B K)) ((B* K) K))) C) (C I)) Comp = (B (B K)) (((B* ((B* K) ((B* K) K))) C) (C I)) If = (B (B ((B* K) K))) (((B* ((B* ((B* K) (B K))) C)) C) (C I)) While = (B ((B* K) K)) (((B* ((B* K) K)) C) (C I)) Ouch! Using some new combinators described below, we instead get Skip = K4 I Ass = G2 (X1 . K3) Comp = G2 (X2 . K2) If = G3 (X3 . K1) While = G2 K4 where . is function composition (Turner's B combinator) and Gn k v1..vn = k (\g -> g v1..vn) Xn k v1..vn = k Kn k x v1..vn = k x But n is unbounded, so each of these represent an infinite number of combinators. G' is required as an escape mechanism, to make a finite number of G sufficient. G'n k f v1..vn = k (\g -> f g v1..vn) (Note that Gn k = G'n k I) In other words, a big G can be made using a smaller G and many G'. G6 = G2 . G'2 . G'2 (More generally, n=m+o ==> G'n k = G'm (G'o k)) Likewise, a big X or K can be composed from smaller ones: X6 = X2 . X2 . X2 K6 = K2 . K2 . K2 The number of new combinators required is 3n+1. By increasing n, we obtain more but coarser combinators. Setting n = 3 gives a healthy performance boost at the cost of just 10 combinators. Now to motivate the new combinators. A data constructor of arity a which is the nth constructor in a data type containing m constructors is (Jansen) encoded as Canm v1..va w1..1m = wn v1..va The purpose of the G combinator is to slide a given expression (say g) to the front of a set of arguments v1..va, i.e. \g -> g v1..va. If a=2 for example, then G2 is used to group two arguments: G2 k v1 v2 = k (\g -> g v1 v2) The sliding function is passed to a continuation k, which must select the nth constructor from w1..wm. This can be done by ignoring the first (n-1) of ws X2 k w1 w2 = k (supposing n=3) and then picking the 1st argument of (1+m-n) remaining arguments K2 k x w1 w2 = k x (supposing m=5). Thus, C335 (the Comp constructor above) is encoded G2 (X2 . K2) (If case alternatives were stored in an array, a simple array lookup could be used in place of X and K.) Now for the comparison. The primed algorithm's have been extended with the combinators proposed here. (Size of combinator set in brackets.) PROGRAM TURNER (8) TURNER' (21) STOYE (1701) STOYE' (1714) Fib 1.25s 1.18s 1.13s 1.13s Queens 4.20s 3.96s 2.28s 2.16s Sudoku 1.47s 1.10s 0.94s 0.85s PermSort 4.23s 3.07s 2.13s 1.97s SumPuz 3.82s 3.09s 2.84s 2.63s OrdList 7.10s 5.57s 5.14s 4.55s Simplifier 7.00s 5.51s 4.68s 4.10s REFERENCES [1] David Turner. A new implementation technique for applicative languages. Software: Practice and Experience Volume 9, Issue 1, Pages 31-49. [2] Edsger W. Dijkstra. A mild variant of Combinatory Logic. EWD735, May 1980. [3] Richard Kennaway and Ronan Sleep. Director strings as combinators. ACM Transactions on Programming Languages and Systems, volume 10, number 4, 1988. [4] William Stoye. The Implementation of Functional Languages using Custom Hardware. PhD Thesis, University of Cambridge, 1985. [5] Jan Martin Jansen, Pieter Koopman and Rinus Plasmeijer. Efficient Interpretation by Transforming Data Types and Patterns to Functions. Trends in Functional Programming, Volume 7, Intellect, 2007.