[High Integrity Compilation] Susan Stepney.
High Integrity Compilation : a case study .

Prentice-Hall, 1993.

3. Using Prolog

3.1 Modelling Z in Prolog

The specification of the operational semantics, the translation from source to target language, specifies the compiler. A Prolog version of the specification directly gives a compiler. If we can find a clear translation from Z to Prolog, this strengthens one of the links in the chain of developing a high integrity compiler: the implementation follows directly from the formal specification.

Later sections in this chapter illustrate a direct translation of the specification of Turandot's commands into Prolog, and explains why this approach needs to be extended by a structuring mechanism: DCTGs. But first, let's see how Z sets are modelled in Prolog.

Z's sets are modelled as Prolog lists. So, for example, the Z set {1,2,3} can be represented in Prolog as the list [1,2,3] . Z functions are also sets (of pairs); they too can be modelled as Prolog lists (of lists). So the state function { x |--> 3, y |--> 9} can be represented in Prolog as [ [x,3],[y,9] ] .

Various predicates to manipulate such lists representing sets, such as union and intersection , are supplied in many Prolog libraries. These have been written to ensure that the order inherent in a Prolog list is not significant when it is used to model a set. Other predicates to perform Z-specific operations must be specially written. For example, a definition to update a state with a (name, value) pair can be written as

   updatestate(Name, Value, Pre, Post) :-
      setminus(Pre, [Name, \_], Mid ),
      union(Mid, [ [Name, Value] ], Post).

(In Prolog, names beginning with capital letters are variables.) Here setminus(A,B,C) is a previously-written clause that obeys A \ { B } = C . If the clause is executed with A and B given, C will be instantiated to the set A with element B removed. Mid is instantiated to Pre with pairs whose first element is Name and second is anything (indicated by the underscore) removed.

Similarly, union(A,B,C) obeys A u B = C . If the clause is executed with A and B given, C will be instantiated to the set union of A and B . Post is instantiated with the union of Mid and the set containing the single pair [Name,Value] (that the second parameter is a set, not an element, explains the double pair of square brackets).

Prolog has the powerful feature that, if a different selection of parameters is provided, the remaining ones will be instantiated. So if union is provided with B and C , it will instantiate A so that A u B = C is true. Powerful though this feature is, it is not exploited in the following definitions, where the parameters are always supplied in the 'forward' order corresponding to program execution.

3.2 Writing semantics in Prolog---a first attempt

With a suitable set of library definitions like those above, there is a straightforward translation from the semantics definitions in Z to ones in Prolog. (The rules for this translation in the case of Tosca are explained in Chapter 14.)

3.2.1 Dynamic semantics---an interpreter

Consider the Turandot example from Chapter 2 . The specification of the dynamic semantics of Turandot's commands ( section 2.3.4 ) can be written in Prolog as follows.

For commands, the Prolog form is

   command(gamma, PreState, PostState) :-
      < expression relating PreState and PostState >

with one clause for each sort of command gamma in the abstract syntax definition. PreState is the state before the command, PostState is the state that results from executing the command.

skip is the easiest to translate, because the state is not changed:

   command(skip, PreState, PostState) :-
      PreState = PostState.

This clause declares that, for a skip command, the PreState and PostState have the same value, whatever that value is. If the clause is being executed, as part of a Turandot interpreter, a value would be supplied for PreState , and PostState would become instantiated with the same value.

   command(assign(Name,Expr), PreState, PostState) :-
      expr(Expr, PreState, Value),
      updatestate(Name, Value, PreState, PostState).

This clause declares that, for an assign command, Value is the value of Expr in the PreState , and that the PreState updated with Name and Value is the PostState . If the clause was being executed, the values for Name , Expr and PreState would be supplied, and used to instantiate Value and PostState appropriately.

The choice command becomes

   command(choice(Expr,Then,Else), PreState, PostState) :-
      expr(Expr, PreState, Value),
      (  Value = 1,
         command(Then, PreState, PostState)
      ;
         Value = 0,
         command(Else, PreState, PostState)
      ).

Value is the value of Expr in the PreState . Depending on the value of Value , either the PostState is found using the Then parameter, or (as indicated by the Prolog ' ; ') by using the Else parameter.

   command(compose(Cmd1,Cmd2), PreState, PostState) :-
      command(Cmd1, PreState, MidState),
      command(Cmd2, MidState, PostState).

The MidState is found from the PreState using Cmd1 , and is used along with Cmd2 to produce the PostState .

If these clauses are read declaratively, they provide a Prolog specification of the dynamic semantics. If they are read operationally (or executed), they directly provide an interpreter for the language. To illustrate this dichotomy, the English explanation given above changes tone from declarative to operational as the example progresses.

3.2.2 Static semantics---a use checker

Turandot's static semantics can be written in Prolog in a very similar manner. The s-meaning of its commands ( section 2.4 ) can be written as:

   scommand(skip, PreSState, PostSState) :-
      PreSState = PostSState.
   scommand(assign(Name,Expr), PreSState, PostSState) :-
      union(PreSState, [Name], PostSState).
   scommand(choice(Expr,Then,Else), PreSState, PostSState) :-
      scommand(Then, PreSState, SStateThen),
      scommand(Else, PreSState, SStateElse),
      intersection(SStateThen, SStateElse, PostSState).
   scommand(compose(Cmd1,Cmd2), PreSState, PostSState) :-
      scommand(Cmd1, PreSState, MidSState),
      scommand(Cmd2, MidSState, PostSState).

If these clauses are read declaratively, they provide a Prolog specification of the static semantics. If they are read operationally (or executed), they directly provide a static checker for the language.

3.2.3 Code templates---a compiler

The specification of Turandot's compiler can be written in Prolog in an analogous manner. Rather than defining a PostState , however, the Prolog translation of the commands ( section 2.5 ) defines a list of target language instructions.

   compile(skip, [ ]).
   compile(assign(Name,Expr), [InstrList, store(Name)]) :-
      compile(Expr, InstrList).
   compile(choice(Expr,Then,Else),
         [InstrExpr, jump(L1), InstrThen,
         goto(L2), label(L1), InstrElse, label(L2)]
   ) :-
      compile(Expr, InstrExpr),
      compile(Then, InstrThen),
      compile(Else, InstrElse).
   compile(compose(Cmd1,Cmd2), [InstrList1, InstrList2]) :-
      compile(Cmd1, InstrList1),
      compile(Cmd2, InstrList2).

If these clauses are read declaratively, they provide a Prolog specification of the compiler translation. If they are read operationally (or executed), they directly provide the compiler.

3.3 Definite Clause Translation Grammars

Although this sort of translation into Prolog is quite straightforward for such a small example, it soon becomes unwieldy. For example, the syntax and semantics definitions of each construct are closely intertwined, and the syntax has to be repeated in each of the different semantics. A better structuring mechanism is needed for a full language with multiple semantics.

Many standard Prologs have a mechanism called Definite Clause Grammars (DCGs) built into them, to allow expressions such as

   sentence --> noun_phrase, verb_phrase.
   noun_phrase --> determiner, noun.

to be manipulated. These are automatically converted, by the Prolog system, into their standard Prolog equivalents with the necessary extra arguments:

   sentence(S0,S) :- noun_phrase(S0,S1), verb_phrase(S1,S).
   noun_phrase(S0,S) :- determiner(S0,S1), noun(S1,S).

This approach is suitable for defining syntax. For defining semantics as well, even more arguments are needed in the standard Prolog form. These can be provided automatically by using Definite Clause Translation Grammars (DCTGs).

DCTGs provide a general mechanism for grammar computations. The process of successfully parsing a particular input program causes a parse tree } to be built. The semantic rules are attached to the non-terminal nodes of this parse tree, and used to define the semantic properties of a node in terms of the semantic properties of its subtrees.

Although not directly supported by the Prolog system in the same way as DCGs, Prolog operators and predicates can be defined to support the DCTG approach. These translate a program written in the DCTG form into an equivalent program in standard Prolog. As with the DCG form, this translation provides the various clauses with extra arguments, which are used to define how to build the annotated parse tree.

As a simple example, consider a possible DCTG definition for adding two expressions to produce an expression:

   expr ::= expr^^Tree1, tPLUS, expr^^Tree2
   <:>
   value(V) ::-
      Tree1^^value(V1),
      Tree2^^value(V2),
      V is V1 + V2.

The first part of the term (before the <:> ) defines the concrete syntax. In this example it says that an expression can be a subexpression, a 'plus' token and another subexpression. The subexpressions are labelled with their parse trees. The second part of the term defines the semantics of the composite expression in terms of its subexpressions. Here it says the value of the composite expression is the arithmetic sum of the values of the two subexpressions.

The translation into plain Prolog looks like:

   expr( node( expr, [Tree1, Tree2],
         [value(V) ::-
            Tree1^^value(V1),
            Tree2^^value(V2),
            V is V1+V2]
         ),
      S0, S ) :-
         expr(Tree1, S0, S1),
         expr(Tree2, S1, S).

At first sight, this technique may look more complicated than using plain Prolog. It does, however, have the advantage of cleanly separating the syntax and semantics. Another important advantage of this approach is that a DCTG can be used to support multiple sets of different semantics attached to each node, as extra elements in the node list. So Turandot's choice command can be written using the DCTG formalism, including the code template (compiler), dynamic semantics (interpreter), and static semantics (static checker), as

   command ::=
      tIF, expr^^E,
      tTHEN, command^^Cthen,
      tELSE, command^^Celse
   <:>
      (static(PreSState, PostSState) ::-
         Cthen^^static(PreSState, SState1),
         Celse^^static(PreSState, SState2),
         intersection(SState1, SState2, PostSState)
      ),
      (dynamic(PreState, PostState) ::-
         E^^dynamic(PreState, Value),
      (  Value = 1,
            Cthen^^dynamic(PreState, PostState)
         ;
            Value = 0,
            Celse^^dynamic(PreState, PostState)
         )
      ),
      (compile( [InstrExpr, jump(L1), InstrThen,
         goto(L2), label(L1), InstrElse, label(L2)] )
      ::-
         E^^compile(InstrExpr),
         Cthen^^compile(InstrThen),
         Celse^^compile(InstrElse)
      ).

Notice that the non-standard static semantics static is attached to the DCTG nodes in the same manner as the dynamic semantics dynamic and the operational semantics compile . Other non-standard semantics for various other types of analyses can be incrementally added to each node in a similar, consistent manner. When executed, each provides a further static analysis checker.

Notice how, with the DCTG approach, the various semantics are in similar forms, and occur textually close together in the Prolog code. This is of great advantage in the process of demonstrating the correctness of the translation from the mathematical specification to the executable compiler.

Chapter 14 gives the DCTG form of the definition of Tosca's various static and dynamic semantics, and its operational semantics in terms of Aida.

3.4 Further reading

For a good introduction to Prolog, see, for example [Sterling and Shapiro 1986] and [Clocksin and Mellish 1987] . [ Clocksin and Mellish 1987 , Chapter 9] describes DCGs.

The style used in section 3.2 follows [Warren 1980] .

DCTGs are described in [ Abramson and Dahl 1989 , Chapter 9], and the algorithm to convert a DCTG program to Prolog is listed in [ Abramson and Dahl 1989 , Appendix II.3]. The DCTG approach is the logic programming equivalent of Attribute Grammars [Knuth 1968] . The DCTG formalism does not distinguish between inherited and synthesized attributes, however, because Prolog's unification mechanism makes such a distinction largely unnecessary.