Stepney: High Integrity Compilation

1. Introduction

1.1 The problem

If the failure of a piece of software could result in injury or loss of life, significant damage to equipment, severe financial loss, or environmental damage, then that software is safety critical . Because the failure of such software is potentially so harmful, it is essential to minimize the chance of its failure---it becomes a high integrity application.

It has been argued that the only 'safe' way to write high integrity applications is by using assembly language, because only assembly language is close enough to the hardware that one can be sure about what is supposed to happen during program execution, and because only assemblers (that translate assembly language to machine code) are simple enough to be validated, and hence trusted.

There are two major reasons given for distrusting high level languages, which are further removed from the hardware than assembly languages. Firstly, high level languages are complex and poorly defined. Their involved and ambiguous semantics makes it impossible to know what a program written in such a language means, to know what it is supposed to do when it is executed. So it is impossible to know, looking at a program written in such a language, how it should be translated to execute on the 'real' machine; in some cases the only way to find out what a piece of code does is to 'run it and see'. Secondly, a compiler is itself necessarily a large complex piece of software, and will have bugs. So, even if it were possible to have some idea of the meaning of high level language programs, it is impossible to have any confidence that a compiler correctly implements this meaning. The problem is not restricted to just high level languages; some modern chips have such large complex instruction sets that their assembly languages and assemblers are not trusted for high integrity applications.

There is more than a grain of truth in this argument: some programming languages do have notoriously baroque semantics, and their respective compilers are large complex pieces of software, with all the corresponding potential for errors that implies. But as high integrity applications grow larger and more complex, use of assembly language is becoming infeasible; the large assembly language programs can exhibit as many bugs as the prohibition of high level languages sought to avoid. High level languages, with all their software engineering advantages, are becoming essential. How can these conflicting requirements be reconciled?

As a first step along the way (at least) the following conditions need to be met:

The high level source language must have a target-independent meaning; it must be possible to deduce the logical behaviour of any particular program, independent of its execution on a particular target machine.
This implies, among other things, that the source language must have a mathematically defined semantics . Otherwise it is impossible to deduce even what should be the effect of executing a particular program.
The target machine language must also have a mathematically defined semantics. Otherwise it is impossible to prove that the compilation translation is correct.
The compiler from the source to the target language must be correct. Hence it must be derived from the semantics of both the source language and the target machine's language.
To permit validation , the compiler for a high integrity language must be seen to be correct . It must be written clearly, and must be clearly related to the semantics.
The target code produced by the compiler must be clear, and easily related to the source code. This gives the visibility to the compilation process that is a requirement for high integrity applications.
The semantics for both the source and target languages must be made available for peer review and criticism.

The last three points are important for high integrity applications, in order to conform with the much more stringent validation and visibility requirements these have.

The requirement for a visible link between the source code and target code is most easily met by imperative-style source languages, because their state-based models map well onto most underlying hardware. Higher level languages, such as declarative languages, are much further removed from the machine, and it is correspondingly harder to demonstrate the link. Hence the source language described later in this book is an imperative one.

In addition to the above requirements, the equally thorny problems of proving correct the high integrity application being written in the high level language, and of showing that the physical hardware correctly implements the meaning of its assembly language, must be addressed. These are beyond the scope of this book.

One approach for constructing a high assurance compiler from the mathematical specifications of the semantics of the source and target languages is described in this book. This is done by defining Tosca, a small, but non-trivial, high level language, and Aida, a typical assembly language, then constructing a compiler using their definitions. Note that this is not a proposal for a new high integrity language, nor is it a claim that this approach is the way to write general-purpose compilers. Rather, it demonstrates how the mathematical specification of a given language can be used in high integrity compiler development.

1.2 Semantics

In order to write a correct compiler, it is necessary to have a mathematically defined semantics of both the source and the target languages. There are several ways of defining the semantics of programming languages, each appropriate for different purposes. Not every form is equally suitable for the purpose of defining a language in such a way that a high integrity compiler can be clearly derived from it.

1.2.1 Axiomatic semantics---too abstract

Axiomatic semantics defines a language by providing assertions and inference rules for reasoning about programs.

Assertions can be expressed as 'Hoare triples':

{ P } S { Q }

where S is a program fragment, and P and Q are predicates over program states. The triple asserts that if the pre-condition P holds before the execution of program fragment S , and if S terminates, then the post-condition Q holds afterwards. Inference rules look like

H ₁ , H ₂ , ..., H _n
-----------------
H

This states that if H ₁ , H ₂ , ..., H _n are true, then H is also true. So, when reasoning about a program, in order to prove H , it is sufficient to prove H ₁ , H ₂ , ..., H _n . This defines the meaning of H .

For example, the inference rule for an if construct might be given as something such as

{ P and e } ct { Q } , { P and not e } cf { Q }
------------------------------------------------
{ P } if e then ct else cf { Q }

This rule states that, when reasoning about an if construct, in order to prove the post-condition Q is established, it is sufficient to prove that it is established by ct whenever e holds, and that it is established by cf whenever e does not hold (in each case, assuming the common pre-condition P holds, too).

Such a style of definition is appropriate for showing that two programs have the same meaning, for example, that they establish the same post-condition. This is useful, for example, for reasoning about programs, such as defining meaning-preserving program transformations for the purpose of correct optimization. However, it is an indirect form of definition, and is not so useful for defining a language in a form suitable for direct implementation in a high integrity compiler. It is rather too abstract for our purposes.

1.2.2 Operational semantics---too concrete

Operational semantics defines a language in terms of the operation of a (possibly abstract) machine executing the program, and so is mostly concerned with implementations. For example, it might define the meaning of an if construct such as

if e then ct else cf

in terms of labels and jumps

      < e >
      JUMP label1
      < ct >
      GOTO label2
   label1:
      < cf >
   label2:

where the program fragments in angle brackets should be replaced with their operational semantics definitions, recursively. JUMP transfers control to the relevant label if the previous expression evaluates to false ; GOTO is an unconditional jump.

Such a definition is useful, for example, when writing a compiler for that particular machine. However, it is not so good for defining what that meaning actually is, because it is defined only in terms of another programming language, which itself needs a definition. Also, a separate operational semantics is needed for each target machine. There then arises the problem of consistency: how can you be sure all the various semantics are defining the same high level language?

So, although such a definition is ultimately required for defining a compiler, it is too concrete to be an appropriate starting point: a machine-independent definition of the language.

1.2.3 Denotational semantics---just right

Denotational semantics defines a language by mapping it to mathematics. Such a mathematical definition should be better-understood and better-defined. So the language is defined by building a mathematical model that defines 'meaning functions'. Each meaning function maps a type of language construct to a mathematical value. The mathematical values used can have simple types like numbers, and also more structured types like state transition functions. The language construct denotes this mathematical value; the value is the 'meaning' of the construct.

The mathematical model suitable for such an imperative language is one of states and state transitions. For example, the denotation of an if construct would be defined in terms of the mathematical meanings of its component constructs:

D [| if e then ct else cf |] p s =
D [| ct |] p s , if D [| e |] p s = true
D [| cf |] p s , if D [| e |] p s = false

Here p is the environment, a mapping of program identifiers to abstract locations, and s is the state, a mapping from abstract locations to mathematical values. D [| |] is a function that maps program language commands, in an environment and state, to a new state. Hence, it maps program syntax to mathematical values. (The full notation is defined in later chapters.)

The denotational approach of modelling abstract meanings of programs, independent of any machine implementation, satisfies the requirement that a program must have the same logical behaviour no matter what hardware it runs on. It is also at just the right level of abstraction to the starting point for specifying a high integrity compiler.

In later chapters, the denotational semantics of two example languages, the source language Tosca and the target language Aida, are specified.

1.2.4 Non-standard semantics

The denotational semantics described above is a 'standard' semantics: it describes the standard meaning usually associated with a program, that of program execution. It is also called 'dynamic' semantics, because it defines the dynamic changes of state that occur as a program executes.

But it is up to the language designer what the meaning of the language is chosen to be; other 'non-standard' meanings can equally well be defined. Each different meaning thus specified provides a different semantics for a language; the dynamic semantics is just one possible semantics, with a meaning that determines the output values when a program is executed. The best known non-standard meaning defines a type, rather than a value, to be associated with each construct. This 'type-semantics' can be used to determine whether a program type checks. Other non-standard meanings can be defined and used to determined other well-formedness conditions on a program, for example, that every variable is initialized to some value before it is used in an expression. These sort of semantics are called 'static semantics', because they define 'static checks', checks that can be made without executing the program, for example, at compile time.

This is described in more detail in later chapters. In particular, three static semantics for Tosca are specified---declaration-before-use, type checking, and initialization-before-use---and used to implement three static checkers.

1.2.5 Size of task

Before deciding to go the fully formal route of specifying source and target languages, and deriving a correct compiler, we need to know that the task is feasible. How big is the task of mathematically specifying a programming language?

Consider Modula-2, whose denotational semantics have been specified in VDM (more accurately, in the functional subset of Meta-IV, which is essentially a programming language); its specification runs to hundreds of pages. But remember, Modula-2 was specified retrospectively, and its definition was required to keep close to the existing semantics as defined operationally by its various compilers. Features that were difficult or 'messy' to specify had to be included, even where a different approach could have led to a simpler, cleaner, more understandable specification.

For high integrity applications, it is imperative that every potential for misunderstanding and error be reduced to a minimum. So the application language needs to be designed with a coherent and intelligible semantics. It can be argued that if a language feature is difficult to specify cleanly, it is difficult to understand, and hence should not be included in a language used for writing high integrity applications. Note that the converse does not apply: that a particular feature is easy to specify is not sufficient reason for including it in a high integrity language. Although the design of such a language could start from scratch, it is probably more practical to subset an existing language, slicing away those areas of ambiguity and confusion, not fossilizing them in the final definition. Such a language, although probably of a similar size to Modula-2 in terms of syntax , would be much smaller and simpler semantically .

Tosca itself is smaller than a real high level language, but even so still has quite a substantial specification, including as it does one dynamic and three static semantics.

The target language can be very small. Although many microprocessors have elaborate instruction sets, it is not necessary to specify the semantics of every one of these instructions. Only the subset of the instructions needed to implement the high level language need be specified. No static semantics need be defined either; all the static checks can be done in the high level language, and a correct translation introduces no new errors, so only the dynamic meaning need be considered. Even so, the specification tends to be more complex: the instruction set contains jump instructions, and the semantics of jumps are difficult to specify. Aida is typical of a pruned instruction set language.

1.3 From semantics to a compiler

The compiler's job is to translate each high level construct, such as

if < test_expr > then < then_cmd > else < else_cmd >

into a corresponding target language template, such as

      <test_expr translation>
      JUMP label1
      <then_cmd translation>
      GOTO label2
   label1:
      <else_cmd translation>
   label2:

where the program fragments in angle brackets are similarly translated, recursively. Specifying the compiler from a source to a target language consists of defining an operational semantics, in the form of a target language template for each source language construct. An obvious question arises: how can one have any confidence that these are the correct target language templates? For example, how can one have any confidence that the if statement above and its compiled translation have the same meaning?

It is possible to answer this question. Given the denotational semantics of the target language, it is possible to calculate the meaning of the template in the target language. This can be compared with the meaning of the corresponding high level fragment, and be shown to be the same (see Chapter 13). Proving that the correct templates have been specified reduces to calculating the meaning of every template, and showing it to be the same as the meaning of the source language construct. Notice that this approach provides a structuring mechanism for the proof process. Arguments are advanced on a construct-by-construct basis, using structural induction over the source language. The complete proof is constructed by working through the syntax tree of the language, until all constructs have a suitable argument supporting them; this then completes the argument in support of the compiler specification as a whole. Hence the complete proof is composed, using a divide-and-conquer strategy, from a number of smaller, independent subproofs. This structure makes the total proof much more tractable, and more understandable, than would a single monolithic approach. Structure is indispensable when doing a large proof.

1.4 Executable specification language---Prolog

There are various notations available for writing denotational semantics, including the conventional mathematical notation, convenient for reasoning about the correctness of the compiler templates.

If the denotational semantics specification is translated into an executable language, then executing it provides an interpreter for the specified language. This interpreter can, if desired, be used as a validation tool for checking that the formal mathematical specification defines a language with the appropriate informally expected behaviour.

It is possible, given a denotational semantics in some abstract notation, to translate it into an imperative language, such as Pascal, in order to produce an interpreter. However, because the style of such a language is so far removed from the style of the denotational specification, the mapping from the specification to the resulting implementation is very complex. Such a complex mapping process is in itself potentially error-prone, and does not produce an interpreter that is a transparently correct implementation of its specification. Furthermore, the correspondence between it and the operational semantics needed for the associated compiler is not obvious.

A better approach is to translate the denotational semantics into a much higher level programming language, one that closely matches the style of specification. This drastically reduces the complexity of the translation step, enabling the semantics to be written clearly and abstractly, in order to provide a transparently correct implementation of the interpreter. A functional language or logic language seems a natural choice.

Prolog is used here as the executable specification language for the Tosca compiler; there is a natural mapping from the denotational semantics into Prolog clauses. Also, an important consideration for a high integrity compiler that must be seen to be correct, Prolog is a mature language that is well supported and has a large user community. The use of an unproven Prolog system, however, is a weak link in the development process, as is the use of an unproven operating system and unproven hardware. The various sources of errors are discussed more fully in Chapter 15 . What is being described here is how to strengthen one of the currently weakest links in the process.

Prolog does have some features whose misuse could obscure the mapping from specification to compiler. The most notorious of these is the 'cut' (written ! ). Cuts are used to increase execution efficiency by controlling backtracking and execution order, and are considered by some as the Prolog equivalent of the 'goto'. However, some cuts are worse than others. Prolog has two semantics: a simple declarative semantics, which is independent of the order in which clauses and goals are written, and is used to reason about the meaning of a program, and a more complex operational semantics, which defines an execution order, and says what happens when a program is executed. It is important for understandability that these two semantics give a program the same meaning. So-called 'green' cuts do maintain this property. 'Red' cuts, on the other hand, result in the two meanings being different. The operational one is (presumably) the desired meaning, but the simpler declarative reading can no longer be used to determine what this meaning is. In order to understand the program it becomes necessary, for example, to know in what order goals are evaluated.

It is important that the Prolog form of a high integrity compiler has the same declarative reading (equivalent to the specification) as operational reading (which provides the executable compiler). So the Prolog must be written in a disciplined manner, eschewing red cuts and other tricks, if necessary sacrificing speed of execution.

1.5 Further reading

Early work on compiler correctness includes [McCarthy and Painter 1966] , [Milner and Weyhrauch 1972] , [Morris 1973] , [Cohn 1979] , [Polak 1981] . Work on automatically generating a compiler from a definition of the language's semantics includes [Mosses 1975] , [Paulson 1981] , [Paulson 1982] , [Wand 1984] , [Lee 1989] and [Tofte 1990] . The Esprit supported ProCoS (Provably Correct Systems) project has investigated an algebraic approach to correct compilation, described in [Hoare 1991] .

Many seminal papers on the axiomatic style of defining programming languages can be found in [Hoare and Jones 1989] . [Tennent 1991] describes the connection between denotational, operational and axiomatic semantics.

For an introduction to denotational semantics, see, for example [Gordon 1979] , [Allison 1986] and [Schmidt 1988] . The classic description is [Stoy 1977] . Various static semantic analyses are described in [Cousot and Cousot 1977] , [Bramson 1984] and [Bergeretti and Carré 1985] . [Allison 1986] gives examples of translating denotational semantics definitions into Pascal to provide an interpreter. [Stepney and Lord 1987] describes an example of executing a specification in order to validate it.

The formal specification of Modula-2 can be found in [Andrews et al . 1991] . [Carré 1989] discusses criteria for including features in high integrity languages.

Anyone still not convinced that natural language is totally unsuitable for rigorously and unambiguously specifying even a simple problem should read [Meyer 1985] . This is an entertaining account of the repeated failed attempts to use English to specify a seemingly trivial problem, and how formalism can help.

There are many good books introducing Prolog. See, for example [Sterling and Shapiro 1986] and [Clocksin and Mellish 1987] . The former discusses green and red cuts.

Susan Stepney.
High Integrity Compilation : a case study .

Prentice-Hall, 1993.

1. Introduction

1.1 The problem

1.2 Semantics

1.2.1 Axiomatic semantics---too abstract

1.2.2 Operational semantics---too concrete

1.2.3 Denotational semantics---just right

1.2.4 Non-standard semantics

1.2.5 Size of task

1.3 From semantics to a compiler

1.4 Executable specification language---Prolog

1.5 Further reading

Susan Stepney. High Integrity Compilation : a case study .

Prentice-Hall, 1993.

1. Introduction

1.1 The problem

1.2 Semantics

1.2.1 Axiomatic semantics---too abstract

1.2.2 Operational semantics---too concrete

1.2.3 Denotational semantics---just right

1.2.4 Non-standard semantics

1.2.5 Size of task

1.3 From semantics to a compiler

1.4 Executable specification language---Prolog

1.5 Further reading

Susan Stepney.
High Integrity Compilation : a case study .