[High Integrity Compilation] Susan Stepney.
High Integrity Compilation : a case study .

Prentice-Hall, 1993.

6. Partitioning the Specification

6.1 Undefined meanings

Consider the following highly erroneous Tosca 'program':

b : bool;
x : int; y : int;
begin

x := ; - - syntax error
z := 1; - - z not declared
x := z; - - z not declared}
x := b; - - incompatible types
x := y; - - y not initialized



end

A program must be syntactically correct before any of its meanings (static or dynamic) are defined. So the meaning of the above fragment is undefined .

Even if the syntax error is corrected, there are many other things wrong, and the formal meaning of the program should still be undefined . Hence, the semantic meaning function is partial ; it is defined for only some programs, that satisfy certain well-formedness conditions. For example, the conditions relevant to an assignment command are (informally)

M E [| assignment |] =

if (target variable is declared)
/\ (source expression has no undeclared variables)
/\ (source and target have the same type)
/\ (source expression has no uninitialized variables)
then (definition of meaning)
else checkWrong




Checking that all these conditions are met in one lump, as above, results in a clumsy style of specification that seems to put more emphasis on what does not happen, than on what does. A better approach is to partition the specification into several logically separate static checks and a definition of the dynamic (execution) meaning. Such partitioning means that there is no need to check, for example, in the dynamic semantic definitions, that expressions are of the right type (they are; they have passed the type checking), or that variables have been initialized before they are used (they have; they have passed the initialization checking).

This ability to separate concerns puts structure on an otherwise large monolithic specification. It results in several simpler specifications, one for each semantics, since 'error cases' do not have to be considered each time. It gives a more uniform approach, since it can be clearly seen that each static check has been made for each construct. The separation also serves to highlight each of these semantics, showing its purpose.

Not every language can have its semantics partitioned so neatly; achieving a clean separation in an existing language is more difficult than designing the separation in from the start. Tosca has been designed to be separable in this way, and has defined three static semantics---declaration checking, type checking and use checking---and a dynamic semantics. This chapter gives an overview of the purpose of each of the semantics, by informally describing the kind of conditions that each one is designed to check. The following chapters specify Tosca by formally defining each of its four semantics.

6.2 Syntax

If a Tosca program is not syntactically correct, then its declaration-before-use, type, initialization-before-use and dynamic semantics are undefined .

6.3 Declaration-before-use semantics

The first static check made on a syntactically correct Tosca program is a declaration-before-use check, to make sure that any variables used in the program have been correctly declared. For example, an informal reading of type checking an assignment command is

D E [| assignment |] =

if (target variable is declared)
/\ (source expression has no undeclared variables)
then checkOK
else checkWrong


Only if the whole program passes this static check are the other semantics defined. Because this is the first check done, the declaration check meaning functions are in fact {\sl total}: this semantics is defined for any syntactically correct Tosca program.

If D P [| program |] = checkWrong , then the program does not pass its declaration-before-use check, and so its type, initialization-before-use and dynamic semantics are undefined .

6.4 Type-checking semantics

If a Tosca program passes its declaration-before-use check, it is subject to a type-check. The type-checking semantics defines conditions for a program to be well-typed: in the case of an assignment the check is (informally)

T E [| assignment |] =

if (source and target have the same type)
then checkOK
else checkWrong

In this definition, there is no need to first check that the variables have a type: since we know they have been declared, they do. Hence T E [| |] is a partial function: it is defined only for declaration-checked expressions, and undefined for others.

If T P [| program |] = checkWrong , then the program does not pass its type check, and so its initialization-before-use and dynamic semantics are undefined .

6.5 Initialization-before-use semantics

If a Tosca program passes its type check, it is subject to its final static check, that any variable that is accessed has previously been initialized to some value. For an assignment statement, for example, this consists of checking that there are no uninitialized variables in the source expression. Informally:

U E [| assignment |] =

if (source expression has no uninitialized variables)
then checkOK
else checkWrong

This provides a static initialization-before-use semantic check. Note that some of the formal definitions of Tosca provide a rather strict constraint on potentially unused variables (guilty until proven innocent), which eliminates programs that might otherwise be thought to be 'correct'. It is probably appropriate to have such a strict definition for a high integrity language. More to the point, however, it does provide an unambiguous definition, which can be reasoned about, and which provides a basis for criticism if necessary. If U P [| program |] = checkWrong , then the program does not pass its initialization-before-use check, and so its dynamic semantics are undefined .

6.6 Dynamic semantics

Only if a Tosca program passes its initialization-before-use check (and hence its declaration-before-use and type checks, too), is its dynamic meaning defined:

M E [| assignment |] =

(definition of meaning)

Compare this with the form given earlier . All the conditions to be satisfied have disappeared, because they have already been checked in the static semantics definitions. Hence, the definition is simpler and clearer.

6.7 Redundancy

Not all programming languages require variables to be declared, typed or initialized. For example, in the awk programming language, the first use of a variable implicitly declares it. If it is assigned a value of a particular type (number or string) it has that type. If its first use is on the right-hand side of an expression, it is automatically initialized to zero, or the empty string, as appropriate. This automatic declaration and initialization allows some very concise awk programs, free of clutter.

So why aren't all languages like this? A famous error occurred in a Fortran program controlling the launch of a Mariner space probe. One line of the program used a full stop instead of a comma. The line should have read

	DO 100 I = 1,10

This is the first line of a Fortran DO loop, and instructs the program to repeat the following code up to the line numbered 100, with I taking the values from 1 to 10. Instead, the line read

	DO 100 I = 1.10

This is an assignment statement, assigning the value 1.10 to the variable DO100I . (Fortran permits, and ignores, spaces in variable names.) If Fortran required variables to be declared before use, this error would have been caught by a static check; the spurious variable DO100I would have been flagged as undeclared. It was not caught, and an expensive spacecraft was lost.

So the main purpose of variable declarations and initializations is to provide a safety net; redundancy allows consistency checks. Static checks, that can be performed without having to execute the program, are particularly valuable.

6.8 Further reading

For a description of the awk language, see [ Kernighan and Pike 1984 , section 4.4] or [Aho et al . 1988] . The language's name is an acronym made from the names of its designers Aho, Weinberger and Kernighan.

The journal ACM SIGSOFT Software Engineering Notes often has notes and articles about computer-related incidents. The Mariner incident, along with many others, is listed in [Neumann 1985] .