Reach

Finding Inputs that Reach a Target Expression


*** UNDER CONSTRUCTION ***


Contents

Introduction

The Reach problem, as concocted by Colin Runciman, is to generate example function applications that cause evaluation of explicitly-marked target expressions in a program. This webpage presents a tool, called Reach, based on Lazy Narrowing, for solving the Reach problem. Reach has a wide range of uses including property refutation, assertion breaking, program crashing, program covering, program understanding and powerfull custom data generation.

Reach by Example

Example 1 Let's take an excerpt from the CircLib module of the Reduceron. In particular, we'll take the following circuit for testing equality of two binary-encoded numbers.
module CircLib where

import Reach

tree            :: (a -> a -> a) -> [a] -> a
tree f [x]      =  x
tree f (x:y:ys) =  tree f (ys ++ [f x y])

(/=/)   :: [Bool] -> [Bool] -> Bool
a /=/ b =  tree (&&) (zipWith (==) a b)
We compare the individial bits of each number for eqaulity (using xnor gates), feeding the outputs into a tree of and gates. Now we'd like to verify that our implementation is correct, so we write the following property.
prop_eq a b = (a /=/ b) == (num a == num b)

num        :: [Bool] -> Int
num []     =  0
num (a:as) =  fromEnum a + 2 * num as
We can use Reach to try to refute this property. To this aim, we need to insert a target at the appropriate place in the program. Targets can be inserted simply by calling the function
target :: a -> a
which is exported by the module Reach.hs. Also exported by that module are several abstractions over target, including
refute       :: Bool -> Bool
refute False =  target False
refute True  =  True
Now we can define
main a b = refute (prop_eq a b)
and use Reach to search for counter examples to prop_eq. To invoke Reach, we first need to compile CircLib.hs to Yhc.Core at the command prompt (#).
# yhc --linkcore CircLib.hs
Compiling Reach            ( Reach.hs )
Compiling CircLib          ( CircLib.hs )
Linking...
Reach can be applied straightforwardly to the resulting yca core file.
# reach CircLib.yca
main (True : (True : [])) (True : [])
main (False : (True : [])) (False : [])
main (True : []) (True : (True : []))
main (False : []) (False : (True : []))
Aha! A bug? Actually, the property we have written is simply not true when the input lists have different sizes. Let's amend it.
prop_eq a b = length a == length b ==> (a /=/ b) == (num a == num b)
Here, ==> is standard boolean implication. Let's recheck the property.
# reach CircLib.yca
That's better, there aren't any examples. Or are there? By default Reach places a unfold bound of 3 on recursive function calls. We might like Reach to try a little harder by increasing the bound:
# reach -r5 CircLib.yca
Looks good, there are definitely no examples, at least up to recursion depth 5. Or are there? What if prop_eq is a partial function? We can check by passing the --crash option to Reach which causes a target to be placed around all calls to error in the program.
# reach --crash CircLib.yca
main [] []
Of course! The function tree isn't defined on the empty list! Let's amend the property, again
prop_eq a b = not (null a) && not (null b) && length a == length b
                ==> (a /=/ b) == (num a == num b)
and recheck.
# reach --crash CircLib.yca
Phew! That's better, the property cannot be falisifed up to recursion depth 3, and nor can it be crashed. So without the --crash option, Reach verified the property for partial correctness. Being able to check just for partial correctness is sometimes very handy, after all, we often write partial functions such as tree.

Ok, so what have we achieved? We could have just used QuickCheck or SmallCheck here, right? Well yes, but just take a look at the antecedent on the final version of prop_eq -- it's big, and quite restrictive. The problem is that many blindly generated inputs are not likely to satisfy it, and furthermore, antecedents even more restrictive than this one arise frequently in property-based verification. For fair testing, we really want to generate inputs that satisfy the antecedent -- we all know that property will hold when the antecedent is false!

Therefore, in QuickCheck or SmallCheck the developer might feel urged to define a custom data generator to generate input lists of the same size. In general, this means more work for the developer, and the possibility of introducing a bug when writing the generator. Using Reach, we needn't worry about this! Reach uses Lazy Narrowing to find targets, and as a result, solves antecedents directly, cutting out large regions of the input space that falsify it.

Let's check this final claim with a quick experiment.

MethodData Depth
3456
Enumerative Testing0.2s1.2s6.4s 38.7s
Reach0.1s0.5s3.0s21.9s

A primary goal of Reach is to avoid having to write custom data generators. Let's look at some more evidence that Reach can help achieve this goal.

Example 2 Let's take another example from CircLib, a binary multiplexor.

unaryMux          :: [Bool] -> [[Bool]] -> [Bool]
unaryMux sel xs   =  map (tree (||))
                  $  transpose
                  $  zipWith (\s x -> map (s &&) x) sel xs

decode []         =  [True]
decode [x]        =  [not x,x]
decode (x:xs)     =  concatMap (\y -> [not x && y,x && y]) rest
  where
    rest          =  decode xs

binaryMux         :: [Bool] -> [[Bool]] -> [Bool]
binaryMux sel xs  =  unaryMux (decode sel) xs

Now let's check the correctness of binaryMux. Recall our claim that complex antecedents often arise in property-based testing. Well, here is an illustrative example.

prop_binMux sel xs = length xs == 2 ^ length sel
                       && all ((== length (head xs)) . length) xs)
                         ==> (binaryMux sel xs == xs !! num sel)

main sel xs = refute (prop_binMux sel xs)

Reach is unable to refute this property below depth 11. Now look at the timings compared to exhaustive testing.

MethodData Depth
34510
Enumerative Testing2.4s107.0s   
Reach0.3s0.3s0.4s3.4s

The more restrictive the antecedent, the greater need to write a custom data generator, and the greater the benefit of Reach.

Example 3 Let's complement the CircLib library with a little behavioural description library, Flash. Flash provides a number of behavioural circuit abstractions, such as sequential and parallel composition, if statements and while loops.

Now suppose that we're adding a loop optimisation to Flash. To test our optimisation, we may wish to test the library with Flash programs that contain while loops that execute their bodies at least once. Normally one would have to write a custom data generator for this. But we can use Reach to generate such Flash programs very easily. First we make use of another abstraction over target which is exported from Reach.hs.

targetWhen         :: Bool -> a -> a
targetWhen True a  =  target a
targetWhen False a =  a

Now all we need to do is place a target at the point where while loops are treated, and express the constraint that the body gets executed.

circuit (While cond prog) start =  targetWhen
                                     (or (start <&> cond))
                                     (out, finish)
  where (out, finish')          =  circuit prog start'
        active                  =  start <|> finish'
        start'                  =  cond <&> active
        finish                  =  inv cond <&> active

Expressing the constraint is straigtforward, as all the information we need to constrain is in scope. In particular, we just want the start signal and condition signal to be true on at least one clock cycle. Furthermore, with a small change can strengthen the constraint so that only programs with terminating while loops are generated.

# reach Flash.yca
...
Many results are returned. To illustrate, here is one.
main (Wait :>> (Wait :>> (While (_469 : (_491 : (True : _511))) _444)))

Notice that the terms begining with _, such as _469, represent unbound variables -- they could be instantied to anything well-defined and the target would be reached. Notice also that the while loop's condition is true on the third clock cycle, coninciding with its start signal (Wait statements consume one clock cycle and :>> represents sequential composition).

So complex data generators can be rendered rather simple using Reach.

Summary Hopefully we have given a taste of Reach's capabilities, showing that it can often obviate the manual development of custom data generators, a potentially frequent time-consuming and error-prone programming task. For more examples, see our SCAM'07 paper, discussed next. Experimenting with Reach by applying it to more examples is an ongoing task.

Papers and Presentations about Reach

SCAM'07 paper: pdf

SCAM'07 slides: pdf

IFIP WG2.8 Meeting, 2007, slides (by Colin): pdf