SparseCheck

A Logic Programming Library for Test-Data Generation


News

  • October 2007. If you're interested in SparseCheck, take a look at Lazy SmallCheck too.
  • October 2008 Lazy SmallCheck supersedes SparseCheck, both in terms of useability and performance, so I'm a bit reluctant to invest more effort in SparseCheck. (SparseCheck still has potential advantages though --- for example it could work in a strict language such as Erlang or ML.)

Contents

Introduction

Have you ever written a program property that has a sparse input domain? That is, a property with an antecedent that is not satisfied by a large portion of the input space? If so, you may have noticed that testing it with QuickCheck or SmallCheck is problematic. Often these libraries can spend most of their time generating data that falsifies the antecedent, yet to find a counter example, the antecedent must be satisfied. In such cases, the programmer must write a custom data generator to obtain sufficient test coverage.

SparseCheck is a typed, depth-bounded logic programming library written in standard Haskell'98. It is closely based on the LP library described in our paper A Functional-Logic Library for Wired to appear at the Haskell Workshop. Using SparseCheck is one possible way of writing custom data generators for properties with sparse domains. We believe that SparseCheck fits nicely in this role, and the aim of this webpage to provide support for this claim. Unlike other systems that can deal with the ``sparse domain'' problem, such as Lindblad's property-directed generator and our target-directed generator Reach, both of which are static analysis tools, SparseCheck is standard Haskell'98 library that can simply be imported by the programmer.

Example 1: Generating ordered binary search trees

Let us start with a simple example. Suppose that we have defined a data type for binary search trees

  data Tree a = Empty | Node a (Tree a) (Tree a)
and a function for deleting an item from a binary search tree
  del :: Ord a => a -> Tree a -> Tree a
  del a Empty = Empty
  del a (Node b t0 t1)
    | a < b = Node b (del a t0) t1
    | a > b = Node b t0 (del a t1)
    | otherwise = ext t0 t1

  ext Empty t = t
  ext (Node a t0 t1) t = Node a t0 (ext t1 t)
For this function to be correct, the following property should hold
  prop_ordDel a t = ordered t ==> ordered (del a t)
where ordered is a predicate which checks that a given tree is indeed a binary search tree
  ordered :: Ord a => Tree a -> Bool
  ordered t = ord (const True) t
    where
      ord p Empty = True
      ord p (Node a t0 t1) = p a && ord (<= a) t0 && ord (>= a) t1
After testing prop_ordDel with QuickCheck or SmallCheck, the programmer might decide to write a custom data generator if insuffucient automatically-generated tests satisfy the antecedent. In this case ordered is not too difficult to satisfy, but later we'll see cases where satisfying the antecedent is really problematic for QuickCheck and SmallCheck.

So, to write a custom generator for ordered trees, we can use SparseCheck. First we need to define a logic programming data type for trees

  (empty :- node) = datatype (ctr0 Empty \/ ctr3 Node)
This introduces two logical constructors: empty :: Term (Tree a) and node :: Term a -> Term (Tree a) -> Term (Tree a). So normal Haskell values of type a correspond to SparseCheck values of type Term a. Furthermore, like in QuickCheck and SmallCheck, a type class must be instantated for a new data type:
  instance Convert a => Convert (Tree a) where
    term Empty          = empty
    term (Node a t0 t1) = node (term a) (term t0) (term t1)
    unterm              = converter (conv0 Empty -+- conv3 Node)
Instantiating this class is entirely mechanical -- no insight whatsoever is required. Now we can write a SparseCheck predicate that defines binary search trees:
  ordered :: Ordered a => Term (Tree a) -> Pred
  ordered t = ord (const true) t
    where
      ord p t = caseOf t alts where
                  alts (a, t0, t1) =
                        empty :-> true
                    :|: node a t0 t1 :-> p a & ord (|<=| a) t0 & ord (|>=| a) t1
The advantage of defining ordered using SparseCheck is that we can now generate ordered trees directly. For example, to generate all ordered trees of integers up to depth 3, we can simply write:
  Hugs> solve 3 ordered :: [Tree Int]
  [Empty,Node 0 Empty Empty,Node 1 Empty Empty,...
To check our property we first reformulate it slightly:
  cond_ordDel a t = ordered t
  prop_ordDel a t = lower ordered (del a t)
The function lower turns a logic programming predicate over terms into a boolean function over normal Haskell values. Checking the property for all inputs up to depth 3 can be done by typing
  Hugs> check2 3 cond_ordDel prop_ordDel
  []
The empty list signifies that there are no counter examples. The 2 in the function check2 means ``check a property of two arguments''.

Example 2: Generating valid inputs to a binary multiplexor

Let's take an excerpt from the CircLib module of the Reduceron II. In particular, we'll take the following binary multiplexor circuit binMux which selects the nth bit-vector (where n is binary encoded number) from a list of bit-vectors:

  binMux :: [Bit] -> [[Bit]] -> [Bit]
  binMux n xs = select (decode n) xs

  select :: [Bit] -> [[Bit]] -> [Bit]
  select sel xs = map (tree (||))
                $ transpose
                $ zipWith (\s x -> map (s &&) x) sel xs

  decode :: [Bit] -> [Bit]
  decode []     = [True]
  decode [x]    = [not x,x]
  decode (x:xs) = concatMap (\y -> [not x && y,x && y]) (decode xs)
If this circuit is correct, then the following property should hold:
  prop_binMux n xs = length xs == 2 ^ length n
                       && all ((== length (head xs)) . length) xs
                         ==> binMux n xs == xs !! num n
In other words, binMux n xs should have the same behaviour as xs !! num n (where num turns a bit-vector into an Haskell integer) whenever xs has length two-to-the-power-of the length of n and all bit-vectors in xs have the same length. The property prop_binMux is a prime example of a property with a sparse domain. Using SparseCheck, we reformulate the antecedent as a logic programming predicate:
  cond_binMux :: Term [a] -> Term [[b]] -> Pred
  cond_binMux sel xs = exists $ \(m, n, o) ->
                           len sel m
                         & len xs n
                         & pow (i 2) m n
                         & forall xs (\x -> len x o)
Here, len a b relates the list a to its length b; pow a b c relates c to a to-the-power-of b; and forall a p states that the predicate p holds on every element of the list a. All of these relations are exported by the SparseCheck module. As an example, here is the definition of forall:
  forall :: Term [a] -> (Term a -> Pred) -> Pred
  forall as p = caseOf as alts where
                  alts (a, as) =
                        nil :-> true
                    :|: (a |> as) :-> p a & forall as p
The consequent of our property looks the same as before
  prop_binMux :: [Bit] -> [[Bit]] -> Bool
  prop_binMux n xs = binMux n xs == xs !! num sel
and the whole property can now be easily checked
  Hugs> check2 3 cond_binMux prop_binMux
  []
How much better is our custom generator than a standard SmallCheck or QuickCheck generator? Well, after 2.5 minutes SmallCheck generates 186279409 inputs, but 186277572 do not satisfy the antecedent. After 0.05 seconds, SparseCheck generates all 1837 of these valid inputs. It is likely that QuickCheck will also generate vastly more invalid inputs than valid ones.

Example 3: Generating valid chess boards

Let's take a excerpt from the Mate-in-N program by Colin Runciman. This program takes a chess position and a number n as input and gives a sequence of n moves that forces checkmate, if such a sequence exists. Suppose that we wish to verify the following property:
  prop_checkmate b =
    validBoard b ==>
      if   length white == 2 && Pawn `elem` white
      then not (checkmate Black b)
      else True
    where
      white = map fst (forcesColoured White b)
In words, this property asserts that on a valid chess board in which white has only 2 pieces, one of which is a pawn, black cannot be in checkmate. Using QuickCheck, and setting maxFail to ten million, we get the following after about ten minutes:
  OK, passed 100 tests.
QuickCheck is having difficulty satisfying the validBoard predicate -- generating and checking only a hundred valid boards in ten minutes. Using SmallCheck at depth four we get the following after about thirty minutes
  Depth 0:
    Completed 0 test(s) without failure.
  Depth 1:
    Completed 1 test(s) without failure.
    But 1 did not meet ==> condition.
  Depth 2:
    Completed 49 test(s) without failure.
    But 49 did not meet ==> condition.
  Depth 3:
    Completed 143641 test(s) without failure.
    But 143641 did not meet ==> condition.
  Depth 4:
    Completed 3232036201 test(s) without failure.
    But 3232036201 did not meet ==> condition.
Clearly SmallCheck is also having difficulty satisfying validBoard. Let's take a look at how validBoard is defined:
  validBoard (Board white black) =
       exactlyOne King (map fst white)
    && exactlyOne King (map fst black)
    && all (onboard . snd) (white ++ black)
    && allDiff (map snd white ++ map snd black)
    && kingsDontTouch white black
In words, for a board to be valid: (1) each side must contain exactly one king; (2) every piece must be on the board (i.e. have coordinates that lie in an 8x8 box); (3) every piece must be on a different square; and (4) kings must not touch each other. We can use SparseCheck to define this predicate:
  validBoard b =
    exists $ \(ws :- bs :- wps :- bps :- wsqs :- bsqs :- sqs) ->
        b === board ws bs
      & append wsqs bsqs sqs
      & zipP wps wsqs ws
      & zipP bps bsqs bs
      & exactlyOne king wps
      & exactlyOne king bps
      & kingsDontTouch ws bs
      & forall sqs onBoard
      & allDiff sqs
To illustrate the style a bit further, here is how the exactlyOne predicate is defined:
  exactlyOne a bs =
    caseOf bs alts where
      alts (b, bs) =
            nil :-> false
        :|: (b |> bs) :-> (a === b & forall bs (=/= a))
                        ? (a =/= b & exactlyOne a bs)
Notice that SparseCheck provides generic unification (===) and disequality (=/=) predicates. Disequality is implemented using residuation, i.e. diseqality constraints get placed in a constraint store and are checked only when the variables being constrained are instantiated. General purpose predicates can be evaluated by residuation, but that is a topic for another day.

If we now remove the antecedent from our original property

  prop_checkmate b = if   length white == 2 && Pawn `elem` white
                     then not (checkmate Black b)
                     else True
    where
      white = map fst (forcesColoured White b)
and generate inputs to it using our new validBoard predicate
  main = print $ head $ check 8 validBoard prop_checkmate
then we get the following counter example after a few seconds.
  Board [(King,(3,2)),(Pawn,(2,1))] [(Queen,(1,3)),(King,(1,2)),(Bishop,(1,1))]
Chess players will not be surpised to see that a king and pawn are sufficient to checkmate the opponent, even when the opponent has a queen and a bishop.

Current limitations

Currently SparseCheck is more of a research experiment and a demonstration of our logic programming library than a final product. One limitation is completeness. SparseCheck should really provide predicates corresponding to most commonly-used Prelude functions, but currently only a few are provided. Another limitation is efficiency. SparseCheck has been written for simplicity and could certainly be much more efficient.

Obtaining SparseCheck

SparseCheck can be obtained by typing
  darcs get http://www.cs.york.ac.uk/fp/darcs/sparsecheck
If you use GHC 6.6 or later you'll need to compile with the -fno-mono-pat-binds option. Polymorphic pattern bindings are supported by default in standard Haskell but GHC disables them for a reason that I don't understand. Polymorphic pattern bindings are not vital to the success of SparseCheck, but they do make the creation of logical data types somewhat nicer.

Acknowledgements

SparseCheck is based on a library called LP which was developed jointly with Emil Axelsson. Much motivation for the SparseCheck experiment came from Fredrik Lindblad's observation that antecedents can be used to direct test-data generation, as discussed in his excellent paper Property-directed generation of first-order test data. Some of the terminology used here, particularly the use of word ``sparse'', was copied from slides by Simon Peyton Jones.