Books

Books : reviews

Jon Barwise, Lawrence Moss.
Vicious Circles: on the mathematics of non-wellfounded phenomena.
CSLI. 1996

rating : 3.5 : worth reading
review : 15 October 2009

This is the book I came to after reading The Liar, in the hope of finding out more about non-wellfounded set theory, as the notions of circular reference seem important in a theory of emergence. Circularity has a bad press in parts of mathematics, where everything is based on (well-founded) set theory, which bans circular reference. But the new theory removes that restriction:

p5. In certain circles, it has been thought that there is a conflict between circular phenomena, on the one hand, and mathematical rigor, on the other. This belief rests on two assumptions. One is that anything mathematically rigorous must be reducible to set theory. The other assumption is that the only coherent conception of set precludes circularity. As a result of these two assumptions, it is not uncommon to hear circular analyses of philosophical, linguistic, or computational phenomena attacked on the grounds that they conflict with one of the basic axioms of mathematics. But both assumptions are mistaken and the attack is groundless.
     ... Just because set theory can model so many things does not, however, mean that the resulting models are the best models. ... Knowing that things ... can be represented faithfully in set theory does not mean that they are sets
     ... circularity is not in and of itself any reason to despair of using sets

One area of circularity is self-application, a thing computer scientists are fond of doing. Here there is some discussion of partial evaluation, compiler compilers, and the relationship between them, resulting in self-applicative formulae like [mix](mix,mix). My question is, can this be extended to physical processes, like life, and get round some of the problems Rosen has modelling life using (well-founded) set theory? For modelling autocatalytic sets of chemicals? For getting around the other cyclic "chicken-and-egg" descriptions of the origin of life? I didn't get an answer from this book, but I got more to think about.
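Self-application is easy to play with even without the partial-evaluation machinery. A minimal sketch in Python (my own toy illustration, not the book's [mix](mix,mix) construction): hand a function a copy of itself, and recursion emerges from self-application alone.

```python
# Self-application without named recursion: f receives itself as an
# argument, and applying f to f is what drives the recursion.
# (A toy illustration, not the book's mix-style partial evaluator.)
fact = lambda f: lambda n: 1 if n == 0 else n * f(f)(n - 1)

print(fact(fact)(5))  # → 120
```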

Circularity isn't only interesting for deep questions about life and compiler compilers. It is also helpful whenever defining something in terms of itself is the most natural approach: the idea is to give such definitions a meaning. There are some nice little examples here, which don't need to go anywhere near the full machinery developed. One particularly cute one, on p53, asks of a sequence of fair coin flips: what is the probability that the first time we get heads it will be on an even-numbered flip? It shows, using circularity (running through an argument and returning to the original situation, so defining something in terms of itself), that the answer is 1/3 (and hence, incidentally, that it is possible to use a fair coin to make a three-way choice).
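The circular argument is short: let \(p\) be the probability that the first head lands on an even-numbered flip. If the first flip is heads (probability 1/2) the game ends on an odd flip; if it is tails (probability 1/2) we are back at the start, but with the parity swapped, so \(p = \frac{1}{2}(1 - p)\), giving \(p = 1/3\). A quick simulation (my own sketch, not from the book) agrees:

```python
import random

def first_head_even_prob(trials=100_000, seed=1):
    """Estimate P(the first head lands on an even-numbered flip)."""
    rng = random.Random(seed)
    even = 0
    for _ in range(trials):
        flip = 1
        while rng.random() < 0.5:   # tails: keep flipping
            flip += 1
        even += flip % 2 == 0       # flip holds the index of the first head
    return even / trials

print(first_head_even_prob())  # should be close to 1/3
```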

Another intriguing little sidebar is the relationship to game semantics.

p166. games are known as Ehrenfeucht-Fraïssé games. … we feel that they are so fundamental that everyone should know about them.
     To give an example, consider some sentence
(1)          \( (\forall x) (\exists y) (\forall z) (\exists w) R(x,y,z,w) \)
about numbers. We can think about this assertion in terms of a game played by two players. In this context, the players are named \(\forall\) and \(\exists\) ... and the game works as follows: first \(\forall\) picks some number \(x\). After seeing this number, \(\exists\) responds with a number \(y\). After this, \(\forall\) picks \(z\). Finally, \(\exists\) gets to pick one last number \(w\). The sequence \((x, y,z, w)\) of choices constitutes one play of the game. The play is won by \(\exists\) if and only if \(R(x, y,z, w)\).

(Ehrenfeucht-Fraïssé games were first cast in this form in 1961.) This passage struck a chord for me, from way back in the late 1970s, when I was doing my undergraduate degree. In the calculus term of the "mathematics for physicists" part of the course, many of the results were posed as Epsilon-Delta definitions. Our lecturer explicitly called this style of definition a game: "If I give you an \(\epsilon\) [ie, \(\forall \epsilon\) ], you can always find a \(\delta\) [ie, \(\exists \delta\) ] such that the property in question holds."
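Over a finite domain the quantifier game can be played out mechanically. A toy Python version (the domain and relations are my own choices, not the book's): each universal quantifier becomes an `all` over \(\forall\)'s possible moves, each existential an `any` over \(\exists\)'s, and \(\exists\) wins exactly when the nested quantifiers evaluate to true.

```python
# Toy quantifier game over a small finite domain (my own example).
DOMAIN = range(5)

def exists_wins(R):
    """True iff player Exists has a winning strategy for
    (forall x)(exists y)(forall z)(exists w) R(x, y, z, w) over DOMAIN.
    Each 'all' is a Forall move, each 'any' an Exists move."""
    return all(any(all(any(R(x, y, z, w) for w in DOMAIN)
                       for z in DOMAIN)
                   for y in DOMAIN)
               for x in DOMAIN)

print(exists_wins(lambda x, y, z, w: (x + y + z + w) % 2 == 0))  # True
print(exists_wins(lambda x, y, z, w: x < w))                     # False (x = 4 has no w)
```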

There are also helpful comments on the status of modelling (whether or not using set theory), that need to be remembered:

pp86-7. Suppose that Jones likes to keep track of birds which live near his seaside home. Each time he sees a bird, he makes note of some feature which sets it apart from all the birds which he has ever seen. When a gull with a cracked beak lands on his porch, he can find no feature that sets it apart from a certain gull with a cracked beak three weeks ago and described in his notes as having a cracked beak. So he decides it is the same gull. Is this belief wellfounded? probably not: there is no reason to suppose that any feature will be found on just one bird.
     This is just one of a great number of situations in which some model (features) has led someone to inadvertently identify two objects being modeled (birds), when they might in fact be distinct. This is a pervasive problem in mathematical modeling, something to be guarded against.

The authors go on to discuss the Liar paradox from this perspective: Say this same Jones also models the Liar sentences, and discovers an identity that leads to the paradox. But the fact that [these models of the Liar sentences] are identical is evidence of an inadequacy of his modelling scheme. [This identity] shouldn't be regarded as a discovery about the meaning of [the Liar sentences].

The crucial difference between wellfounded and nonwellfounded sets (or, less pejoratively, hypersets) is:

p127. if we are working with hypersets, then every binary relation is isomorphic to the membership relation on some set.

That is, the graph of any binary relation is isomorphic to a (hyper)set membership graph. With classical wellfounded sets, the binary relations have to be acyclic (sets cannot contain themselves) and have "bottom" elements (the set membership relation has to have initial "atoms" that are not themselves sets). With nonwellfounded sets, neither has to be true. The binary relation R can have cycles (a R a) and can have chains with no "beginning" (think of the successor relation over the integers, rather than just the natural numbers). So they can define the hyperset of (infinite) binary trees without needing to define any leaves, for example.
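One way to make this concrete (my own encoding, not the book's notation) is to represent a hyperset as a node in a directed graph whose edges are membership. A self-loop gives the self-membered set \(\Omega = \{\Omega\}\), and the full infinite binary tree is a single node with two edges back to itself, no leaves required. Unfolding the graph to any finite depth recovers an ordinary nested structure:

```python
# A hyperset as a pointed graph: node -> list of its members (my encoding).
omega = {"omega": ["omega"]}     # Omega = {Omega}: one node, one self-loop
tree  = {"t": ["t", "t"]}        # infinite binary tree: both children are "the tree"

def unfold(graph, node, depth):
    """Unfold the membership graph into nested lists, to a finite depth."""
    if depth == 0:
        return node
    return [unfold(graph, child, depth - 1) for child in graph[node]]

print(unfold(omega, "omega", 3))  # → [[['omega']]]
print(unfold(tree, "t", 2))       # → [['t', 't'], ['t', 't']]
```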

A remark towards the end throws an interesting light on mathematicians and logicians:

p324. ... from a mathematical point of view, much of [this book] could have been written at the beginning of the [20th] century. .... So why are hypersets only coming into their own now, after a hundred years of work in set theory?
... The main reason for the change in climate was that Aczel's theory, unlike the earlier work, was inspired by the need to model real phenomena. Aczel's own work grew out of his work trying to provide a set-theoretic model of Milner's calculus of communicating systems (CCS). This required him to come up with a richer conception of set.

Who would have thought that mathematicians were worried about modelling real phenomena, as opposed to building elegant abstract theories? Indeed, much of this book is written in a dense, pure mathematical style: "Definition: Γ is smooth iff Γ is monotone, proper, and if almost all sets of urelements are very new for Γ", and so on. All the coalgebras and other mathematics are beyond my level of mathematical knowledge, so I had to skim those parts, trying to extract the message, if not the details: always dangerous with mathematics! (And all the talk of smooth operators reminded me irresistibly of that dastardly character Curly Pi.) A few pages later, they admit to not having written the book I wanted (not their fault, of course!):

p328. [corecursion theory of HF1] should relate to definability questions over [HF1] as well as to real programming issues, since we can certainly write circular programs. We have not investigated this topic at all but consider it extremely natural. We also did not take up issues like the implementation of corecursion, or other ways in which the method would be used in practice.

I wanted more on applications, how to use those hypersets to do something interesting computationally. But the reference to Milner's work, and also to the importance of the technique of bisimulation for deciding if two hypersets are the same, at least points me in the right direction next: I should look more at CCS.
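Bisimulation itself is decidable on finite membership graphs by a simple fixed-point computation. A naive sketch in Python (my own, not from the book): start by relating every pair of nodes, then repeatedly discard pairs whose members cannot be matched up; whatever survives is the greatest bisimulation. Under this check, a one-node self-loop and a two-node cycle both represent the same hyperset \(\Omega\):

```python
def bisimilar(graph, a, b):
    """Decide whether nodes a and b of the membership graph are bisimilar:
    start from the total relation and discard pairs whose members cannot
    be matched, until what remains is the greatest bisimulation."""
    rel = {(x, y) for x in graph for y in graph}
    changed = True
    while changed:
        changed = False
        for x, y in list(rel):
            matched = (all(any((xc, yc) in rel for yc in graph[y]) for xc in graph[x])
                       and all(any((xc, yc) in rel for xc in graph[x]) for yc in graph[y]))
            if not matched:
                rel.discard((x, y))
                changed = True
    return (a, b) in rel

g = {"a": ["a"],              # Omega drawn as a self-loop
     "b": ["c"], "c": ["b"],  # Omega drawn as a two-node cycle
     "e": []}                 # the empty set
print(bisimilar(g, "a", "b"))  # → True: both unfold to x = {x}
print(bisimilar(g, "a", "e"))  # → False: the empty set has no members
```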

So, I got a lot out of this book. But nowhere near as much as it has in it! If I had, I would have certainly given it a significantly higher rating.