*Situations and Attitudes*. 1983, with John Perry.
*The Liar*. 1987, with John Etchemendy.
*Vicious Circles*. 1996, with Lawrence Moss.
*Information Flow*. 1997, with Jerry Seligman.

In this book two outstanding scholars,
whose complementary backgrounds and perspectives provide a depth of experience and competence in logic and philosophy of language,
unfold a highly readable, witty, and clear treatment of important innovations in the semantics of natural language.
Their thesis is that the standard view of logic (as derived from Frege, Russell, and work in mathematics and logic)
is inappropriate for many of the uses to which it has been put by philosophers, linguists, computer scientists, and others.
Instead they provide the basics of a realistic model-theoretic semantics of natural language.
Avoiding technical details, they seek to explain the main ideas of the theory, contrasting them with competing theories.

The Liar paradox is an old chestnut: consider "This sentence is
false". The naive analysis runs thus: assume it is true, then it
has to be what it says, that is false, a contradiction; so, assume it
is false, then it actually is what it says, that is true, also a
contradiction! How to solve this? The classical mathematical approach
is to ban such self-referential sentences. But that throws the baby
out with the bathwater: there's nothing *paradoxical* about "This
sentence has five words" or even about "This sentence has
one hundred words".

*The Liar* is about an approach to solving the paradox, rather
than simply banning it. The usual way to analyse such cases is to
build a mathematical model in
set theory, then use it to define and analyse the truth values of
constructs modelled this way. In the case of the Liar paradox, this
involves modelling self-referential propositions. But classical (ZF)
set theory is formulated to avoid self-referential sets: the
Axiom
of Foundation is there to prevent it. Sets have members: these
members can themselves be sets, with members of their own. What the
Axiom of Foundation says is that if you follow this membership
relation downwards, you always eventually get to the "bottom":
atomic members that are not sets (or are the empty set, which has no
members, sets or otherwise). This means there are no infinitely
descending chains of membership (it *isn't* "turtles
all the way down"), and there are no circular membership
relations (a set cannot be a member of itself, which is what makes it
hard to model self-reference).
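The Axiom of Foundation's picture can be made concrete in code. Here is a minimal sketch of my own (in Python, not from the book; the set names are made up): model the membership relation as a graph mapping each set to its members, and check wellfoundedness by looking for cycles, since for a finite graph "no infinitely descending chain" amounts to "no cycle".

```python
# A toy model (my illustration, not the book's): the membership
# relation as a directed graph, each node mapped to the set of its
# members. Wellfoundedness = every descending membership chain
# bottoms out; for a finite graph that is just acyclicity.
def wellfounded(membership):
    status = {}        # node -> cached verdict
    on_path = set()    # nodes on the current descending chain
    def descend(x):
        if x in status:
            return status[x]
        if x in on_path:          # met again while descending: a cycle
            return False
        on_path.add(x)
        ok = all(descend(m) for m in membership.get(x, ()))
        on_path.discard(x)
        status[x] = ok
        return ok
    return all(descend(x) for x in membership)

# {a} where a = {}: bottoms out at the empty set -- wellfounded.
print(wellfounded({"s": {"a"}, "a": set()}))   # True
# Omega = {Omega}: a set that is its own sole member -- not wellfounded.
print(wellfounded({"O": {"O"}}))               # False
```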

So, when your mathematics isn't up to the job -- use different
mathematics! In this case, the authors use Aczel's brand of *nonwellfounded
set theory* as a basis for building their models (despite what it
might sound like from its name, it is a perfectly well-defined and
consistent mathematical theory). In chapter 3, the authors summarise
this theory in enough detail to understand how it is being used in
their subsequent modelling of the paradox. The approach has a visual
representation, in modelling sets as graphs (of the membership
relation): wellfounded sets must have acyclic graphs; nonwellfounded
sets can have cycles in their graphs. This gives a nice intuition for
what's going on, and the explanations have a good mix of English text
and mathematical rigour. It can sometimes be a bit confusing, however.
For example:

p39.
We said that on Aczel's conception a set
is any collection of objects whose hereditary membership relation can
be pictured by a graph. More precisely, a *graph G* is a set of
nodes and directed edges, as usual.

Here "Aczel's conception of a set" refers to these nonwellfounded sets (or hypersets), now defined in terms of graphs. A graph is "a set of nodes". What kind of set are these nodes? Wellfounded? Nonwellfounded (relying on a circular definition)? Does it make a difference?

(A very minor problem with the exposition is due to the example
atomic members chosen. On p40 we get the equation
*a* = {Max, *a*}. I
had a moment's confusion, trying to work out what was being maximised,
before I remembered that the atoms in the example language include the
authors' children's names Max and Claire. Moral of this tale: if you
are a logician, do not name your children after mathematical
functions!)

Now, these new hypersets look strange. In fact, they initially look so counter-intuitive that they must be "wrong". But that's because we have been brought up on wellfounded set theory, with its assumptions now bedded into our intuition. We have "got used to it". But we can get used to nonwellfounded sets, too:

p58.
The introduction of a new sort of
mathematical object has always met with considerable resistance,
including such now mundane objects as zero, the negative numbers, the
irrationals, the imaginary numbers and infinitesimals. We realize that
some set theorists feel a similar reluctance to admit hypersets as
legitimate mathematical objects. While this reluctance is perhaps
understandable, it is also somewhat ironic. After all, many set
theorists prior to Zermelo were working with a conception which
admitted circularity, as is apparent from the formulation of Russell's
paradox. Furthermore, the axiom of foundation has played almost no
role in mathematics outside of set theory itself. We must admit,
though, that we initially shared this reluctance, having been raised
within the Zermelo tradition. But our own experience has convinced us
that those who take the trouble to master the techniques provided by
AFA will quickly feel at home in the universe of hypersets, and find
important and interesting applications.

So, once they have a mathematical toolkit up to the job, the authors
go ahead with a traditional approach: use this set theory to model the
various self-referential sentences, statements and propositions; give
this a semantics or two; analyse the resulting systems. They analyse
the system in two different ways, which they call "Russellian"
and "Austinian". (They emphasise that these are not actually
the approaches that Russell and
Austin advocated, but that they are
in the spirit of their approaches.) The analyses give different
answers. (What, you wanted *the* answer? But why are you
surprised that the answer depends on how you formulate the question?)

Summarising brutally, and inevitably misleadingly, the analyses run as follows.

The Russellian analysis rests on a subtle distinction between denial
and negation. Negation is a "positive" statement: it states
that there are facts of the world that make proposition *p*
false. Denial is a "negative" statement: it denies that
there are facts that make *p* true. And these are not (in this
formulation) the same thing (p79: "the fact of it
being false is not a fact of the world"). The
analysis shows that the naive formulation conflates these two.

The Austinian analysis rests on the approach that propositions are
made in the context of *situations*, and can have different
truth values in different situations. The analysis shows that the
naive formulation confuses different situations. It takes the form of a
"diagonalisation" argument: assume you know all the facts of
the world, then construct a new fact that is true, but is not in your
original set.

p171.
Paradoxes in any domain are important:
they force us to make explicit assumptions usually left implicit, and
to test those assumptions in limiting cases. What's more, a common
thread runs through the solution of many of the well-known paradoxes,
namely, the uncovering of some hidden parameter, a parameter whose
value shifts during the reasoning that leads to the paradox.

Whenever one encounters an apparent incoherence in the world, a natural thing to look for is some implicit parameter that is changing values.


p173.
There is a hidden parameter in the
Russellian account which the Austinian diagnosis makes explicit: the
portion of the world that the proposition is about. The Russellian
assumes that this "portion" encompasses the world in its
entirety. ...
the Liar paradox shows that we cannot in general make statements about
the universe of all facts. If we held on to the Russellian view of
propositions, the Liar would then force us to acknowledge an essential
partiality in the world: there are propositions which aren't true, but
whose falsehood somehow lies outside the universe of facts, outside
the "world."

This is all fascinating and informative. As usual, naive analysis breaks down at extreme or "edge" cases. What we have here is a thorough analysis of these edges that does not shirk the problem by simply banning it, but takes it seriously, and applies a mathematical approach that fits the problem, thereby shedding light, and uncovering assumptions.

Despite it being interesting, I didn't initially set out to read this to discover more about the Liar paradox: I read it to find out more about nonwellfounded set theory. That is because a delegate's throwaway remark during a conference made me wonder if it might be useful for thinking about emergent properties. I hadn't previously come across it (despite the fact that ideas like bisimulation, and some hairier branches of computer science, are apparently based on it, or equivalent to it). So I could have stopped reading after chapter 3, but it was too interesting! Anyway, I was heartened to see the pictorial approach, and even more heartened to see that much of my existing intuition would probably be fine:

p42.
While Aczel's is quite a different
conception from Zermelo's, it turns out that all the usual axioms of
ZFC are true under this conception, except, of course, the axiom of
foundation. This means that we can use all the familiar set-theoretic
operations (intersection, union, power set, ordered pairs, and so
forth) without any change whatsoever. Only when the axiom of
foundation enters (as with inductive definitions ...)
do we need to rethink things.

But that comment about induction is interesting. I'm glad I carried on reading, because in chapter 4 we get:

pp62-3.
We
have given a coinductive definition ...,
taking it to be the *largest* collection satisfying the various
clauses. ...
note that every object that *can*
be included in the class *is* so included. Instead of working
from the bottom up, asking which objects are *forced into* the
defined class, we work from the top down, asking which objects are
legitimately *excluded*. This feature guarantees that circular
members ...
are not excluded.

The hairs on the back of my neck rose when I read that. It seems to
imply that the whole basis of *reductionism* is *wellfoundedness*.
(It might be true that the axiom of
foundation has played almost no role in mathematics outside of set
theory itself, but set theory has had an *enormous*
impact on the way scientists model the world.) In this view, we start
at the bottom, with the "atoms", and construct things
inductively. Everything not required (constructible this way) is
forbidden. Because wellfoundedness is so strongly entrenched, this
seems like the natural, maybe the only, way to do it. But
nonwellfoundedness *isn't like this*. There are nonwellfounded
things that *just cannot be built this way*: there are things
with no "bottom" or "beginning" (it *can* be
turtles all the way down, or all the way back in time, in this
model!); there are things that are intrinsically circular,
self-referential, chicken and
egg, strange loops even. Now
everything not forbidden is compulsory; now there is so much more room
to have the kind of things we need. But, what would it mean to have,
even to "construct", material things with such
nonwellfounded structure?
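To make that contrast concrete, here is a small illustration of my own (in Python, nothing from the book): a stream defined circularly, with no base case and no "bottom", consumed by unfolding from the top down.

```python
# An illustrative sketch (not from the book): a stream is a pair
# (head, tail_thunk). The circular definition "ones = 1 :: ones"
# has no base case, no "bottom", yet is perfectly well defined:
# it is the largest thing satisfying its defining clause.
def ones():
    return (1, ones)          # the tail of ones is ... ones itself

def take(n, stream):
    """Unfold the first n elements of a stream."""
    out = []
    for _ in range(n):
        head, tail = stream()
        out.append(head)
        stream = tail
    return out

print(take(5, ones))  # [1, 1, 1, 1, 1]
```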

So, I'm off to read more about this, to see if it might be a better way to model emergent and self-organising systems than classical wellfounded set theory.

This is the book I came to after reading *The
Liar*, in the hope of finding out more about non-wellfounded
set theory, as the notions of circular reference seem important in a
theory of emergence. Circularity has a bad press in parts of
mathematics, where everything is based on (well-founded) set theory,
which bans circular reference. But the new theory removes that
restriction:

p5.
In certain circles, it has been thought
that there is a conflict between circular phenomena, on the one hand,
and mathematical rigor, on the other. This belief rests on two
assumptions. One is that anything mathematically rigorous must be
reducible to set theory. The other assumption is that the only
coherent conception of set precludes circularity. As a result of these
two assumptions, it is not uncommon to hear circular analyses of
philosophical, linguistic, or computational phenomena attacked on the
grounds that they conflict with one of the basic axioms of
mathematics. But both assumptions are mistaken and the attack is
groundless.

... Just because set theory *can*
model so many things does not, however, mean that the resulting models
are the best models. ... Knowing that things ... can be represented
faithfully in set theory does not mean that they *are* sets

... circularity is not in and of itself any reason to despair of using sets


One area of circularity is self-application, a thing computer
scientists are fond of doing. Here there is some discussion of
partial evaluation, compiler
compilers, and the relationship between them, resulting in
self-applicative formulae like *mix*(*mix*, *mix*).
My question is, can this be extended to physical processes, like life,
and get round some of the problems Rosen
has modelling life using (well-founded) set theory? For
modelling autocatalytic sets of chemicals? For getting around the
other cyclic "chicken-and-egg" descriptions of the origin of
life? I didn't get an answer from this book, but I got more to think
about.
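As a toy illustration of the self-applicative shape (my own example, not the book's partial-evaluation machinery), factorial can be written so that the recursion happens only through a function applied to itself:

```python
# A miniature of self-application (my example, nothing to do with
# mix): no global recursion, the function calls itself only because
# it is handed itself as an argument -- a pale shadow of
# mix(mix, mix), but the same circular shape.
def fact(f, n):
    # f is expected to be fact itself, passed in by the caller
    return 1 if n == 0 else n * f(f, n - 1)

print(fact(fact, 5))  # 120
```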

Circularity isn't only interesting for deep questions about life and
compiler compilers. It is helpful when *defining something in terms
of itself* is the most natural approach. The idea is to give such
definitions a meaning. There are some nice little examples here, which
don't need to go anywhere near the full machinery developed. One
particularly cute one is on p53, asking the question of a sequence of
fair coin flips: what is the probability
that the first time we get heads it will be on an even-numbered flip?
and showing, using circularity (going through an argument and
returning to the original situation, so defining something in terms of
itself) that the answer is 1/3 (and hence, incidentally, that it is
possible to use a fair coin to make a three-way choice).
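Spelling the circular argument out (a sketch of my own): let \(p\) be the probability that the first head falls on an even-numbered flip. If the first flip is heads (probability 1/2), the first head is on flip 1, which is odd; if it is tails, the whole game restarts shifted by one place, so "even" becomes "odd", with probability \(1-p\). Hence \(p = \frac{1}{2}(1-p)\), giving \(p = \frac{1}{3}\). A quick check:

```python
from fractions import Fraction

# The circular definition: p = P(first head on an even flip).
# First flip heads (prob 1/2): first head is on flip 1, odd -- lose.
# First flip tails (prob 1/2): the sequence restarts shifted by one,
# so we now need the first head on an *odd* flip: probability 1 - p.
# Hence p = (1/2) * (1 - p), whose fixed point is p = 1/3.
p = Fraction(1, 3)
assert Fraction(1, 2) * (1 - p) == p   # the circular equation holds

# Cross-check against the direct geometric series:
# P(first head on flip 2k) = (1/2)^(2k), summed over k >= 1.
partial = sum(Fraction(1, 2) ** (2 * k) for k in range(1, 40))
print(float(partial))   # converges to 0.333...
```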

Another intriguing little sidebar is the relationship to game semantics.

p166.
games are known as Ehrenfeucht-Fraïssé
games. … we feel that they are so fundamental that everyone
should know about them.

To give an example, consider some sentence

(1) \( (\forall x) (\exists y) (\forall z) (\exists w) R(x,y,z,w) \)

about numbers. We can think about this assertion in terms of a game played by two players. In this context, the players are named \(\forall\) and \(\exists\) ... and the game works as follows: first \(\forall\) picks some number \(x\). After seeing this number, \(\exists\) responds with a number \(y\). After this, \(\forall\) picks \(z\). Finally, \(\exists\) gets to pick one last number \(w\). The sequence \((x, y,z, w)\) of choices constitutes one play of the game. The play is won by \(\exists\) if and only if \(R(x, y,z, w)\).

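Over a finite domain this game can be played out exhaustively. A toy sketch of my own (the relation below is made up, not an example from the book): \(\exists\) has a winning strategy exactly when the quantified sentence is true.

```python
# A toy game-semantics evaluator (my illustration; R is a made-up
# relation). Over a finite domain,
# (forall x)(exists y)(forall z)(exists w) R(x,y,z,w) is true iff
# player E (exists) has a winning strategy for the four-move game.
def exists_wins(domain, R):
    return all(                        # A opens with x
        any(                           # E responds with y
            all(                       # A picks z
                any(R(x, y, z, w)      # E picks w; score the play
                    for w in domain)
                for z in domain)
            for y in domain)
        for x in domain)

domain = range(5)
# Made-up R: E tries to make x + y and z + w agree mod 5.
R = lambda x, y, z, w: (x + y) % 5 == (z + w) % 5
print(exists_wins(domain, R))   # True: E can always equalise the sums
```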

(Ehrenfeucht-Fraïssé
games were first cast in this form in 1961.) This passage struck
a chord for me, from way back in the late 1970s, when I was doing my
undergraduate degree. In the calculus term of the "mathematics
for physicists" part of the course, many of the results were
posed as
Epsilon-Delta
definitions. Our lecturer explicitly called this style of
definition a *game*: "If I give you an \(\epsilon\) [ie, \(\forall \epsilon\) ],
you can always find a \(\delta\) [ie, \(\exists \delta\) ]
*such that* the property in question holds."
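In that spirit, the standard limit definition (ordinary textbook material, not a quotation from the book) reads as a three-move game: \(\forall\) opens with \(\epsilon\), \(\exists\) answers with \(\delta\), \(\forall\) then picks \(x\), and \(\exists\) wins iff the implication holds:

```latex
\lim_{x \to a} f(x) = L
\quad\Longleftrightarrow\quad
(\forall \epsilon > 0)(\exists \delta > 0)(\forall x)
\bigl( 0 < |x - a| < \delta \;\Rightarrow\; |f(x) - L| < \epsilon \bigr)
```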

There are also helpful comments on the status of modelling (whether or not using set theory), that need to be remembered:

pp86-7.
Suppose that Jones likes to keep track
of birds which live near his seaside home. Each time he sees a bird,
he makes note of some feature which sets it apart from all the birds
which he has ever seen. When a gull with a cracked beak lands on his
porch, he can find no feature that sets it apart from a certain gull
with a cracked beak three weeks ago and described in his notes as
having a cracked beak. So he decides it is the same gull. Is this
belief wellfounded? Probably not: there is no reason to suppose that
any feature will be found on just one bird.

This is just one of a great number of situations in which some model (features) has led someone inadvertently to identify two objects being modelled (birds), when they might in fact be distinct. This is a pervasive problem in mathematical modelling, something to be guarded against.


The authors go on to discuss the Liar paradox from this perspective: say this same Jones also models the Liar sentences, and discovers an identity that leads to the paradox. But the fact that [these models of the Liar sentences] are identical is evidence of an inadequacy of his modelling scheme. [This identity] shouldn't be regarded as a discovery about the meaning of [the Liar sentences].

The crucial difference between wellfounded and nonwellfounded sets (or, less pejoratively, hypersets) is:

p127.
if we are working with hypersets, then
every binary relation is isomorphic to the membership relation on some
set.

That is, any binary relationship graph is isomorphic to a (hyper)set
membership graph. With classical wellfounded sets, the binary
relations have to be acyclic (sets cannot contain themselves) and have
"bottom" elements (the set membership relation has to have
initial "atoms" that are not themselves sets). With
nonwellfounded sets, neither has to be true. The binary relation *R*
can have cycles (*a R a*) and can have chains with no "beginning"
(think of the successor relation over the integers, rather than just
the natural numbers). So they can define the hyperset of (infinite)
binary trees without the need to define any *leaves*, for
example.
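That isomorphism can be prodded in code. A sketch of my own (in Python; the graphs are made up): picture each hyperset as a membership graph, and decide equality by computing bisimilarity as a greatest fixpoint, starting from "everything matches" and refining downwards.

```python
# A toy bisimilarity check (my illustration, not the book's
# formalism). Under AFA, two membership graphs picture the same
# hyperset exactly when their root nodes are bisimilar.
def bisimilar(g1, a, g2, b):
    """Decide whether node a of membership graph g1 and node b of
    membership graph g2 picture the same hyperset. Each graph maps
    a node to the set of its members (its children)."""
    # Greatest fixpoint: assume every pair matches, then repeatedly
    # discard pairs whose children cannot be matched up.
    pairs = {(x, y) for x in g1 for y in g2}
    changed = True
    while changed:
        changed = False
        for x, y in list(pairs):
            matches = (
                all(any((xc, yc) in pairs for yc in g2[y]) for xc in g1[x])
                and all(any((xc, yc) in pairs for xc in g1[x]) for yc in g2[y])
            )
            if not matches:
                pairs.discard((x, y))
                changed = True
    return (a, b) in pairs

omega = {"O": {"O"}}                   # Omega = {Omega}
two_cycle = {"A": {"B"}, "B": {"A"}}   # a = {b}, b = {a}
# Under AFA both graphs picture the very same hyperset, Omega.
print(bisimilar(omega, "O", two_cycle, "A"))   # True
```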

A remark towards the end throws an interesting light on mathematicians and logicians:

p324.
... from a mathematical point of view,
much of [this book] could have been written at the beginning of the
[20th] century. .... So why are hypersets only coming into their own
now, after a hundred years of work in set theory?

... The main reason for the change in climate was that Aczel's theory, unlike the earlier work, was inspired by the need to model real phenomena. Aczel's own work grew out of his work trying to provide a set-theoretic model of Milner's calculus of communicating systems (CCS). This required him to come up with a richer conception of set.


Who would have thought that mathematicians were worried about
modelling real phenomena, as opposed to building elegant abstract
theories? Indeed, much of this book is written in a dense, pure
mathematical style: "Definition: Γ
is *smooth* iff Γ
is monotone, proper, and if almost all sets of urelements are very new
for Γ",
and so on. All the coalgebras and other mathematics are beyond my
level of mathematical knowledge, so I had to skim those parts, trying
to extract the message, if not the details: always dangerous with
mathematics! (And all the talk of smooth operators reminded me
irresistibly of that dastardly character
Curly Pi.) A few pages later,
they admit to not having written the book *I* wanted (not *their*
fault, of course!):

p328.
[corecursion theory of HF^{1}]
should relate to definability questions over [HF^{1}] as well
as to real programming issues, since we can certainly write circular
programs. We have not investigated this topic at all but consider it
extremely natural. We also did not take up issues like the
implementation of corecursion, or other ways in which the method would
be used in practice.

I wanted more on applications, how to use those hypersets to do
something interesting computationally. But the reference to Milner's
work, and also to the importance of the technique of bisimulation for
deciding if two hypersets are the same, at least points me in the
right direction *next*: I should look more at CCS.

So, I got a lot out of this book. But nowhere *near* as much
as it has in it! If I had, I would have certainly given it a
significantly higher rating.

Information is a central topic in computer science, cognitive science, and philosophy.
In spite of its importance in the “information age,”
there is no consensus on what information is,
what makes it possible, and what it means for one medium to carry information about another.
Drawing on ideas from mathematics, computer science, and philosophy,
this book addresses the structure of information and its place in society.

The authors, observing that information flow is possible only within a connected distribution system, provide a mathematically rigorous, philosophically sound foundation for a science of information. They illustrate their theory by applying it to a wide range of phenomena, from file transfer to DNA, from quantum mechanics to speech act theory. An important interdisciplinary text, this book is ideal for graduate students and researchers in mathematics, computer science, philosophy, linguistics, logic, and cognitive science.