Books : reviews

Keith Devlin.
Logic and Information.
CUP. 1991

rating : 3.5 : worth reading
review : 10 July 2011

Why do we need a book about logic and information? Don’t we have a theory of information – Shannon’s? Well, yes, but a theory is good for what it’s designed for, and need not be good for other areas. Shannon’s theory is great if you are designing transmission codes for lossy communication channels. It’s not so good if you want to understand how people communicate with each other. But doesn’t good old-fashioned propositional logic cover that? Well, no. Consider the example sentence Devlin uses. Nancy says to John: “It’s cold in here”, with the intent to get John to close the window. Propositional logic doesn’t hack it. Devlin’s book uses situation theory, in order to formalise properties of information as it occurs in everyday discourse.

p217. … I regard the situatedness of language use as of fundamental importance. …. In natural language use, all sorts of contextual factors are relevant, and it is possible to utter the same sentence in different circumstances to convey quite different information.

He starts off with some discussion of the difference between perception of information about the world, and cognition related to that perception. This cognition doesn’t have to be high level intelligence: even a thermostat can do it to some degree:

p19. Notice that the process of cognition (i.e. at the very least an analog-digital conversion) involves a loss (often a huge loss) of information. But this is compensated for by a very definite gain, in that there occurs a classification of the perceived information. A thermometer simply registers the temperature (analog coding of information) and hence is a perceiving agent but not a cognitive agent (in even the most minimal sense suggested above); the thermostat classifies the temperature into two classes (‘warm’ and ‘cold’), and thus exhibits a form of cognition.
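Devlin's thermostat example can be sketched as a one-line classifier. This is my own toy illustration, not anything from the book; the setpoint is an arbitrary assumption:

```python
def thermostat(temperature_c, setpoint_c=20.0):
    """Collapse an analog reading (perception) into one of two classes
    (a minimal 'cognitive' act), discarding almost all the information
    in the original reading."""
    return "warm" if temperature_c >= setpoint_c else "cold"

print(thermostat(25.0))  # warm
print(thermostat(15.0))  # cold
```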

He starts building his theory from some “givens”: a preexisting ontology, and that the agents (that are transmitting and receiving information) are capable of “individuating” things from this ontology when they occur in the world.

p112. Our theory is built upon an initial collection of individuals, locations, properties, relations, and situations. Since the exact collection of such basics will depend upon the agent, or possibly the species of agent, our theory remains largely agnostic as to what is and what is not taken as an individual, location, property, relation, or situation. Rather we simply assume at the outset, the existence of a scheme of individuation that provides us with this ontology.

I wonder if this is somewhat begging the question. If the theory is built on the given agent ontology, how can agents with different ontologies communicate? I am reminded of the William James quote “Other minds, other worlds from the same monotonous and inexpressive chaos!”. Still, maybe starting with a theory of matched ontologies helps, and we can wait for communication with alien ontologies until version 2.

This ontology includes situations (which are something akin to the context within which the information exists). These situations can be individuated by the agents and manipulated in the theory just like other things in the ontology.

p31. within our theory situations are regarded as first-class members of the ontology, alongside the individuals, relations, locations, … and it is really this move that distinguishes situation theory from other theories that take account of context or other environmental effects. In particular, situations are allowed to fill (appropriate) argument roles of relations

Situations are a new kind of thing, different from the other kinds of things in the ontology. We might feel we understand individuals, locations, properties, relations, but just what are situations? Well, what they are is defined by how they behave. They are defined by the theory. Mathematically, this

p73. … requires a definite commitment to dealing with entities that can only be understood in terms of the system that thereby results.

p70. Situations are abstract objects in their own right, distinct from all the other entities in the ontology … and some questions that are meaningful when asked of, say, individuals, locations, or sets, are simply not appropriate for situations, … In particular, it is in general not appropriate to ask for an ‘extensional definition’ of a situation

Although he puts it in “scare quotes”, Devlin assumes that the reader knows what ‘extensional definition’ means. It means defining something by explicitly listing its instances; the term is typically used of set definitions, where an extensional definition explicitly lists the members: {Tweety, (1, Rover), 42}. This contrasts with an ‘intensional definition’, given in terms of some membership property, for example, the set of “blue birds” (the members are all things that are blue and are birds) or “dangerous situations”. (Paulos, in his 1998 book Once Upon a Number, notes the dangerous things that can happen if these kinds of definitions get muddled in our minds. He also talks about situational logic.)
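In programming terms, the contrast is between enumerating a set's members and giving a membership rule. A small sketch of my own, reusing the examples above (the lists of things, birds, and blue objects are made up for illustration):

```python
# Extensional definition: the members are listed explicitly.
extensional = {"Tweety", (1, "Rover"), 42}

# Intensional definition: a membership property; the members are
# whatever happens to satisfy it.
things = ["Tweety", "bluebird", "blue whale", "robin", "sapphire"]
birds = {"Tweety", "bluebird", "robin"}
blue = {"bluebird", "blue whale", "sapphire"}

# "blue birds": all things that are blue and are birds
blue_birds = {x for x in things if x in birds and x in blue}
print(blue_birds)  # {'bluebird'}
```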

Since situations can’t in general be defined extensionally, it means there is always potentially something more to the situation: it is open. So agents necessarily have only partial (usually very partial) information about a situation.

p71. we must accept that these new abstract objects might not behave as do the ones more familiar to us. The agent will in general only have partial information about a given situation, and hence can only access or talk about that ‘situation’ by some intensional, roundabout means

I think the reason the term “roundabout means” is used here is that the intensional definition of the situation seems to be something like “all information that is about the situation”.

Devlin also has something to say about modelling, about being careful to distinguish properties of the model from those that are merely of the modelling techniques.

p75. any model invariably brings with it features not only of the thing being modeled, but also the structure in which the modeling is done, and it is often quite difficult to decide if a particular aspect of the model says something about the thing being modeled or the theory providing the model. A particular example of this within situation theory itself is provided by early attempts of Barwise and Perry in [5] to model situations in classical set theory. The well-foundedness axiom of set theory, in particular, led to a number of difficulties in trying to understand situations, that turned out to be not issues concerning situations at all, but rather were a feature of that particular means of modeling.

The [5] in the quotation refers to the 1983 book Situations and Attitudes. Barwise and Moss make similar points about modelling (in particular, just because you can model something as a set doesn’t mean that it is a set) in their 1996 book Vicious Circles, which is all about non-well-founded set theory.

Devlin then sets up the machinery for defining information in situations. He has some unfortunate examples, though. Take:

p113. Suppose I utter the sentence
Jon sold the house.
This seems perfectly informational. Hearing me, and, let us suppose, being aware who Jon is, you learn something about this individual. … the fact that Jon had a house for which he had no further need, the fact that someone else wanted to buy this house, and so on, ….

It doesn’t seem perfectly informational to me, however. Reading this, I immediately ask “Is Jon the owner, or is he selling it as an estate agent [realtor]?” Indeed, is Jon the legitimate owner, or has he sold it fraudulently? (Compare “Jon sold the Brooklyn Bridge”.) These ambiguities can probably be resolved by the situational context. But here the sentence is used as a purely propositional example. [Or maybe the problem is just me reading everything with “Science Fiction reading protocols”.]

The theory uses relations between things (such as the relation “sells”, with slots for seller, purchaser, sold item, price, location, and time). One thing I found rather puzzling was the insistence on a fixed number of slots in a given relation. This seems overly restrictive, given the “open” nature of the situation within which the sale occurs. What if it becomes important to know something else about the sale? What is the “location” of the sale if it occurred on eBay? What is the “time” if the seller and buyer are in different time zones? etc, etc. However, since this fixed number of slots does not seem to become crucial in anything that follows, let’s move on.
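The contrast between a fixed-arity relation and an “open” one is easy to see in code. A sketch of my own (the slot names follow the “sells” example; the sale data is invented):

```python
from collections import namedtuple

# A fixed-arity relation, as in the theory: exactly these slots, no more.
Sells = namedtuple("Sells",
                   ["seller", "purchaser", "item", "price", "location", "time"])

sale = Sells("Jon", "Mary", "house", 250_000, "York", "2011-07-10")

# An "open" alternative: a dict of slots, extensible as the situation demands.
open_sale = dict(sale._asdict())
open_sale["platform"] = "eBay"  # a slot the fixed relation cannot accommodate
```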

An important part of the theory is that the agent need not be a powerful reasoner. There can be a logic that the theorist, peering at the agent and its discourse, uses to describe the system. And there can be a lesser theory used to describe the agent’s view of the system. And, harking back to the model not being the reality, there is also the process that the agent is actually using (as opposed to the theorist’s model of the agent’s logic).

p128. there is no reason to suppose that the theorist’s account of what justifies an inference made by an agent, that is to say, the agent logic, should at all resemble the manner in which the agent actually performs that inference. That is to say, there is, in general, no reason for supposing that the agent itself uses the agent logic. Rather, what we are calling the agent logic is just that logic we, as theorists, attribute to that agent in order to provide an account (from the outside) of its information processing mechanisms. What is required is that the internal ‘logic’ employed by the agent, and the agent logic as stipulated by the theorist, both be consistent with the way the world is.

This emphasis on the modelling approach potentially leaking through to the model of the system made me wonder about:

p139. Another important distinction is that, for a type T, the truth or falsity of the proposition
a:T
for a given object a, is determinate; that is to say, it is a fact of the world that either a is of type T or a is not of type T.

Consider the type T being “tall people”, with the corresponding defining property of “being tall”. Where is this sharp boundary between “being tall” and “not being tall”? This is a classic example of “fuzzy logic”: there are actually no crisp boundaries between some of our classifications. Our ontology is more subtle than the one considered here.
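The fuzzy-logic point is that membership comes in degrees. A minimal sketch of a graded membership function, with thresholds that are purely illustrative assumptions of mine:

```python
def tall(height_cm):
    """Graded membership in 'tall': 0 below 160 cm, 1 above 200 cm,
    linear in between. The thresholds are illustrative assumptions."""
    return min(1.0, max(0.0, (height_cm - 160) / 40))

print(tall(150))  # 0.0 : definitely not tall
print(tall(180))  # 0.5 : borderline -- no crisp boundary
print(tall(200))  # 1.0 : definitely tall
```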

Having set up the infrastructure to capture “information” in a situated manner, Devlin moves on to an important feature of communicating agents: the information in their mental states. Some mental states are intentional: states that are “about” something in the world. These include beliefs (mental states about what you think the state of the world is), desires (mental states about how you would like the state of the world to be), and intents (mental states about what you are going to do to the state of the world). That these are different is shown by what happens if you discover your mental state and the state of the world do not correspond: if it is a belief, then your belief is mistaken and you change the belief to match the world; if it is a desire, you might (form an intent to) change the world to match the desire.

Note the words used here: “intentional” and “intent”, as well as the previous “intensional”. These similar-looking words can be confusing:

p207. (i) intentionality as that property of mental states by which they are directed at or of or about the world …
(ii) intention in the sense of intending to perform some act;
(iii) intensionality, spelt with an ‘s’, … the opposite of extensionality, … property-dependent as opposed to extensionally determined …

As well as these mental intentional states, agents have perceptual experiences. Things start to get interesting when formalising such experiences.

p196. the significant feature of the definition [of visual experience] is that it is self-referential. The visual experience, v, figures in its own external content: it is part of the external content of a visual experience, that it is caused by what is seen. It is this self-referentiality that comprises the immediacy of seeing.

Devlin doesn’t go into much detail here, and I’m not sure if this “immediacy” is supposed to be an elliptical reference to something like qualia. I’ve always found the notion of qualia vacuous at best and incoherent at worst. But this might lead to some kind of coherent definition of qualia: something to do with self-reference.

The structure of an utterance (like “It’s cold in here”) involves many potentially different situations, all of which contribute to its meaning:

pp218-20. First of all there is what I shall call the utterance situation. This is … the situation or context in which the utterance is made and received. …
     In many cases, the utterance is part of an ongoing discourse situation
     The discourse situation is part of a larger, embedding situation that incorporates that part of the world of direct relevance to the discourse …
     Next there is … a resource situation. … Resource situations can be part of the embedding situation, but they need not be. …
[They can be] the objects of some common knowledge about the world or some situation
     Finally, there is the described situation, that part of the world the utterance is about.

This made me wonder about the Amazonian people described in Don’t Sleep, There are Snakes – their language has a restricted grammar, and they seem to make use of very limited (only immediate) resource situations. Is there a connection, I wonder?

Discourse isn’t just about uttering propositions. Utterances can change the state of the world (particularly, the mental state of the listener): they have an impact.

p242. the impact of an utterance is the (relevant) change in the embedding situation that the utterance brings about.

Furthermore, the speaker has an intention to go with their utterance. Speakers don’t just convey propositional information (and indeed, they may lie or otherwise intentionally deceive). They can also speak “indirectly”, by using a proposition to convey a request (saying “It’s cold in here”, with the intent to get the hearer to close the window). The intended communication is the request, not the proposition.

p283. An utterance u is said to be successful if … all the speaker’s intentions … become part of the impact of the utterance.

This covers most of the theory. There is a short section towards the end on the Liar paradox (“this sentence is false”, which in situation theory becomes “the proposition expressed by this utterance is false”). Situation theory provides a resolution to this paradox. It lies in the difference between two negations of “situation s supports information t” / “t is a fact in situation s”: one is “situation s supports information not-t” / “not-t is a fact in situation s” (it is the case that the negation of t is a fact in s); the other is “situation s does not support information t” / “t is not a fact in situation s” (situation s may have nothing to say about t). A consequence of this is that the “world”, or what one might like to think of as the “universal situation”, is not in fact a well-formed situation at all. (This is gone into in a lot more detail in Barwise and Etchemendy’s book The Liar.)
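The distinction between the two negations is essentially that a situation is a partial record of facts. A toy sketch of my own (not Devlin's formalism), modelling a situation as a partial map from infons to truth values:

```python
# A situation as a partial record: some infons are supported (True),
# some are supported-as-negated (False), the rest are simply absent.
s = {("cold", "room"): True, ("raining", "outside"): False}

def supports(sit, infon):
    """First sense: t is a fact in situation sit."""
    return sit.get(infon) is True

def supports_negation(sit, infon):
    """Second sense: not-t is a fact in situation sit."""
    return sit.get(infon) is False

t = ("window", "open")
print(supports(s, t))           # False: s does not support t...
print(supports_negation(s, t))  # False: ...but nor does it support not-t
```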

Finally Devlin winds up the book by discussing whether his work constitutes “mathematics”.

p295. As yet there is no science of information of the kind envisaged in this book. It remains a goal to be achieved at some later date. Likewise the mathematical theory presented here is still in, at best, an embryonic form. Indeed, some might say it is not mathematics at all. It depends, of course, on what you mean by ‘mathematics’.
     If your definition of mathematics is theorem-proving, then indeed what you find between the covers of this book is not mathematics. But that is to take an extremely narrow view of the rich and vibrant field that I see as constituting mathematics. ….
     It is often said that mathematics is the study of patterns. And this is what I have been trying to do in this book: identify and mathematicize the relevant abstract patterns that are important in a scientific study of meaning and information.

I think it just qualifies as mathematics, in that it considers deeply subtle distinctions between concepts, and provides a framework for formalising them. It is very important to think deeply about such issues, and tease apart the relevant concepts. And the philosophical discussions throughout the book on how things should be formalised are worth the entrance price alone. What I would also like is the ability to do calculations. (This is a weaker requirement than the ability to prove theorems: theorems are generic, calculations are specific, but they both need a semantics.) Specifically, if I have certain information, what can I deduce? If I know another agent has certain information, what can I deduce about what they can deduce? This, I think, requires more machinery than Devlin has included so far, possibly including some consideration of the computational resource needed to perform the inference. And just how does the propositional utterance “It’s cold in here” succeed in getting the hearer to recognise that they have been asked to (form the intent to) close the window? I suspect this may require the addition of some formalisation of something like Gricean maxims or Relevance theory.

And, to wrap up my review, I quote the lovely little rant by Devlin against short-sighted research relevance policies, as he had to leave the UK to follow this line of research:

p.viii. Should it ever come about (and I think it will) that some of the ideas developed in these pages turn out to be of real ‘use’, I would hope that this book serves as a testament to the stupidity, even in those very terms of ‘usefulness’ that were foisted on the British university system, of judging any intellectual pursuit in terms of its immediate cash value.

Indeed. And I have certainly found the work of use. I think there are two directions to go now. There is Devlin’s later book, Goodbye, Descartes, which appears to be a more “popular” account of some of the ideas here, and more. And there is Barwise and Seligman’s Information Flow, which appears to be a related push to further situation theory in a more computational setting. Hopefully someday I’ll get around to reading these, too.

Keith Devlin.
The Joy of Sets: fundamentals of contemporary set theory: 2nd edn.
Springer. 1993

Keith Devlin.
All the Math That's Fit to Print: articles from the Manchester Guardian.
Mathematical Association of America. 1994

rating : 4 : passes the time
review : 19 April 1997

This book collects together the succinct articles from Keith Devlin's wide-ranging Micromaths column in The Guardian. It's great for dipping into, but obviously nothing is done in much depth. However, it's good for whetting your appetite.

Topics include

Keith Devlin.
Goodbye, Descartes: the end of logic and the search for a new cosmology of the mind.
John Wiley. 1997

Keith Devlin.
The Maths Gene.
Weidenfeld and Nicholson. 2000

Jonathan M. Borwein, Keith Devlin.
The Computer as Crucible: an introduction to experimental mathematics.
A K Peters. 2009

rating : 4 : passes the time
review : 19 September 2016

For a long time, pencil and paper were considered the only tools needed by a mathematician (some might add the waste basket). As in many other areas, computers play an increasingly important role in mathematics and have vastly expanded and legitimized the role of experimentation in mathematics. How can a mathematician use a computer as a tool? What about as more than just a tool, but as a collaborator?

Keith Devlin and Jonathan Borwein, two well-known mathematicians with expertise in different mathematical specialities but with a common interest in experimentation in mathematics, have joined forces to create this introduction to experimental mathematics. They cover a variety of topics and examples to give the reader a good sense of the current state of play in the rapidly growing new field of experimental mathematics.

The discipline of mathematics has many aspects. There is the process of exploration, “messing around”, spotting patterns, suggesting hypotheses and conjectures. Then there is the process of proof: going from the initial statement to the fully proven theorem, which in non-trivial cases will also involve significant elements of exploration. Finally, there is the theorem itself, a statement of mathematical fact, backed up by a rock-solid argument of the tidied-up proof. Often though, only that final proof, and none of the processes leading up to it, is made public. This can lead to an external view of mathematics as magical results of genius, rather than a process of creative hard graft.

In my discipline of computer science, there is some support for the process of proof, with a computer helping to make sure the steps really are rock solid, and not instead built on sand. Unfortunately, the computer is a super-pedant, even more so than the most nit-picking of mathematicians, and nothing but the most adamantine of rock will do for it. Valid short cuts such as “without loss of generality”, “by symmetry”, and “abusing the notation” are not allowed, making these tools somewhat exasperating to use at times (although they are improving all the time).

But what about that initial process of exploration? Experimental Mathematics augments that stage: using a computer to help “messing around”, taking away (some of) the tedium of exploratory calculations, providing more examples and help with pattern discovery.

This book provides an introduction to such Experimental Mathematics, with several examples, and suggested exercises if you feel inclined to pursue some of the ideas further. Many of the examples are of the form:

  1. Have a mathematical expression such as a series formula, or definite integral, from somewhere (maybe a previous round of exploration)
  2. Use a computer to get a good numerical approximation to the expression, eg, 3.1462643699419723423
  3. Use a program to suggest a possible closed form representation of that number, eg, in this case, √2 + √3
  4. Use that closed form as a conjecture for the actual value of the expression, and try to prove it (which may include further experimental mathematics)

There are several examples of increasing complexity, covering sequences, series, integrals, and more, with discussion of how the results were discovered and proved. There is discussion of the famous formula that allows the calculation of the nth binary digit of π, without needing to calculate all the preceding n–1 digits. I also learned of “spigot” algorithms to calculate numerical values, which produce the digits one by one, cutting down on memory requirements.
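The digit-extraction trick rests on the Bailey–Borwein–Plouffe formula, π = Σₖ 16⁻ᵏ [4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6)]: to get the fractional part of 16ⁿπ you only need each term mod 1, which modular exponentiation gives without ever forming large numbers. A minimal floating-point implementation of my own (accurate only for modest n; the book's treatment is more careful):

```python
def bbp_sum(j, n):
    """Fractional part of sum_k 16^(n-k)/(8k+j), via modular exponentiation
    for the head so no term ever gets large."""
    s = 0.0
    for k in range(n + 1):                 # head: non-negative exponents
        s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
    for k in range(n + 1, n + 20):         # tail: rapidly vanishing terms
        s += 16.0 ** (n - k) / (8 * k + j)
    return s % 1.0

def pi_hex_digit(n):
    """The (n+1)-th hexadecimal digit of pi after the point (n = 0, 1, ...)."""
    x = (4 * bbp_sum(1, n) - 2 * bbp_sum(4, n)
         - bbp_sum(5, n) - bbp_sum(6, n)) % 1.0
    return "0123456789ABCDEF"[int(16 * x)]

print("".join(pi_hex_digit(n) for n in range(8)))  # 243F6A88
```

(In hexadecimal, π = 3.243F6A88…, so each digit can be checked independently of its predecessors.)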

The book definitely gives a flavour of the overall process, with enough pointers to suitable software resources for the reader to take up some of the challenges. However, I was left feeling that the flavour was somewhat weak. Okay, now we can calculate π to a bazillion digits, or find its gazillionth (binary) digit without calculating the earlier ones, and we have proofs of lots of closed form solutions. And yet, I didn’t come away with a feeling of much mathematical depth in these results. At most, in some cases the process suggests links between seemingly unrelated expressions. But I was hoping for more significant results. For example, the book finishes with a short chapter on “discovery by visualisation”, an area where the computer surely reigns supreme, yet surprisingly little is made of this. Maybe I am simply unaware of a vast existing industry in mathematics that is about proving such seemingly-pedestrian but actually important results, which has now become much more automated and hence more productive?