This is a very densely written and important book. It has taken me
several months to read, on and off, and I am sure I've missed some of
the points. But I have taken away much of interest. The theme is that
biology is not merely a consequence of physics, but has a
fundamentally important extra property: that of symbolic
relationships. The title comes from the idea of relationships: a
boat remains a boat if one (or indeed all) of its planks are replaced,
even with planks made from a different material. What makes it a boat
is not the objects that make it up, but the relationships
between those objects.
What makes the Delphic boat float is the
nature of the relationship between its planks, not their
physicochemical nature. Whether they are made of oak or pine or
aluminum or steel is irrelevant to their function.
Our priority should be studying the
relationships that make up life, rather than remaining at the level of
the objects themselves; and we should do this by looking into the
nature of whatever it is that gives these relationships their
The "symbolic" part is that in biology, those
relationships can be arbitrary (for example, the
genetic code). They are consistent with the laws of physics, of
course, but in some sense independent of them. The encoding, the
symbols used, could have been different, and so are not deducible
from, or reducible to, physics. Indeed, they have more in common with
ideas of computation, or information processing, and Danchin pursues
the relationship between cellular mechanisms and Turing Machines to a
The book is divided into five hefty chapters. The first two give the
biological background of the various genome sequencing achievements,
probably in more detail than you want unless you are a
genome-sequencer. I would probably have given up sometime before
chapter three (starting on p109), had I not been reading this because
I'd read a shorter, fascinating paper by Danchin, and wanted to delve
deeper. In summary, the first two chapters say that the genome is very
complicated, very detailed, very big, that it has structure, but that
structure is very messy, and that nearly everything you think you know
about it (from kindergarten "Ladybird" biology to
undergraduate studies) is untrue, by being highly and selectively
oversimplified. Then we get on to the interesting stuff. Not that it
gets any easier going, mind you. But stick with it; it's well worth it.
Danchin reiterates that biology cannot be understood and explained
in terms of outdated mechanical physical concepts.
life ... is not a mechanical process,
and that even if we do not deny its deterministic character, what we
can know about it does not enable us to predict its future.
Life is simply the one material process that has discovered that the
only way to deal with an unpredictable future is to be able to produce
the unexpected itself.
Physics is about identical objects, but by the time biology is
reached, distinguishability, identity, relationships become key.
Generally speaking, it is fairly easy to
build up a picture of the physical world, and to explain it in terms
of a combination of simple principles
, because physics is
concerned with reproducible objects that cannot normally be
distinguished from each other as individual entities. ...
Chemistry is more complex than physics,
and begins when atoms combine.
In chemistry, two individual examples of
the same object are usually indistinguishable when they are observed
under similar conditions. However, there is one particular
characteristic, quite rare in physics but almost universal in
chemistry, which clearly illustrates the importance of the
relationships between the parts that make up the object in question.
There are structures that are identical in every respect but their
symmetry, and the link between chemistry and biology was formed after
a distinct bias was observed in the symmetries of chemical products
produced by living things. ...
... It is impossible to distinguish
between two atoms of the same object, in the same state, but it is
possible to distinguish between two individuals of the same species. A
species is a population of individuals, a class of objects each with
its own identity. This is true even for microbes like bacteria: a look
at the way they swim will show that two individual bacteria, which
look the same and are genetically identical, can very often be
distinguished by their behavior. It is also true of cells, the "atoms"
or units of life.
Biology is characterised not just by the (individual) objects
involved, but by the relationships between those objects.
... in biology. No object exists in
isolation---or if such objects do exist, it is less important to know
them, because their isolation means that they have little to
contribute to the phenomenon being studied. It is precisely
relationships between objects that are at the heart of life. So we
know in advance that, among the things we need to discover, there are
relationships that have a particular form, whose implementation
enables vital functions to be expressed, such as the regulation of
gene expression. Of course we do not know exactly what these
relationships are a priori, but we know that they do exist. We
do not know what form they take, but we know that they demand a
certain proximity between objects, whether in terms of space or time
or other forms of mediation.
Once we realise that relationships are key, we can use this to make
progress. We can exploit the structure of the relationships to
investigate new possibilities, via (abstract) neighbourhoods of
Effective as it is, this
hypothetico-deductive method has the drawback of being able to refine
only knowledge that we already have, without giving us a way of
forming hypotheses that are both new and pertinent. How can we find
original ideas, but with an originality that is not alien to what we
are studying? ... How can we advance inductively, how can we explore
upstream, and not downstream as with deduction?
... We will consider only one approach,
because it is particularly effective in the case of genomes: that of
induction by exploring the neighborhood of the objects we want
to consider. The idea behind this approach is that each object exists
in relationship with other objects. ...
Here, Danchin takes the fundamental object to be the gene.
Neighbourhoods are then "similar" genes. Importantly,
similarity can be defined in a diverse range of abstract spaces. This
is where biological intuition and knowledge can pay off: by focussing
investigation on those spaces that are biologically "meaningful".
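As a rough illustration of what a neighbourhood in an abstract space might look like (my own sketch, not anything taken from the book), here is one of the spaces Danchin mentions further on: the way a gene uses the genetic code. Each gene is reduced to a vector of codon frequencies, and its neighbours are the genes whose vectors point in nearly the same direction. The gene names and sequences are hypothetical toy data, and the choice of cosine distance is an assumption made for simplicity.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons, in a fixed order, so every gene maps to a vector of the same shape.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_usage(seq):
    """Return the 64-dimensional codon frequency vector of a coding sequence."""
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS) or 1
    return [counts[c] / total for c in CODONS]


def cosine_distance(u, v):
    """0.0 for identical codon usage, 1.0 when no codon is shared."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def neighbours(name, genes, k=2):
    """Return the k genes whose codon usage is closest to that of `name`."""
    target = codon_usage(genes[name])
    ranked = sorted(
        (cosine_distance(target, codon_usage(seq)), other)
        for other, seq in genes.items()
        if other != name
    )
    return [other for _, other in ranked[:k]]


# Hypothetical toy genes. geneA and geneC encode the same peptide
# (Ala-Ala-Lys-Glu-Ala-Lys) but with different synonymous codons.
GENES = {
    "geneA": "GCTGCTAAAGAAGCTAAA",
    "geneB": "GCTAAAGCTGAAGCTGCT",
    "geneC": "GCGGCGAAGGAGGCGAAG",
    "geneD": "GCGAAGGCGGAGGCGGCG",
}

print(neighbours("geneA", GENES, k=1))
print(neighbours("geneC", GENES, k=1))
```

In this toy data geneA and geneC are identical at the protein level, yet they land in different neighbourhoods, because the space being explored is code usage rather than product similarity. Swapping in a different `codon_usage`-style function (chromosomal position, co-citation counts, and so on) would give a different neighbourhood of the same gene.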
Inductive exploration consists in
finding all the neighbors of each given gene, as a starting
"Neighbor" is to be understood here in the broadest
possible sense. It is not only a geometrical or structural notion.
Each neighborhood will have its own particular light to throw on the
gene of interest, and will provide clues for researching its function.
... One natural kind of neighborhood is proximity on the chromosome.
The evolution of species proceeds by
variation on ancestral themes. Consequently, many genes are descended
from common ancestors, and just as children look like their parents,
so genes, or more often their products, have points of resemblance.
This is a rewarding kind of neighborhood to consider. ...
There are many other ways of finding
neighborhood. In particular, a gene may have been studied by
researchers in laboratories all over the world. For one reason or
another, the gene may have properties that have made these researchers
associate them with other genes, so it is worth looking for a gene's
neighbors in the sense that it is mentioned in their company in the
scientific literature. ...
A gene's similarity with others can also
come from similar physicochemical characteristics of their products
... Similarities can be local rather than global, .... Similarity
might also be a matter of the absence, rather than the presence, of
certain motifs ... Giving free rein to the imagination can help us
discover other kinds of neighborhood ... Neighborhood can be
structural, if the products of different genes share the same cell
compartment ... But there are also kinds of functional neighborhood.
As the molecules involved in metabolism undergo interconversions,
there are enzymes that are neighbors because they use the same
substrate, produce the same product, or follow one another in a
Finally there are more complex kinds of
neighborhood, and studying these can bring particularly rewarding
results. To take up the example ... of bias in the use of the genetic
code, we find, for instance, that two genes can be neighbors because
they use the code in the same way. It is interesting to study all the
genes surrounding a given gene, in the cloud of points that describes
the use of the genetic code in all the genes in that organism. When
this is done, we begin to discover some very unexpected properties of
This emphasis on relationships requires experimental setups where
they can be investigated: setups where only the objects can be
investigated are inadequate. In particular, spatial and structural
relationships need to be considered. This has consequences:
the genome text and its meaning are
closely connected with an architecture, which is real even if it is
minuscule. One consequence of the domination of biology by
biochemistry, which favors the study of objects in isolation, has been
to encourage an image of the cell as a miniature test tube. In this
view, the concentration of molecules is seen as uniform, and the
standard thermodynamic approach is normally used to measure the course
of biochemical reactions, as if that were what happened in the cell.
But this is very misleading.
The existence of spatial cellular architecture has consequences for
the structure of the genome:
there is a map of the cell in the
chromosome. Genes are not randomly distributed in the genome text;
their position relates to their mode of expression, depending on the
nature of the environment, and to the location of their products in
the different cell compartments.
The existence of temporal cellular architecture also has important
consequences:
Up to now I have spoken only of the
spatial organization of the cell, and of its very probable strong
connection with the spatial organization of the genome. But of course
we must add the time dimension to this. ... It takes a certain amount
of time to transcribe or to translate a gene. ... Clearly, adding a
section to be transcribed introduces a timing element, which can have
an important effect on the cell's dynamics, simply because of its
length, without the corresponding nucleotides' necessarily having
any particular meaning. ... comparison of related genomes should
reveal regions where the length is preserved, although the
sequence is not.
If this book were only about the importance of relationships and
functions, it would be interesting. But there is an additional crucial
component. Not only should we think in terms of functions, but there
is a level of indirection, a symbolic nature, to the
way the material objects represent the functionality.
life's exploration of reality has been
based on symbolic transposition. Unlike physical or chemical objects,
biological objects are more than just a site where actions occur; they
represent functions. Very often they no longer correspond to
them directly. ... The nucleic acids and the proteins, which
are the very foundation of the objects, relationships, and processes
that make up life, are made from completely different chemicals from
each other, and the DNA of a gene that codes for the synthesis of an
enzyme has absolutely no biochemical connection with that enzyme's
function or shape.
I would like to reemphasize the
arbitrary character of the association between a function and the
control of its expression. This is a first level of an aspect we
normally call "symbolic," when we are talking about human
communication. This arbitrary, symbolic character allows the cell to
manipulate associations situated at a high hierarchical level, between
apparently unrelated functions. Life has made systematic use of this
remarkable phenomenon. This is what makes it possible to introduce
relationships between physical parameters, as well as chemical ones,
into gene expression. ... This symbolic aspect is typical of the most
important biological functions.
This model evokes a way of
representing the world that is profoundly different from the way we
usually account for the physical world. It adds abstract symbolic
relationships to the objects of chemistry and physics. The difficulty
of understanding this symbolic aspect explains why biology in general
and what we call "molecular" biology in particular are the
subject of so much misinterpretation and misunderstanding.
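The arbitrariness of the coding relationship can be made concrete by treating the genetic code as nothing more than a lookup table. The sketch below is my own illustration, not the book's: it builds the standard table, then a hypothetical alternative that uses exactly the same codons and the same amino acids but assigns them differently (the rotation by 16 is an arbitrary choice). The translation function neither knows nor cares which table it is given.

```python
from itertools import product

# The standard genetic code, written compactly: codons enumerated with the
# bases in the order T, C, A, G, and one amino-acid letter per codon
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD_CODE = dict(zip(CODONS, AMINO_ACIDS))

# A hypothetical alternative code: same symbols, different assignment.
# Nothing here is meant to describe any real organism.
SHIFT = 16
ALTERNATIVE_CODE = dict(zip(CODONS, AMINO_ACIDS[SHIFT:] + AMINO_ACIDS[:SHIFT]))


def translate(dna, code):
    """Read `dna` three bases at a time and look each codon up in `code`."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(code[dna[i:i + 3]] for i in range(0, usable, 3))


message = "ATGGCCTAA"
print(translate(message, STANDARD_CODE))     # MA*
print(translate(message, ALTERNATIVE_CODE))  # a different peptide, same DNA
```

The same stretch of DNA "means" different things under the two tables, and nothing in the chemistry of the bases picks one table over the other; in the cell the table is embodied in machinery that is itself encoded in the genome.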
It is this symbolic abstraction that makes life not deducible
from physics (although it is consistent with physics): it
is more than mere physics. Danchin has disparaging things to say about the
current enthusiasm for self-organised complexity:
What governs life is
not outside physics---it respects all its laws; but a law such as the
genetic code cannot be simply an automatic consequence of the
laws of physics. This is what I am summarizing when I say that it
cannot be reduced to physics.
many modern thinkers consider
life to be in itself an unavoidable consequence of things. This
creates a very strong tendency to attempt to represent life not just
as a possible and predictable result, but as an inevitable, logically
derived consequence of the laws of physics. This reduction of life to
the physicochemical world has culminated in studies which postulate
more or less elaborate connections between various dynamics of simple
physical systems, and which are summed up by an expression that is as
fashionable as it is vague and inappropriate, self-organization.
By sheer tricks of language and abuse of metaphor, the authors of
these studies seek to "explain" life in terms of the complex
behavior of oscillating chemical reactions, or the spontaneous
appearance of organized structures on different levels. This painfully
reductionist attitude completely fails to recognize what is the basis
of life, symbolic abstraction. The objects that make
biological functions happen often have no mechanical relationship with
them; they are only their mediator, their symbol.
He really doesn't like complexity theory, and the physical
kind of dynamical systems resulting in catastrophes, bifurcations and
oscillations. It's not enough:
We can only be astonished that,
confronted by the marvelous variety and sheer gratuity of
insect forms, scientists have not more often been inspired to explore
the mode and timing of their production by starting from reality
itself, rather than by hiding it under a veneer of simplistic,
Clearly reality lies between "just complex physics" and "arbitrary
symbolic representation". The chosen symbols cannot be totally
arbitrary: they need to (and do) obey laws of physics. But
self-organised complexity shows us that those laws are potentially
richer (and more structured) than realised. Physics permits,
constrains, determines certain classes of symbols, but does
not constrain the actual ones chosen. Even this constrained space is
vast, and the realised actuality is just a small, arbitrary subset of
So if physics doesn't determine the symbols used, what chooses
them? It's that novel law that appears at the level of biology:
The complementarity that exists between
the material world produced by physics and the symbolic world produced
by natural selection can be explained by the logic underlying the
self-reference or recursiveness produced by the genetic code. The laws
of physics and natural selection operate as complementary constraints:
the laws of physics describe the unchanging part of phenomena, those
properties that living organisms cannot in principle dominate or
control. The theory of evolution through natural selection seeks to
explain the way in which living organisms do, however, progressively
improve their control over those laws.
This evolved symbolic mediation allows a relationship between one
kind of regime and a completely different one:
The role of the coding process is to
make the transfer from a chemical world in which, broadly speaking,
the objects (in this case segments of DNA) can be regarded as
exploring only one dimension of space, to a world in which other
objects, proteins, explore it in three dimensions, or even four if we
include time, because proteins can change their shape.
How does this "accidental" relationship arise?
the idea of a cause-effect relationship
between the structure of biological objects and their function is so
well established that it motivates the work of hundreds of thousands
of scientists around the world. But I have tried to show that the
causal relationship between the architecture of biological objects and
their function is often arbitrary and accidental. ...
adaptation occurs a posteriori,
and not a priori, because there is no final cause. The living
being that survives a critical situation did not know in advance what
would save it, but, having found the solution, and because it has
survived, it passes that solution on to its descendants, thus
preserving and multiplying it. Such a solution is always some way of
establishing a link or relationship between processes, events, or
objects. The link is part of a structure, but it was the function that
revealed it, so it is the function that ensures that the structure is
retained. Function does not create structure, but discovers it when it
So the laws of physics are important in biology, but not always in
the same way they are in the domain of physics itself. For example,
that old bugaboo, the second law of thermodynamics and the increase
of entropy, looks different through the lens of biology, where a
statistically "representative" ensemble is neither
realisable (because of the size of the state space) nor explored
(because of the selective effect of evolution).
an increase in entropy, in accordance
with the Second Law of Thermodynamics, simply means that objects will
spontaneously explore all the environment accessible to them
... In this context, irreversibility
is simply the expression
of the fact that the total "space" of states and positions
available to the objects considered can only increase, in the absence
of any ad hoc constraint ...
However, we must insist that not
everything is possible, because time is also a crucial
consideration. It is meaningless to consider states that are
theoretically possible but are inaccessible for lack of time. Once the
number of objects considered is over a certain minimum, the number of
possible states is so vast that they cannot all be explored. This is a
fundamental flaw in the statistical model, right from the outset, and
it is important to bear this in mind when considering what happens in
real cases, but unfortunately this is almost never done. Entropy is
therefore nothing but a measure of the extent to which everything that
can be occupied is actually occupied, and an increase in entropy only
accounts for the exploration of all this new space (perhaps we should
say its creation, to mark the fact that it represents a
transition from a virtual state to a real state, since the nature of
the initial space was different from the nature it acquires when
explored, because of the possibility of new interactions).
Having made the point that biology adds a layer of symbolic
functionality and control on top of physics, he draws the analogy with
computation, or information processing, that also has these two
layers. (He is emphasising here that the functionality needs to be
considered in addition to the physical implementation; ironically in
computation the functionality is primary, and the fact of a physical
implementation is often neglected.)
between symbolic processes, which control the interactions between
objects, and the physicochemical nature of the underlying processes.
Provided that the machine can actually exist as a material reality,
its physical nature is not important, so long as it can establish the
necessary relationships between the strings of symbols. This duality
of the symbolic and the physical nature of things is a characteristic
feature of living organisms: they are compatible with physics, but
they cannot be deduced a priori from its laws. ... Physics
represents the inevitable and universal constraints on things, whereas
life will always try to take control.
Danchin makes the point that the physicochemical nature of the
underlying processes can be separated from the symbolic processes both
in biology and in computation. Laughlin
points out that when (emergent) properties are insensitive to the
substrate (as in this case) you can't draw conclusions about the
substrate from them. So we shouldn't expect to be able to draw
conclusions about the biochemical substrate from observing the
biological processes. Which is good -- it explicitly admits the
possibility of life based on other substrates.
So this control layer is (somewhat) independent of the underlying
laws of physics. We design this control into our computers;
life evolves this control:
It is precisely because the cell
functions using just local, basic operations (of the type
connect/disconnect, or presence/absence) that life is possible without
there being any external causality. ... It is the result of
the succession of a very large number of simple events, which became
organized essentially because this worked. The only systems
(organisms) that have survived are those which were able to bring
together relationships that were locally extremely simple and
probable, and to combine them in the structured way we know today.
Selection by existence (which is merely a principle of
stability) is an infinitely powerful way of discovering precisely what
is stable enough over time to be able to survive in a given
environment. One property of the stability principle is systematic
evolution toward ever-increasing control over the unavoidable physics
of the world. And the object of biology is to discover the principles
of this evolution toward increasing stability.
One mistake people often make when pursuing the computational
analogy is to assume that the program is all there is -- an approach
that developmental biologist Jack Cohen
for one strongly decries. Danchin does not make this mistake, but
explicitly brings in the role of the environment, providing data, and
providing the context where the symbols gain their meaning:
there is no one-to-one correspondence
between a gene and its expression. In particular, a gene may or may
not be expressed, depending on the cell's environment. This is obvious
in multicellular organisms such as mammals---a skin cell does not
express the same proteins as a brain cell, and when it divides it
produces more skin cells, not neurons, despite the fact that both of
them must have the same DNA content, and therefore the same genetic
program. This same program can thus produce different outcomes,
demonstrating that the external environment is an intrinsic part of
the way the program is expressed, because it contains the data
that determine the outcome. A cell can be defined as a machine that
puts the genetic program into operation according to the data
provided by its environment.
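A minimal sketch of this "same program, different data" idea, in my own words rather than Danchin's: every cell carries the identical program, and what differs is the environment passed to it. The gene names, the signals and the rule format are all hypothetical, chosen only to echo the skin cell and brain cell example.

```python
# Hypothetical genetic program: each gene carries a rule that decides,
# from the environment alone, whether the gene is expressed.
GENETIC_PROGRAM = {
    "keratin": lambda env: env.get("tissue") == "skin",
    "neurotransmitter_receptor": lambda env: env.get("tissue") == "brain",
    "heat_shock_protein": lambda env: env.get("temperature_c", 37) > 41,
    "ribosomal_protein": lambda env: True,  # expressed everywhere
}


def express(program, environment):
    """Run the same program against a given environment; return expressed genes."""
    return sorted(gene for gene, rule in program.items() if rule(environment))


skin_cell = express(GENETIC_PROGRAM, {"tissue": "skin"})
hot_brain_cell = express(GENETIC_PROGRAM, {"tissue": "brain", "temperature_c": 42})

print(skin_cell)
print(hot_brain_cell)
```

The program never changes between the two calls; only the data does. That is the sense in which the environment is part of how the program is expressed, rather than a backdrop to it.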
The laws specific to biology are able to
exist because of a particular aspect of their role: they do not affect
the nature of physical and chemical objects, but govern the
relationships that exist between certain objects. These objects have a
meaning, which is connected to their function in the
physicochemical processes of life. This gives them an original order
of abstraction, quite distinct from what physics tells us: .... This
space-time plan, this program that links together the material objects
of physics in order to compose a living organism, is an abstraction.
However, it cannot be regarded as arbitrary or as existing in itself,
without the material support of the physicochemical objects of life.
The links in question are not just any links; they have original
properties which we must try to understand. They are the result of a
continuing selection, in the normal course of an evolutionary
process that can be measured by the survival and existence of the
organisms in question.
Danchin takes the computational analogy further than most. For
example, he considers the Kolmogorov (algorithmic) complexity of the
genome, and what it might tell us:
compress the sequence
understand how the sequence has been generated in the course of
evolution. A genome is not a random piece of DNA, but the result
of evolution through duplication, recombination, mutation, and so on,
and all these processes could be described in terms of algorithms.
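To get a feel for this, here is a small experiment of my own (not from the book). Kolmogorov complexity itself is uncomputable, so the sketch uses the size of a zlib-compressed sequence as a crude stand-in: a short compressed form hints that a short generating description exists. It compares a toy "genome" built by duplicating a motif with occasional point mutations against a random sequence of the same length. The motif length, copy count and mutation rate are arbitrary assumptions.

```python
import random
import zlib


def compression_ratio(seq):
    """Compressed size divided by raw size: a rough proxy, not true Kolmogorov complexity."""
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


def evolve_by_duplication(motif, copies, mutation_rate, rng):
    """Concatenate `copies` copies of `motif`, each base redrawn at random with probability `mutation_rate`."""
    pieces = []
    for _ in range(copies):
        copy = [rng.choice("ACGT") if rng.random() < mutation_rate else base
                for base in motif]
        pieces.append("".join(copy))
    return "".join(pieces)


rng = random.Random(0)  # fixed seed so the run is repeatable
motif = "".join(rng.choice("ACGT") for _ in range(50))
duplicated = evolve_by_duplication(motif, copies=80, mutation_rate=0.02, rng=rng)
random_seq = "".join(rng.choice("ACGT") for _ in range(len(duplicated)))

print(f"duplicated: {compression_ratio(duplicated):.2f}")
print(f"random:     {compression_ratio(random_seq):.2f}")
```

The duplicated sequence compresses far better than the random one, because the compressor has in effect rediscovered part of the history (copy, then tweak) that produced it. A general-purpose compressor knows nothing about recombination or about biology, so this is at best a hint of what a genome-aware description might achieve.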
Here again, time is of crucial importance: an algorithm is no good
if it takes too long to execute! This relates to complexity measured
in terms of algorithmic, or logical, depth:
Given a particular sequence, we will
want to look for algorithms that will generate it, but we must always
keep in mind the parallel importance of evaluating the program's run
... we should never speak of what is
potential in the same way we do of what is real. It may be
meaningless to speak of potential, because it may be impossible to
realize that potential explicitly in the time available.
Danchin goes all the way to Gödel and the Halting Problem:
Although it is not possible to go into
detail here, the connection between the halting problem and the finite
character of genome texts, if they are considered as algorithms,
suggests that their formal properties are worth studying in detail, as
a source of mathematical conjectures. By the very fact that they
exist, they prove that it is possible for an algorithm to have a
critical structure, a critical depth, which is related to
their capacity to reproduce themselves in a given environment, while
at the same time producing the machine that runs them.
It is a real pity he doesn't go into further detail -- it was
His dislike of the modern application of physical complexity science
to biology resurfaces, and he instead describes developmental systems
in terms of construction via algorithmic description:
Because many reproducible structures
exist in physics (branching structures, cells, circles, spheres, and
so on), many thinkers looked to certain physical or mathematical
principles to explain the genesis of forms in biology. According to
these ideas, life has simply rediscovered the general principles that
. This horribly reductionist, Platonist attitude
prevailed for a long time. It is still sometimes popular among those
who know nothing of biology, because they fail to understand two vital
things: first, that the functions which construct, or which ensure
control, have an essentially symbolic role; second, that the important
form that is preserved in organisms is not the final shape, but the
form of the algorithm that constructs it.
Life certainly uses the principles of physics
but just as
a basic vocabulary, a set of elementary processes, organized into a
program, not as the main construction principle of life.
Algorithms provide iteration and (spatial and temporal)
combinatorics, which lead to a biological-style complexity of
The processes are all extremely simple
in themselves, but the way they are strung together is complex,
because it is compartmentalized in space and time. Although the
diversity of the control elements is limited, their combinatorial
possibilities are extremely rich.
What preexists is not the organism
itself, but the preformation of a development algorithm.
what heredity passes on is not the form, but its construction program.
The successive expression of control genes, activated or
suppressed one after another, enables morphogenesis to take place
(while respecting and making use of the constraints of physics, of
course, such as the rules of overall symmetry).
Messing about with this developmental program can have macroscopic,
structured effects, such as growing legs where antennae should be.
The organization is so hierarchical that
modifying a single gene, Lim-1, produces animals without a
Even though all organisms have a control level, it can be more
sophisticated in some than in others:
there is a significant difference
between mammals and insects. In mammals, instead of a single linear
arrangement corresponding to the layout of the insect, there are four
linear arrangements, arranged exactly as in the fly, and also
corresponding to the animal's development from the tail to the head.
... This discovery accounts for mammals' greater complexity compared
to insects: the construction algorithm is produced by the combination
of four homologous procedures working simultaneously. It also explains
how the segmented character so visible in insects (mostly at the
larval stage, of course) is much weaker in mammals. We can also
definitely see signs of evolution by duplication of the
genetic program, which suddenly makes new properties appear---the
effects of duplication are not only quantitative, they also create new
relationships de facto.
In programming, it is important to be able to remove old objects as
well as create new ones. The same is true of the developmental
approach: scaffolding is erected, then removed:
During this development, certain cells
are programmed to disappear, leaving room for other cells which are
differently differentiated, and which could not otherwise have
developed. It is thus important to note that development includes a
significant element of absence, as distinct from presence, so
that a "negative" form plays a role in development that is
just as important as that of a positive presence.
This focus on the processes and relationships between objects within
the cell leads to a definition of life here in terms of four features:
The processes that make life are metabolism,
compartmentalization, memory, and manipulation.
Metabolism and compartmentalization are organized by small molecules
(comprising a few tens of atoms, with a carbon skeleton), whereas
memory and manipulation are controlled by nucleic acids and proteins,
so the scale of their basic components is that of macromolecules ...
Two spatial scales are thus interlinked in all living processes, which
operate on a mesoscopic scale, intermediate between our
macroscopic world and the microscopic world of atoms. This is the
scale that is revealed in the geometrical program superimposed on the
genetic program in the genome.
This does not fully carry over into the computational analogy:
Reconciling all these processes has
seemed so difficult that ... at the conceptual level, when comparisons
have been made between life and Turing machines, the general
principles for the construction of a self-replicating machine have
nearly always overlooked the need for compartmentalization and
This definition of life might seem to lead to a clear answer to the
problem of viruses, but it is, of course, never that simple:
This means that organisms such as
viruses, which do not metabolize, cannot be considered to be
straightforward living organisms. They must be studied for what they
are: pure parasites, a memory that perpetuates itself at the expense
of a genuine life, that of the cell they have infected. Of course they
are not similar to the usual non-living matter found on the Earth;
they seem to be artifacts created by life
It might seem that 300-odd pages is a long time to say "life
has evolved symbolic relationships between its objects, and has an
algorithmic development program". But there is much more to it
than that. The thesis is backed up by detailed biological
explanations, juicy physics and computational explorations, and
interesting excursions into the philosophy of science. For example, he
has some important things to say about the practice of (biological)
science. In particular, on the important role of models, theory, and
abstraction in science, when we need to move beyond "stamp
collecting", beyond observed phenomena, he says:
It is difficult to connect the text of
genomes with biological functions. Knowing the text of a gene,
predicting the sequence of the protein it specifies, visualizing its
architecture, does not directly give us its function. The best we can
do is to modify the gene or inactivate it and to study the genetically
modified organism. But then we are faced with the difficult situation
of studying phenomena
What is the best way forward? How should
we interpret what we observe, and avoid taking our wishes for reality?
Unlike in a number of domains of physics, where phenomenology is
already well established and the theoretical, a priori
approach is highly developed, we are not in a position to make a model
of what we want to observe according to the criteria I have outlined.
First we must observe and account for a phenomenon: growth under
certain conditions, use of a particular molecule, sensitivity or
resistance to a particular variation of a physical parameter. Simple
phenomenology, because of its approach in which observation is only
very loosely connected to a well-defined and delimited theoretical
corpus, is on the borderline between science and an unstable form of
thought, often close to a kind of primitive magic. This is not often
recognized, but it explains why a large part of scientific work, even
work that is institutionally recognized, is in fact of very little
value in advancing scientific knowledge. It also explains the
existence of many activities in the field of biology that are close to
ignorance or even fraud.
This is a passionate book. It is a translation from the French, and
in the acknowledgements he thanks his translator for
making the transpositions required by
the move from a Latin culture to an Anglo-American one. On the
whole, this succeeds, but I feel there is still a French style peeking
through in places, particularly in the philosophical stance. This is a
good thing; it would have been sad to lose this flavour in the
Danchin ends on a slightly depressed note, with references to 9/11,
and the smallpox virus, but I think there is optimism in the
what we create cannot be reduced to what
Recommended -- but expect to take some time over it.