Fractal image compression

Notes for a talk I gave on fractal image compression, in August 1991.


Abstract

Fractal image compression offers enormous compression ratios. I'll talk about how an image is reconstructed from compression data, and how images are encoded, and the errors introduced.

Encoding

Consider an image like this square.

I am going to replace it by three copies of itself, each transformed differently.

Taking the origin as the bottom left corner, the first copy is shrunk to half the size of the original, the second is shrunk and moved up, the third is shrunk and moved up and right.

(You can do this quite easily for yourself using a drawing package. I used MacDraw.)

[square]

The result is these three smaller squares.

Now I do exactly the same thing again, but using this image as the original to be transformed. So I take three copies of this picture, and transform them as before.

In fact, I will keep iterating, and repeat the whole process four more times.

[3square]

The result, after recursively applying the transformations five times, is this triangular shape.

In the limit, we get the famous fractal called the Sierpinski triangle, or Sierpinski gasket.

So, the original square, plus the three simple transformations, are all the data I need to describe this very detailed image. In fact, the compression factor is even better than that.

[gasket]

Unique final picture

The final image is independent of the original image; it is uniquely determined by the transformations alone.

So the Sierpinski triangle is fully described by the three transformations, and nothing more.

I can demonstrate this by taking a very different initial picture; an ellipse rather than a square. I will apply exactly the same transformations to this, five times again.

[ellipse]

And this is the result.

You can probably still see the little ellipses in this diagram.

But you can see that they are tending to the same limit, even though they were very different initially.

[etriangle]

But if I take a different set of transformations , I get a different result.

Here I have taken four copies of the original each time, and translated them differently.

The result is very different from a Sierpinski triangle.

[different]

Affine transformation

Each of these fractal images is defined by a set of affine transformations .

An affine transformation is a linear transformation (a scale change, a rotation, a shear, or a reflection) as described by a 2×2 matrix A , plus a translation as described by a vector b :

w x = A x + b

So a single affine transformation w is defined by just 6 real numbers.

The Sierpinski triangle was defined by three affine transformations, or just 18 real numbers. This is the reason behind the fantastic compression ratios possible with fractal compression.

Families of pictures

If two pictures are related by a linear transform t , then their encodings are also related.

So once you have the encodings for a picture, you automatically get for free the encodings of all the other pictures related to it by linear transformations.

[leaf collage]
{ w i } { t o w i o t ~ }

Rendering algorithm

Given the details of the affine transformations, you can recreate the image as I've described, by using a drawing program to copy areas of the screen, shrink them, rotate them, and move them.

But that isn't a good way for a program to draw them. Instead, you can use the Rendering Algorithm :

  • start with a single point
  • repeat :
    • choose one of the transformations at random
    • transform the point to a new position
    • plot it

After many iterations, you get a picture that begins to look like the Sierpinski triangle. In the limit, this picture converges to the same result, independent of the random sequence of transformations. (So the algorithm is parallelisable: different processors can use different random sequences, and overlay their results.)

[rendered]

Collage Algorithm

So, given transforms, we know how to recreate the image. To compress an image, we first have to find the transforms.

The Collage Algorithm gives a way to do this: completely cover the original image with lots of little transformed copies of itself.

The top left picture shows a leaf shape covered by three little, similar leaves. The top right shows the picture defined by these transformations.

The bottom shows a poor collage, with a final picture that doesn't look much like the leaf.

[leaf collage]
[figure 3.10.1, Barnsley 1988 ]

Collage Theorem

The Collage Algorithm tells us how to find the transforms; the Collage Theorem tells us how close the resulting picture will be to the original.

If the difference d (the Hausdorff measure) between the original and the collage is less than e , then the difference between the original and the generated image is less than e / (1- s ), where s is the size of the largest collage piece (so don't use large pieces!):

d (original, collage) < e => d (original, final) < e /(1- s )

This gives an objective measure of the error introduced by the compression. The error can be reduced by having more pieces in the collage, covering the original better, but reducing the compression ratio. So there is an objectively measurable tradeoff.

Stability and interpolation

The compression encodings do not exhibit the common chaotic property of 'sensitive dependence on initial conditions'. Which is fortunate, because such encodings would be useless if a small change in the transforms, say a rounding error, caused a very different picture. But a small change in the encodings gives only a small change in the resulting picture (this result is implied by the Collage Theorem).

So it is possible to interpolate between two images by interpolating between their transforms. The picture shows interpolation between two cloud images.

[cloud movie]
[plate 9.8.5, Barnsley 1988 ]