Chosen in Feb/March 08; taken May-Sep/08 (full-time MSc), Oct/08-Mar/June
09 (undergraduate 3 year /
MEng, MMath).
My email from inside the department is “jac”.
I can be found on the upper floor of the main CS building in office CS201B.
John Clark will consider student-defined projects in computer security, optimisation (genetic algorithms, simulated annealing etc), safety, quantum computing, formal methods and software testing.
If you wish to propose an CS/IT related project concerned with science (e.g. related to protein folding, physics simulation, genetics, mathematical demonstrations, or whatever) I will be happy to consider it; ***you*** should look to provide the science for your particular interests and I will advise on the computer science. I have some knowledge of science generally and of mathematics in particular. This is not intended as being aggressively computer science; it may appeal to MSc IT students who wish to suggest topics of specific interest to them. I also have some interest in foreign languages but again, I agree only to supply the CS advice.
I am also for my sins a computer security specialist.
(Note: other members of staff may have specific interests in these fields too). I may accept other suggestions if they are peculiarly interesting. I have a penchant for things that are a bit wacky.
JAC/02- Solving by Drag and Drop [CS/M,MMath,MEng4]
Some problems are easy to solve and some are hard. Some problems look like others. This project will develop a software framework to investigate a very particular approach to problem solving.
The basic problem solving idea is to start with a simple problem with an easily found solution (P0,S0). Now suppose what you really want to solve is Phard, with (unknown and hard to find) solution Shard. If a problem is similar to another one it may well have a similar solution. Can we find a series of problem mutations (P0,P2, ...,Pn) such that at each point Pk we can easily dervive Sk from Sk-1. Assume Pn here is Phard, etc. If so then we can progressively mutate the problem into the desired one and 'drag' the solution with the problem changes (by successively reclaculating the delta each time. For example, to solve a quintic polynomial equations, i.e. ax^5+bx^{4}+...+ex+f=0, you could start with a known factorisation of some polynomial (x-z1)(x-z2)...(x-z5)=0 and then progressively deform this polynomial until it becomes the desired one, the roots of the progressively deformed polynomials calculated by local search starting at the current roots. (There may be some additional restrictions placed on the starting polynomial, such as equal number of roots etc.). This example is offered for general motivation only. What interesting problems can be solved in this way? What mathematical problems can be solved in this way? Interesting differential equations? Interesting topological or geometric problems?(Can we deform an existing 6-triangles solution to the Venn diagrams problem into an 6-equilateral triangles solution? See [3] below).
The idea is pretty much novel I think. A comparative study could be carried out for identified problems with attempts to search for solutions to hard problems directly, e.g. via evolutionary search or simulated annealing (see [1][2]). The framework developed should facilitate the incorporation of such additional search engines.
[1] "Genetic Algorithms in Machine Learning and Optimisation ",
David E Goldberg. 1989.
[2] "Modern Heuristic Search Methods". Eds. Rayward-Smith, Osman,
Reeves, Smith. Wiley.
[3] http://www.theory.cs.uvic.ca/~cos/venn/VennOpenEJC.html
JAC/03
The Ford Classification Challenge
[MEng, MMath, MSc NC]
This project is concerned with the investigation of techniques to classify data sequences. Essentially We are given training sequences of signal data when a problem is present in a system and when it is not. The aim, given the training examples, is to produce a classifier that will correctly classify additional samples (i.e not in the training set). This problem has been set as a competition challenge to the research community as part of the 2008 Conference on Evolutionary Computation. See http://home.comcast.net/~nn_classification/ for details.
The project will seek to use genetic programming (GP), and possibly grammatical evolution (GE), to evolve classifiers. Getting to grips with GP is unlikely to be hard. There are some very good toolsets that are well documented (e.g. ECJ). [You can be solving simple problems in a hour or so by following the tutorials.] There is considerable scope to develop a language suited for this purpose.
The project will investigate how well GP (or GE) can be used to evolve classifiers for the above problem. By the time the project is taken the research community will have submitted its proposals to the competition mentioned above. There will be an “honours list” available giving the best results achieved. Getting a good mark on this project does not mean you have to beat the best of the research world! However, if you can you are free to draw attention to this fact in your project write up. This is a bit of fun.
Pre-requisites. Strictly none. But you will be expected to interface with Java based optimisation toolkit (ECJ) and so some familiarity with Java would be advantageous. Similarly having some knowledge of genetic programming would be beneficial (but is not essential). Some knowledge of digital signal processing might be advantageous but is not essential.
[1]
Genetic Programming : An Introduction : On the Automatic Evolution of Computer
Programs and Its Applications (The Morgan Kaufmann Series in Artificial
Intelligence) (Hardcover)
[2]
http://www.grammatical-evolution.org/ This has lots of academic papers on
grammatical evolution.
[3]
The Ford Classification
Challenge CEC 2008 Competition
[4]
ECJ Home page
JAC/04
Quantum Algorithms Synthesis Workbench [MEng, MMath, MSc NC]
This
project is concerned with the automatic synthesis of quantum circuits and
algorithms. The project will involve:
·
reviewing
literature in automated quantum circuit and algorithm synthesis ([1] is a good
place to start!);
·
developing
an optimisation based framework for evolving quantum circuits and algorithms.
You will adhere to reasonable principles of good software engineering
construction. (This is not to be ‘hacked together’.) The work will also make use of existing genetic programming
software (ECJ toolkit).
·
Experimenting
with the developed framework to develop implementations given a variety of
specifications.
Pre-requisites. It does not make sense
to take this project unless you have done or will be doing the QIP module (or
have equivalent experience.)
Comments. This is both a
development project (I want to be able to use the framework after you have
finished!) and an experimental evaluation of the utility of the developed
framework.
References.
1.
Susan
Stepney, John A. Clark. Searching
for Quantum Programs and Quantum Protocols: a review.
Journal of Computational and Theoretical Nanoscience (accepted).
2 Paul Stephen Massey. PhD
Thesis. Searching
for Quantum Software.
JAC/05
Automated Reputation Synthesis [CS,MEng, NC, SWE]
Many systems work on the basis of “reputations”. Ebay is a good example. Another example is Slashdot.
But these are rather static systems. What happens when you are a node in mobile ad hoc network (MANET)
and you seek to determine whether you should trust some other node? You can use your
direct experience of that node and you may like to consult with others about what they think. However,
combining these pieces of information in an effective way is a difficult task. Every “trust calculus” or “risk calculus”
starts with a researcher’s idea of what the calculus should be. This project will investigate whether evolutionary
computational approaches such as genetic programming can be used to evolve ways of calculating appropriate values.
The project will involve: the development of a very basic network simulator where nodes wander around some confined space, occasionally requesting services from other nodes. A service provider will be deemed to provide a service with some specific reliability (e.g. 0.95). Note that the specifics of the service do not matter, only whether it was actually delivered successfully. The simulator will allow relevant data to be filed. This data will then be used as the basis for testing any risk calculus.
Once this has been developed, we will seek to use genetic programming to evolve reliability calculations. This should allow expression of approaches such as “ask all neighbours within four hops and take the lowest estimate of reliability”. In practice, we really just need
to give a GP system appropriate sets of terminals and functions.
A review of reputation systems (with references) can be found at
Yow Tzu Lim’s PhD Qualifying Dissertation (here).
JAC/05B Automated Reputation Synthesis II
[MEng]
Reputation synthesis has proved very popular. This project has the same motivation and context as JAC/05 above but will take a more deeply rooted research focus. It will investigate the use of alternative means of growing calculi, e.g by grammatical evolution and consider whether. We can consider issues such as: how many rogue nodes does it need to make a system collapse? How resilient is a system to occasional rogues? Time permitting it would be real fun to try to grow attacks on reputation based systems. Can we evolve a strategy for attacking a reputation system?
The student should be prepared to use existing evolutionary computation tools, e.g. ECJ.
JAC/06
Automated Invariant Synthesis [CS,MEng, NC, SWE]
Some things don’t change, they are invariant. Within software development invariants are useful in facilitating proofs of correctness. They are assertions which always hold at particular points in a program. Consider the program below:
i=0;
Sum=0;
while(i<n)
{
sum=sum+i;
i=i+1;
}
Consider now the assertion sum=0+1+2+…+i , i.e sum is equal to the sum of the numbers between 0 and current value of i.. This is clearly true when we meet the while statement for the first time since i=0 and sum=0. It is clearly true after any execution of the loop body if it were true before execution. It is an invariant of the loop. Other invariants may be simply states like x>y at a particular point in the program.
Michael Ernst has proposed a means of hypothesising invariants and then seeing which are falsified by test execution traces. What is left has a high probability of being true. This project will investigate whether evolutionary approaches such as grammatical evolution or genetic programming can be used to evolve invariants. The student will design and implement an invariant synthesis systems that takes in data types, execution trace values at points in a program and generates invariants. The student will then code and instrument algorithms in Griess’ book “A Science of Programming”. (This is easy.) These will be used to evaluate the usefulness of the synthesis engine. (The algorithms come complete with invariants). A comparison will be made with the Daikon tool by Ernst.
Forming invariants is trivial – “true” is always true, but forming useful invariants is the real challenge.
[1]
Genetic Programming : An Introduction : On the Automatic Evolution of Computer
Programs and Its Applications (The Morgan Kaufmann Series in Artificial
Intelligence) (Hardcover)
[2]
http://www.grammatical-evolution.org/ This has lots of academic papers on
grammatical evolution.
[3]
A Science of Programming . David Griess.
[4] http://pag.csail.mit.edu/~mernst/pubs/ Michael Ernst’s Web Page. In particular:
· “The Daikon system for dynamic detection of likely invariants”
· “Automatic generation of program specifications”
JAC/07 Text Line Segmentation for Historical
Documents [MEng,
MMath, NC]
When you look at a medieval manuscript it is fairly easy to discern what the “lines” of text/script are. However, it is
Not particularly easy to do this by computer.
This project will investigate whether line segmentation algorithm can be enhanced by the use of evolutionary computation.
After a literature survey ([1] is a very good place to start) the work will initially involve defining a language
(terminal and function sets) and using the ECJ (genetic programming) toolset to try to evolve an algorithm to carry out line
segmentation. It will be interesting to try to develop a simple but novel approach, e.g. by specifying a behavioural algorithm of a
a tiny ant as it moves from left to right between lines.
A number of historical document images are available to act as test cases for any developed algorithms.
Some familiarity with Java and processing images in Java (though this is pretty straightforward) would
be advantageous but is not essential.
[1] Text Line Segmentation of Historical Documents: a Survey. http://arxiv.org/ftp/arxiv/papers/0704/0704.1267.pdf
JAC/08 Searching for Mathematics
[MEng, MMath, NC]
The aim of this project is to take non-standard computational search into mathematics.
Sometimes we may wish to find subsets of sets satisfying particular properties. One such example is finding a
subgroup of a given group. This project will investigate whether subgroups can be discovered using
non-standard computational search techniques.
The student will apply appropriate optimisation techniques to finding structures with appropriate properties.
[1] "Genetic Algorithms in
Machine Learning and Optimisation ", David E Goldberg. 1989.
[2] "Modern Heuristic Search Methods". Eds. Rayward-Smith, Osman,
Reeves, Smith. Wiley.
[3] http://www.theory.cs.uvic.ca/~cos/venn/VennOpenEJC.html
JAC/09
Safety in Numbers [MEng, NC]
The aim of this project is to investigate how program evolution (e.g. via genetic programming [1]) can be improved by suitable use of redundancy. Evolved programs tend to be imperfect. However, the safety community has made significant use of redundancy to handle imperfections in components. A population of n programs is maintained and a “vote” is taken by the population as to what the “result” is. Assuming independence of failures we can be confident that the majority of voting nodes vote correctly. A typical implementation is a 2-out-of-3 system, where any result calculated by two or more components is taken as the actual result. (If all three disagree, you have a real problem and might like to consider shutting down!) In this way handling spurious hardware errors can be catered for.
Combining results from evolved programs is a trickier matter. Programs evolved using the same cost function may well behave similarly. In particular they may have the same strengths and weaknesses. Some inputs may be well handled by all programs but others may be badly handled by all. We need to engineer a population of programs which exhibit complementary strengths and weaknesses. The ability of any particular program to handle an input well will differ across inputs. How can we evolve populations of programs so that on any particular input we can be confident that a majority will get the right result?
We clearly need to build notions such as complementary weaknesses into the evolution process. This project
will investigate how to do this. The student will seek to use existing theory from elsewhere (e.g. negative correlation learning [2]) but also
should be prepared to generate new ideas about how to achieve this as the project progresses.
[1] Genetic Programming : An
Introduction : On the Automatic Evolution of Computer Programs and Its
Applications (The Morgan Kaufmann Series in Artificial Intelligence)
(Hardcover)
[2] http://www.cs.bham.ac.uk/~xin/research/ensemble.html (Xin Yao’s web site at Birmingham. See
references [5] and [6], which can be downloaded.)
JAC/5B Automated Reputation Synthesis II [CS,MEng, NC, SWE]
This project is simila to JAC/05 above