JAC's Project Suggestions 2000-2001 (for choosing in April 2001)


JAC/0 - Student Defined [CS, MSc SWE]

John Clark will consider student-defined projects in computer security, optimisation (genetic algorithms, simulated annealing etc), safety, quantum computing, formal methods and software testing. I may accept others if they are peculiarly interesting (i.e. a bit wacky).



JAC/1 - Automated Intrusion Detection [CS, MSc SWE]
Likely to be unavailable due to JAC/2 below, but I am open to suggestions.

Intrusion detection comprises a set of techniques to indicate when malicious action is taking place in a computer system. Techniques range from monitoring of statistical properties of user behaviour through to expert system techniques. This project will investigate the state of the art in intrusion detection techniques and produce an implementation using a selection of techniques examined. It is envisaged that the project will concentrate on being able to detect whether users operating at a terminal are really who they say they are (or whether someone has left the terminal unattended and someone else is masquerading as that user). The behaviour attributes monitored should be varied; they might include typing patterns, command usage patterns etc. The design should aim to accommodate rapid reconfiguration. After producing the system, trials of its usefulness must be undertaken. This will involve the capturing of user behaviour in some form (e.g. by recording Unix command usage etc).

  1. "Automated Intrusion Detection", Leigh Rowland and John A Clark, High Integrity Systems Journal, Vol 1. No 4 1995. Available on request.
  2. "Intrusion Detection: An Introduction to Internet Surveillance, Correlation, Trace Back, Traps, and Response". Edward G Amoroso.


JAC/2 - Combining Anomaly Based And Signature Based Intrusion Detection Systems [Student Defined].


The aim of this project is to combine two different approaches to the signal detection problem of distinguishing an attack from normal usage of a networked computer system. The two main approaches to this problem are misuse detection and anomaly detection [1]. Misuse detection relies on defining negative behavior whereas anomaly detection relies on defining positive behavior. In both cases negative behavior is flagged. Relative strengths and weaknesses of anomaly based and signature based (misuse detection) IDS will be investigated.

The project will result in a prototype IDS with the best characteristics of both methods. This prototype IDS should be capable of accurately classifying common attacks [2] whilst detecting new and novel attacks. This system should then be evaluated by using it under specific conditions that include both legitimate traffic and intrusion attempts. The intrusion attempts used should be a mixture of some which appear in the signature database being used and some which do not. The following should be observed and recorded:

1. The number of false positives.
2. The number of false negatives.
3. The effect on the speed of the system running the IDS.

References:

[1]  Roland Buschkes, Mark Borning, Dogan Kesdogan. Transaction-based Anomaly Detection. In Proceedings of the Workshop on Intrusion Detection and Network Monitoring. April 1999.

[2]  Teresa F. Lunt. Detecting Intruders In Computer Systems. 1993 Conference on Auditing and Computer Technology.

[3]  Edward G Amoroso. Intrusion Detection: An Introduction to Internet Surveillance, Correlation, Trace Back, Traps, and Response.
 


JAC/3 - Mutation Testing for FPGA Software [CS]

Handel C is a programming language for FPGAs. FPGAs are essentially rapidly reconfigurable hardware conisting of reconfigurable arrays of very simple cells. However, this is implementation detail. What is important for this project is that it is possible to use high level software languages (here Handel C) and compile down onto this heavily parallelised platform. The issue arises - how should we judge the testing of such software? Mutation testing takes a program and then creates a large number of  "mutant" programs that differ from the original in a particularly small way, e.g. a minor syntactic deviation (+ to a - sign etc). If your set of test cases can distinguish the original from the mutants it is lilely to be a thorough test set - the term 'mutation adequate' is often used.  This project will create a mutation tool set for Handel C. This will generate mutants, compile them and run test sets to determine adequacy. You should like IPL, and be prepared to use Lex, Yacc (or whatever language variants takes your fancy). It might also be possible to develop a simpler mutation tool that operates directly on the FPGA cell configuration.
    "Software Fault Injection ", Jeffrey Voas and Gary McGraw. 1999.


JAC/4 - Mutation Testing for Concurrency  [CS, MSc SWE]

Communicating Sequential Processes (CSP) is a formal notation for modeling and reasoning about
concurrent systems. Model checking is a fully automatic (quasi-enumerative) technique for verifying properties of systems that evolve over time. You assert  "P is true of my system" and the model checker attempts to find a counterexample. There is a model checker available to verify properties of CSP models. This is called FDR. Systems are generally defined as the combination of many smaller subsystems (also in CSP).  In some cases, one might actually wish certain properties to remain true of systems that have undergone failure of a component. For example, you may have a  safety property that remains true even when a component fails in a particular way, i.e. you have  a resilient system.  This project will investigate the systematic derivation of mutant systems. It will propose a set of mutation operators for CSP, and develop a tool to generate mutant systems (in CSP). It will then use FDR to either confirm that particular properties are maintained under mutation, or else indicate a "test" that distinguishes the original from a mutant. Some familiarity with lexing and parsing is required and corresponding tools(e.g. Lex, Yacc or similar).
 
    "Software Fault Injection ", Jeffrey Voas and Gary McGraw. 1999.
    "Communicating Sequential processes". A. Hoare. Prentice Hall (in library).


JAC/5 - Java Testing [CS, MSc SWE]

This project will develop a Java Testing Tool. To determine whether a set of test cases exercises code appropriately, various 'coverage' criteria have been propoed. An obvious one is that every statement should be exercised by at least one test case. This project will develop a coverage monitoring tool for various criteria (statement coverage, branch coverage, MC/DC coverage, data flow coverage etc.). You should like IPL (or have experience with compiler concepts), and be prepared to use JLex, Jacc. You should also be familiar with Java.
    "Software Testing Techniques ", Boris Beizer (in library).
       

JAC/6 - Non-Standard Optimisation [CS,CS/M]

Some problems are just computationally hard. The Travelling Salesman Problem is a good example. Calculating the smallest distance tour visiting each of  N cities is generally computationally intractable. However, in such cases we often throw 'heuristic' technques at the problem. These are techniques that are good at finding good (but not necessarily the best) solutions. The general approach is that you feed the problem to the technqiue and get out a solution.  But we can be more adventurous. What would happen if we mutated the problem a little and then tried to solve it?. What is the relationship between the solutions obtained for this mutant problem and the solutions for the original? Can we identify patterns in the results of multiple attempts to use an heuristic technique. Can we fix certain variables at different values and use the results obtained to suggest which is the more likely to be the best? This project will investigate such approaches for specific problems of interest (e.g. TSP, Permuted Perceptron Problem, MaxSAT etc.)
    "Genetic Algorithms in Machine Learning and Optimisation ", David E Goldberg. 1989.
    "Modern Heuristic Search Methods". Eds. Rayward-Smith, Osman, Reeves, Smith. Wiley.

JAC/7 -  Heuristic Boolean Function Creation [CS,CS/M]

Boolean functions are functions f:{T,F}n ->{T,F}. In cryptography we often wish to derive Boolean functions that have particular characteristics. For example, we might want a function that is highly non-linear, which essentially means that linear expressions  X1.xor.X3.xor.X7 etc. are poor approximations for the function. Low autocorrelation is another feature often required. This project will investigate the use of heuristic optimisation techniques (Simulated Annealing, genetic algorithms etc.) to generating such functions and will develop a toolkit for experimentation. You should like cryptography (i.e. CCT).
    "Genetic Algorithms in Machine Learning and Optimisation ", David E Goldberg. 1989.
    "Modern Heuristic Search Methods". Eds. Rayward-Smith, Osman, Reeves, Smith. Wiley.
    "Two Stage Optimisation in the Design of Boolean Functions" John A Clark and Jeremy L Jacob. Proceedings of the 5th
        Australian Conference on Security and Information Privacy 2000 (ACSIP 2000). On JAC's website.



JAC/8 - Heuristic Boolean Approximation Generation [CS,CS/M]

As a cryptanalyst I would  like to be able to generate approximations for the operation of a cryptoalgorithm. For example, if Ps Ks and Cs denote plaintext/key/ciphertext bits, then  the expression P1.xor.P7.xor.K3.xor.C9 .
might be true 50.1% of the time. i.e. for a randomly chosen plaintext P and key K (and resulting ciphertext C) the expression is true with probability 0.501. This project will investigate the use of genetic programming techniques to derive such approximations, leading to the production of a "genetic approximator".  It is not at all clear what the space of approximations should be. The above example is linear (i.e. is a repeated XOR expression) but higher order expressions should be beneficial. The tool should allow the user to configure the space in which approximations are to be found. You should like cryptography (i.e. CCT).
    "Genetic Programming" Vol 1. Koza.
    "Two Stage Optimisation in the Design of Boolean Functions" John A Clark and Jeremy L Jacob. Proceedings of the 5th  Australian Conference on Security and Information Privacy 2000 (ACSIP 2000). On JAC's website.

JAC/9 - The Aggressive Falsification of Synthesised Specifications[CS]

Michael Ernst has developed techniques that synthesise specifications from test data. Essentially
a whole host of assertions are generated at points in the program under test
using variables from that program. The program is executed  with test data. If any assertion
embedded in the program is false for any test datum then that assertion is not an invariant for the program.
After a good deal of testing some of the generated assertions are found to be true for every
test execution. These may putatively be designated as a partial specification for the program. Some of these remaining assertions may not actually be invariants. The test set may simply be inadequate to demonstrate a counter-example. It would seem that very aggressive counter-example generation is required. Nigel Tracey
has recently developed techniques that use evolution approaches (e.g. genetic algorithms) to home in on test data that cause assertions to be broken.  This project will aim to combine the two techniques to obtain more reliable specifications.
    "A Search Based Automated Test Data Generation
    for High Integrity Systems." Nigel Tracey, John Clark, Keith Mander
    and John McDermid.  Probably easiest to access from my website.

    ``Dynamically Detecting Likely Program Invariants'' by Michael D. Ernst. PhD dissertation, University of Washington, Department of Computer Science and Engineering, August, 2000.
     
     


JAC/10 - GeometryCompression [CS] (Self-defined Henry Robinson)


JAC/11 - Safety Measurement [MScSCSE] (Self-defined Paul Caseley)