Alistair Edwards' Project Suggestions

ADNE/01: Rapid text correction [CS, IT, IP]

Modern voice recognition systems can be used to provide real-time captioning of television programmes (Pirelli, 1998). There is a problem, though, in that even the best such systems are less than perfect; mis-recognitions do occur. There is a need, therefore, for techniques for rapid editing of the text being generated, so that the text stream seen by the viewer is (more) correct. The objective of this project would be to investigate such techniques. Possible techniques would include:

The project could probably be undertaken in two phases. The first would be a 'theoretical' study in which keystroke level models (Card, Moran and Newell, 1983) could be used to predict the best interface. It could then be implemented and the prediction tested. The project would thus be similar to that of Wood (1993), upon which the student could draw.
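A keystroke-level prediction of the kind proposed for the first phase might be sketched as follows. The operator times are the commonly quoted values from Card, Moran and Newell (1983). The two candidate correction interfaces and their operator sequences are purely hypothetical, invented here for illustration, and are not taken from the project description.

```python
# Keystroke-level model (KLM) sketch.
# Operator times in seconds, as commonly quoted from Card, Moran and Newell (1983).
OPERATOR_SECONDS = {
    "K": 0.28,  # keystroke or button press (average non-secretarial typist)
    "P": 1.10,  # point at a target with a mouse
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def predict_seconds(sequence):
    """Sum the operator times for an operator string such as 'MHPK'."""
    return sum(OPERATOR_SECONDS[op] for op in sequence)


# Hypothetical candidate designs for correcting one mis-recognised word.
candidates = {
    "point at word, retype it (5 letters)": "HPK" + "H" + "M" + "K" * 5,
    "jump to word with a hot key, pick from list": "MK" + "MK",
}

for name, sequence in candidates.items():
    print(f"{name}: {predict_seconds(sequence):.2f} s")

# The design with the smallest predicted time is the one to implement and test.
best = min(candidates, key=lambda name: predict_seconds(candidates[name]))
print("Predicted fastest:", best)
```

The second phase of the project would then test whether measured correction times agree with the ranking that such a model predicts.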

References

Card, S. K., Moran, T. P. and Newell, A. (1983). The Psychology of Human-Computer Interaction. Hillsdale, New Jersey: Lawrence Erlbaum.

Pirelli, G. (1998). The VOICE project(I): Giving a voice to the deaf by developing voice to text recognition capabilities. (in) Computers and Assistive Technology, ICCHP '98: Proceedings of the XV IFIP World Computer Congress, A. D. N. Edwards, A. Arato and W. L. Zagler (Eds.) pp. 51-60, Vienna & Budapest, Austrian Computer Society.

Wood, N. (1993). Optimising speed and accuracy of input, MSc(IP) Project Report, University of York, Department of Computer Science.

ACM Computing Classification

ADNE/02: Creating pleasing phone rings [allocated]

The sounds made by mobile phones when they ring are often annoying. Why is this so? One component is probably due to the fact that they are played over speakers of tiny dimensions. Simple physics dictates that such speakers cannot generate low frequency tones. Hence ringing tones are high pitched, which may be one source of annoyance.

Another consequence of the high frequency, narrow bandwidth tones that are used is that the sounds are hard to localize spatially. This leads to problems in any environment in which there are several phones (i.e. anywhere!). When a phone rings, it is not easy to work out whose phone is ringing in terms of its location. Instead we can customize the tune that the phone plays. In this way we recognize when it is our phone. It may be, though, that this is another source of annoyance. Some people find it offensive to hear a piece of (say) Mozart that is intended to be played by an orchestra, beeped on a phone.

Is it possible to design sounds that can be generated by a tiny speaker but which are not annoying? For instance, traditional ringing tones seem to cause less annoyance than tunes (but more confusion).

About the only consensus in the literature is that annoyance is completely subjective. It is not possible to characterize in any definitive manner what is or is not an annoying sound. Nevertheless, the first objective of this project will be to try to get a handle on some of the sources of annoyance. Then it will be possible to carry out experiments to measure the degree of acceptance of some alternative tone designs - which have been designed in an attempt to be less annoying.

The project would require the design of an experiment to measure the liking/annoyance for different rings, in some ecologically valid manner. The experiment will be carried out and its results analysed.

The project will require skills in experimental design and some programming – including the use of a sound card.

References

Abel, S. M. (1990). 'The extra-auditory effects of noise & annoyance: An overview of research.' Journal of Otolaryngology 19(1): 1-13.

Berglund, B., K. Harder and A. Preis (1994). 'Annoyance perception of sound and information extraction.' Journal of the Acoustical Society of America 95(3): 1501-1509.

Dornic, S. and T. Laaksonen (1989). 'Continuous noise, intermittent noise and annoyance.' Perceptual and Motor Skills 68: 11-18.

Guski, R. (1996). Interference of activities and annoyance by noise from different sources: Some new lessons from old data. Contributions to Psychological Acoustics: Results of the Seventh Oldenburg Symposium on Psychological Acoustics., Oldenburg.

Landström, U., P. Löfstedt, E. Åkerlund, A. Kjellberg and P. Wide (1990). 'Noise and annoyance in working environments.' Environment International 16: 555-559.

Swift, C. G., I. H. Flindell and C. G. Rice (1989). 'Annoyance and impulsivity judgements of environmental noises.' Proceedings of the Institute of Acoustics 11(5): 551-559.

ACM Computing Classification

ADNE/03: No more 404 [allocated]

It is annoying and bad for business for a visitor to a web site to receive 404 error messages. The objective of this project would be to develop a tool that would check all the links on any site. This would essentially amount to building a spanning tree of the pages. Of course a site generally includes external links, so that there is a danger of attempting to traverse the whole internet! The tool therefore should be limited to visiting only pages within a given domain or at most one external level.

The software would run on a client processor and load a root file from the server. It would accumulate all the (candidate) links on that page and then attempt to load them in turn. Any errors would be recorded. Links on successfully downloaded pages would be recursively checked.
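The traversal described above might be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not a design for the project: the function names are invented here, only <a href> links are considered, and any failure to load a page (not just a 404) is recorded as broken. Pages outside the root's domain are loaded once to see whether they exist, but their links are not followed, which gives the "at most one external level" limit.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Return the page body as text; raises OSError on HTTP or network errors."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def check_site(root, fetch_page=fetch):
    """Breadth-first check of every link reachable from root.

    Returns a dict mapping each broken URL to the page that linked to it.
    """
    domain = urlparse(root).netloc
    seen = {root}
    queue = deque([(root, None)])  # (url, page on which it was found)
    broken = {}
    while queue:
        url, referrer = queue.popleft()
        try:
            body = fetch_page(url)
        except (OSError, ValueError):
            broken[url] = referrer
            continue
        if urlparse(url).netloc != domain:
            continue  # external page: checked, but its links are not followed
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target = urldefrag(urljoin(url, href)).url
            if target.startswith(("http://", "https://")) and target not in seen:
                seen.add(target)
                queue.append((target, url))
    return broken
```

Because each URL is queued only once, together with the page on which it was first found, the (url, referrer) pairs form a spanning tree of the site. Passing a different fetch_page function allows the traversal to be exercised without network access.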

The project would require programming in a suitable language (such as C or C++). It could be written to run on the platform of the student's choice (Unix/Linux, Windows or Macintosh). It will involve design, learning about Web software and protocols and design of a suitable user interface. The project is not particularly innovative; similar tools already exist (e.g. LinkChecker, ht://Check, A Forking Parallel Link Checker, Linkscan). It would suit a student who wants to extend their practical knowledge of web programming.

There are various libraries and languages that might be used. For a Unix implementation Python and Perl have libraries for web programming. Also the Department Apache web server (running on a Linux system) has the PHP web scripting language module installed.

Reference

Most books on web use talk of the importance of maintenance to avoid broken links, including...

Nielsen, J. (2000). Designing Web Usability: The Practice of Simplicity, New Riders.

ACM Computing Classification

ADNE/04: The technology and psychology of password protection [allocated]

One of the consequences of the increasing spread of IT - and of the need for security - is the requirement for people to have growing numbers of passwords and PINs. There are conflicting requirements of passwords:

Some systems make allowances for users forgetting passwords through a pre-determined question-response interaction. That is to say that the user gives the administration a question to which only they are likely to know the answer. Then, in the event of their forgetting their password, they can identify themselves by giving the correct answer to the question. But the question often seems to be the same one, ‘What was your mother’s maiden name?’, and the answer to that cannot be hard to find for the determined sneak. How secure is that?

Passwords composed of random characters are most unlikely to be guessed by others - but also least likely to be remembered. At the same time there is good evidence that people will remember passwords better if they have made them up themselves, rather than being given ones generated by a program. However, some systems attempt to get the best of both worlds by generating odd words with recognizable components (e.g. two real words concatenated together). How effective are all these strategies? Baddeley (1983) and Baddeley (1990) will give some guidance on memory and forgetting.
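One rough way to compare these strategies is by the size of the space an exhaustive search would have to cover, expressed in bits. The sketch below assumes every choice is made uniformly at random. The alphabet size (94 printable ASCII characters) and the dictionary size (50,000 words) are assumptions chosen for illustration, not figures from the literature cited here.

```python
import math


def entropy_bits(choices_per_position, positions):
    """Bits of entropy when each position is chosen uniformly and independently."""
    return positions * math.log2(choices_per_position)


# Assumed figures, for illustration only.
PRINTABLE_ASCII = 94       # printable ASCII characters, excluding space
DICTIONARY_WORDS = 50_000  # hypothetical size of a cracking program's word list

random_8 = entropy_bits(PRINTABLE_ASCII, 8)
two_words = entropy_bits(DICTIONARY_WORDS, 2)
one_word = entropy_bits(DICTIONARY_WORDS, 1)

for label, bits in [
    ("8 random characters", random_8),
    ("two concatenated words", two_words),
    ("single dictionary word", one_word),
]:
    print(f"{label}: {bits:.1f} bits, about {2 ** bits:.2e} candidates")
```

Under these assumptions, eight random characters give roughly 52 bits and two concatenated words roughly 31 bits. Passwords that people invent themselves are not chosen uniformly, so a calculation like this gives only an upper bound on how hard they are to guess.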

This project will look at the technology and psychology of password protection. How do cracking programs work? Are there classes of passwords which are easy to remember but hard to crack? Are there ways in which users can be encouraged to select more secure passwords? Adams, Sasse and Lunt (1997) have looked at passwords from a usability point-of-view, but they concentrated on certain aspects and left plenty of options for further work, any one of which the student might pursue.

Note that this project will not be used as an excuse to attempt to crack any real passwords. Any research on cracking should be undertaken in the spirit of ‘know thine enemy’. Any attempts to mis-use departmental or university computing resources within the project will be dealt with as appropriate.

References

Adams, A., M. A. Sasse and P. Lunt (1997). 'Making passwords secure and usable'. People and Computers XII: Proceedings of the HCI'97 Conference, Springer-Verlag.

Baddeley, A. (1983). Your Memory, a User's Guide. Harmondsworth, Middlesex, Penguin.

Baddeley, A. (1990). Human Memory: Theory and Practice. London, Lawrence Erlbaum Associates.

ACM Computing Classification

ADNE/05: Spoken text messages [IT, IP]

The telephone is a speech-based communication device and yet it has increasingly become the basis of textual communication through SMS text messages (for reasons which are being investigated in another student project by Mark Ocock). However, SMS suffers from the very poor interface of 10 buttons for inputting 26+ characters (Byrne, 2001). If one could compose SMS messages in speech (using speech recognition software), would one get the best of both worlds?
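The cost of the keypad interface can be illustrated by counting key presses under the common multi-tap scheme, in which a key is pressed repeatedly to cycle through its letters. The sketch below uses the standard letter layout of a phone keypad. It assumes, for illustration, that a space costs one press and that other characters are ignored, and it does not model the pause needed between successive letters on the same key.

```python
# Standard letter assignment on a phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Map each letter to the number of presses it needs under multi-tap.
PRESSES = {
    letter: position + 1
    for letters in KEYPAD.values()
    for position, letter in enumerate(letters)
}


def multitap_presses(message):
    """Count key presses for a message; a space is assumed to cost one press."""
    total = 0
    for char in message.lower():
        if char == " ":
            total += 1
        elif char in PRESSES:
            total += PRESSES[char]
    return total


message = "see you at home"
print(multitap_presses(message), "presses for", len(message), "characters")
```

A count of this kind gives a baseline against which the effort of dictating and correcting the same message by speech could be compared.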

Within this project, the student would simulate such a phone, using speech-to-text software (e.g. Dragon Point and Speak) and an on-line text messaging system (e.g. http://www.breathe.com/) to collect usability data. In some respects the project will be similar to that undertaken by Beatrice Khine (Khine, 2001).

One limitation of SMS is the length of messages (no more than 160 characters). This is one reason people use abbreviations (Mander, 2000), but it might be possible to combine the speech input software with something like the text compression algorithm studied by Alex Shenton (2001), which automatically generates abbreviations.

References

Byrne, D. (2001). 'Learning of a linguistically optimised layout for text entry on a phone', University of York, Department of Computer Science, Third-year Project Report.

Khine, B. (2001). An evaluation of human-interaction factors in deaf-hearing telephony, MSc(IP) Project Report, University of York, Department of Computer Science.

Mander, G. (2000). Want2Talk? ltle bk of txt msgs. London, Michael O'Mara Books.

Shenton, A. (2001). Measuring comprehension of a text compression algorithm for SMS, Final year student project report, University of York, Department of Computer Science.

ACM Computing Classification

ADNE/06, ADNE/07: How many ways can you use one button? [CS, IT, IP]

Mobile, portable devices such as watches, phones and GPS receivers are usually limited in the number of buttons that can be attached to them. It is impractical to have a full qwerty keyboard, and yet there is often a need to be able to input many and complex messages. There are a variety of ways that a single button can be used.

Yanyu Li (2000) carried out a set of experiments in which she measured the time taken by a variety of people to perform each of these clicks. From her data it is possible to suggest time constraints that can be applied to characterize each kind of click (Edwards & Li, 2001). Tuffin (2002) built on that project by carrying out further experiments in which he tried to validate predictions about error rates based on Li’s data.
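The idea of characterizing each kind of click by time constraints might be sketched as follows. The click categories ("single", "double", "long") and both thresholds are hypothetical, chosen only to make the example concrete; they are not the categories or values reported by Li.

```python
# Hypothetical time constraints in milliseconds; real values would come
# from experimental data such as Li's.
LONG_PRESS_MS = 500   # a press held at least this long counts as "long"
DOUBLE_GAP_MS = 300   # two short presses closer together than this form a "double"


def classify(presses):
    """Classify a list of (down_ms, up_ms) button events.

    Returns labels drawn from the hypothetical set {"single", "double", "long"}.
    """
    labels = []
    i = 0
    while i < len(presses):
        down, up = presses[i]
        if up - down >= LONG_PRESS_MS:
            labels.append("long")
            i += 1
            continue
        if i + 1 < len(presses):
            next_down, next_up = presses[i + 1]
            next_is_short = next_up - next_down < LONG_PRESS_MS
            if next_is_short and next_down - up < DOUBLE_GAP_MS:
                labels.append("double")
                i += 2  # both presses are consumed by the double click
                continue
        labels.append("single")
        i += 1
    return labels


events = [(0, 120), (250, 360), (2000, 2100), (4000, 4800)]
print(classify(events))
```

With thresholds like these, error rates can be predicted by asking how often a participant's timings fall on the wrong side of a boundary.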

Both of these projects suffered from the fact that they were using software to collect the timing data. Timings collected this way are subject to operating-system and input-handling delays, so they cannot be relied on to be accurate. It is proposed in this project to avoid that problem by using a hardware timer. A device has been constructed in the department which will communicate with the computer to allow the running of the same kinds of test – but the timing will be captured by the device and so be much more accurate.

Li's project attempted to be very general, using as many participants as possible of different ages. One conclusion was that very young (under 10) and older people (over 60) were much slower, in general, but that the middle population was reasonably homogeneous in timing. Therefore it makes sense to concentrate on the middle age group.

One project (ADNE/06) would therefore essentially repeat Li's project, with the difference that the hardware timer and a narrower age-group would be used. The other project (ADNE/07) would look at whether the ergonomics of the button make a difference, by attaching different types of button to the timer.

The projects would include the writing of software to communicate with the timing device. (How this task would be shared between the students would have to be negotiated). Skeleton software has already been written in Visual C++ (Windows) that would have to be extended to present the data in an appropriate format. They would both involve the designing and carrying out of human-factors experiments.

References

Li, Y. (2000). 'Timing data for the use of a single button'. MSc(IP) Project Report, Department of Computer Science. York, University of York.

Edwards, A. D. N. and Li, Y. (2001). How many ways can you use one button? (unpublished)

Tuffin, J. (2002). 'Error rates and user perception in the operation of a single button', MSc(IP) Project Report, University of York, Department of Computer Science.

ACM Computing Classification

ADNE/08: What's in a name? [IT, IP]

One effect of the advent of the web is that it has great potential for advertising. Buying sufficient advertising space for a string of a dozen or so letters can potentially lead customers to an unlimited amount of information about your products. That is to say, that if the URL catches someone’s eye (perhaps on a billboard at a football match) and they can remember it and they can be bothered going to their computer and typing it in, then they have access to as much information as you care to make available.

Clearly there are a number of factors required to make this happen, particularly that the URL should capture their attention and be memorable. Given the limitations of URL syntax (Jaron Lanier has pointed out that they are really Unix pathnames, never intended or expected to be part of everyday life and advertising [1]) it is very difficult to invest them with these properties.

The objective of this project would be to carry out experiments to assess the memorability of URLs.

It is envisaged that some kind of experiment will be devised in which participants will see pictures which include URLs (without being cued to note the URLs) and will then be tested for recognition and recall. Designing an ecologically valid experiment may prove challenging. Books such as Hutchins (1996), Robson (1993) and Robson (1994) may help. Nielsen (2000) gives some brief guidelines on URL design. Slides from a seminar on this topic are available.

An ability to use (or to learn to use) image processing tools, such as Photoshop might be useful (so that pictures of URLs can be grafted onto photographs for the experiments).

References

Hutchins, E. (1996). Cognition in the Wild. Cambridge, Massachusetts, MIT Press.

Nielsen, J. (2000). Designing Web Usability: The Practice of Simplicity, New Riders.

Robson, C. (1993). Real World Research: A Resource for Social Scientists and Practitioner-Researchers. Oxford, Blackwell.

Robson, C. (1994). Experiment, design and statistics in psychology. London, Penguin Books Ltd.

ACM Computing Classification

[1] It was somewhere on his homepage (http://www.well.com/user/jaron/), but I can’t search through it all to find the reference again.

ADNE/09: Non-visual table access [allocated]

Tables use a spatial arrangement to facilitate access to certain types of information. It can be very difficult to access the same information if it is presented in a non-visual form (such as speech) as is required by blind people. For instance, most screen readers (Edwards, 1991) simply read lines horizontally, regardless of the columnar structure. Previous student projects have tackled this problem already. Bufton (1991) implemented a simple table browser, using Rich Text Format (RTF) files as input. Sinclare (1999) implemented a similar browser, but based on HTML files. Finally, van Kemenade (2000) designed a browser, using Mitsopoulos' design methodology (Mitsopoulos and Edwards, 1999) but he did not implement his design - and that would be the objective of this project.
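The difference between reading lines horizontally and respecting the columnar structure can be illustrated with a small sketch. It is not the design of Bufton, Sinclare or van Kemenade, whose details are not given here. It assumes a simple HTML table whose first row holds the column headers, and it prints each phrase where a real tool would pass it to a speech synthesizer.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collect table rows as lists of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def utterances(html):
    """Return one phrase per data cell, each prefixed by its column header."""
    reader = TableReader()
    reader.feed(html)
    rows = [[cell.strip() for cell in row] for row in reader.rows]
    if not rows:
        return []
    header, body = rows[0], rows[1:]  # assumption: the first row is the header
    return [
        f"{column}: {cell}"
        for row in body
        for column, cell in zip(header, row)
    ]


sample = """
<table>
<tr><th>Train</th><th>Departs</th><th>Arrives</th></tr>
<tr><td>York to Leeds</td><td>09:05</td><td>09:31</td></tr>
</table>
"""

for phrase in utterances(sample):
    print(phrase)  # stand-in for a call to a speech synthesizer
```

Announcing the header with every cell is only one possible strategy; non-speech sounds could carry the same positional information more quickly.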

The project would involve programming, including generation of speech and non-speech sounds. The implementation would have to be evaluated, probably by comparison with the earlier tools of Bufton and Sinclare as well as subjective evaluation by human testers.

References

Bufton, S. (1991). 'Reading text tables for blind people', Department of Computer Science, University of York, Final-year Project Report.

Edwards, A. D. N. (1991). 'Speech Synthesis: Technology for disabled people'. London: Paul Chapman.

Mitsopoulos, E. N. and Edwards, A. D. N. (1999). 'A principled design methodology for auditory interaction'. in M. A. Sasse and C. Johnson (ed.), Proceedings of Interact 99, (Edinburgh), IOS Press. pp. 263-271.

Sinclare, J. (1999). 'Rendering HTML tables non-visually for blind people'. Department of Computer Science. York, University of York, Final-year Project Report.

van Kemenade, H. (2000). Application of a methodology for the design of non-visual tables. Department of Computer Science. York, University of York, Final-year Project Report.

ACM Computing Classification

H.5.1 Multimedia information systems: Audio input/output; Evaluation/methodology

H.5.2 User interfaces: Voice I/O

K.4.2 Social issues: Assistive technologies for persons with disabilities

ADNE/10: Guidelines for 'aural' cascading style sheets [IP]

The Web is a boon and increasingly a major source of information in everyday life. We are reaching a point when to not have access to the Web - for any reason - can be a significant handicap. One potential cause of exclusion is visual handicap, given that much of the design of web content is visually oriented (Edwards & Stevens, 1997). At the same time, legislation exists which effectively makes it illegal for information to be provided in inaccessible formats. (See the section on Access to goods, facilities and services of the Disability Discrimination Act). The Web community (embodied by the W3C organization) is aware of this tension and is taking steps to resolve the situation. One contribution is that the Cascading Style Sheets (CSS) feature of the web includes the option to mark up pages for 'aural' presentation as well as in other media. The question is, how to use this facility.

The first objective of this project would be to derive some guidelines on how to design aural CSS style sheets. Guidance would be available from the work of T V Raman on Cascaded speech style sheets (Raman, 1997) and Ian Pitt's work on the design of speech output (Pitt, 1998). The project might also have something in common with that of Phillipa Woodcock (2001).

Even with the 'encouragement' of legislation, accessible web pages are unlikely to become widespread unless and until it becomes as easy to make accessible pages as inaccessible ones. There is thus a need for tools to help in the process. Ultimately it would be desirable to have tools which might take HTML pages and their associated (visual) style sheets and generate - or help to generate - an aural style sheet. It might be beyond the scope of a student project to also build such a tool, but some design requirements for a tool might be derived. Existing tools, such as those provided by WebAIM and those being developed by Omid Afzalalghom in a current project might be relevant.
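To make the idea of such a tool concrete, the sketch below turns a few visual declarations into an aural style sheet. The property names (stress, volume, pitch, pause-before, pause-after) and the aural media type come from the aural style sheets section of CSS2. The mapping from visual to aural declarations is entirely hypothetical; deciding what that mapping should be is exactly the kind of guideline the project would aim to derive.

```python
# Hypothetical mapping from visual declarations to CSS2 aural declarations.
VISUAL_TO_AURAL = {
    ("font-weight", "bold"): [("stress", "80"), ("volume", "loud")],
    ("font-style", "italic"): [("pitch", "high")],
    ("display", "block"): [("pause-before", "300ms"), ("pause-after", "300ms")],
}


def aural_rules(visual_rules):
    """Translate {selector: {property: value}} into aural CSS text."""
    blocks = []
    for selector, declarations in visual_rules.items():
        aural = []
        for prop, value in declarations.items():
            aural.extend(VISUAL_TO_AURAL.get((prop, value), []))
        if aural:  # selectors with no mapped declarations are skipped
            body = "\n".join(f"    {name}: {val};" for name, val in aural)
            blocks.append(f"  {selector} {{\n{body}\n  }}")
    return "@media aural {\n" + "\n".join(blocks) + "\n}"


visual = {
    "h1": {"font-weight": "bold", "display": "block"},
    "em": {"font-style": "italic"},
    "p": {"color": "black"},
}
print(aural_rules(visual))
```

A real tool would also need to parse existing style sheets and let the author review the result, which is why the text suggests deriving design requirements rather than building the tool itself.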

References

Edwards, A. D. N. and Stevens, R. D. (1997). Visual dominance and the World-Wide Web. in Proceedings of the Sixth International World Wide Web Conference (CD-Rom), (Santa Clara, California), Stanford University.

Pitt, I. J. (1998). The Principled Design of Speech-Based Interfaces, DPhil Thesis, Department of Computer Science, University of York.

Raman, T. V. (1997). Cascaded speech style sheets. in Proceedings of the Sixth International World Wide Web Conference, (Santa Clara, California), pp. 109-119.

Woodcock, P. (2001). Guidelines for ALT texts for auditory browsing, MSc(IP) Project, Department of Computer Science, University of York.

Keywords

Multimedia information systems: Audio input/output; Evaluation/methodology

User interfaces: Voice I/O

Social issues: Assistive technologies for persons with disabilities

Legend

IT: Suitable for ITBML third-year students.
CS: Suitable as a Computer Science project. Different versions of the project might be suitable for third-year BSc/BEng or MEng or as fourth-year MEng projects.
IP: Suitable for MSc(IP) students - usually involves HCI evaluation work.

20th March 2002

Alistair Edwards' Project Suggestions 2002-3

Contents

ADNE/01: Rapid text correction [CS, IT, IP]

ADNE/02: Creating pleasing phone rings [CS, IT, IP]

ADNE/03: No more 404 [CS, IT, IP]

ADNE/04: The technology and psychology of password protection [CS, IT, IP]

ADNE/05: Spoken text messages [IT, IP]

ADNE/06, ADNE/07: How many ways can you use one button? [CS, IT, IP]

ADNE/08: What's in a name? [IP, IT]

ADNE/09: Non-visual table access [CS]

ADNE/10: Guidelines for 'aural' cascading style sheets [IP]
