Testing the November Alfa Tango Oscar Spelling Alphabet (2.0*)

Alistair D N Edwards

Department of Computer Science
University of York
YO10 5GH


Spelling alphabets are used as a means of ensuring unambiguous voice transmission of letters. The best-known example is the Nato alphabet, Alfa, Bravo, Charlie… A necessary property of any such alphabet is that the words should be audibly distinguishable. This paper investigates a method for assessing the distinctiveness of each pair of letters. This is based upon expressing the phonetic pronunciation of the words (using the International Phonetic Alphabet) and then measuring the degree of similarity between the words. This is done by calculating the edit distance between the words, which is essentially the number of edit operations required to transform one word into the other.

This paper outlines the history of the development of the Nato alphabet and applies the edit-distance method to a number of candidate alphabets.

The main conclusions are that the Nato alphabet is quite a good one, but perhaps not as good as it might be. That is, at least, according to the method, but since the results of the analyses are not in close agreement with empirical tests in the literature, it may be a bit useless.

1. Introduction

Voice-based communications can suffer from intelligibility problems. Where the elimination of ambiguity is important, then words may be spelled out. Verbally spelling, though, can itself be unreliable; letters spoken in the conventional way ('Ay', 'Bee', 'Sea' etc) can sound similar and be mis-heard and confused. One approach to mitigating this is the use of spelling alphabets, whereby longer onomatopoeic words are used to represent the letters (e.g. 'Alpha', 'Bravo', 'Charlie' - see below)1. Where the identity of a particular word or its spelling is important in the communication, it can be spelled out using this alphabet (e.g. 'Echo, Delta, Whiskey, Alpha, Romeo, Delta, Sierra'). There are also instances when letters not words need to be communicated, and often it is vital that there is no ambiguity. For instance, a callsign may just be letters, or a car registration may need to be passed on.

There are a number of properties that are desirable in a spelling alphabet, including:

  1. The words should be audibly as different from each other as possible.
  2. The words should be long (i.e. longer than the monosyllabic 'Ay Bee Sea'), but not so long as to slow down the communication any more than necessary.
  3. They should be easy to remember, with an obvious mapping to the letter they represent.

This paper concentrates on property (1), above. Can we measure how well different letters can be distinguished? The approach taken has been to convert the words into a phonetic form and then to measure the degree of difference between each of the pairs of words.

The paper investigates a number of spelling alphabets, with particular interest in the Nato2 Phonetic Alphabet, which is the international standard for this purpose.

I could write this paper to a publishable standard – and then perhaps try to get it published. However, I cannot be bothered; I cannot be bothered to click much further than Wikipedia for references, for instance. Nevertheless, I hope readers find it interesting and useful.

2. Background

2.1 Letters

Letters of the Latin alphabet have common, onomatopoeic names, 'Ay', 'Bee', 'Sea' etc. These have an obvious (but not necessarily direct) mapping to their phonetic values in written text. They are short sounds, which is appropriate for most purposes. However, they can sound quite similar. Take, for instance, the vowels. The nature of a vowel is that it is a sound made with an open, unconstricted vocal tract and therefore their letter values ('Ay', 'Ee', 'Eye', 'Oh' and 'Yew') sound quite similar. Indeed, they are to some extent inter-changeable. The common greeting can be written (and pronounced as) 'hello', 'hallo', 'hullo' and even 'hillo'. One of the obvious variations in different spoken accents is in the pronunciation of the vowels. At an acoustic level the waveforms of the vowels are quite similar, as shown in Figure 1.

Figure 1. Waveforms of the vowels, A, E, I, O, U. Note that their outlines are quite similar.

Notice the attack portions of the waveforms (the onset of the sound, on the left) are quite similar and abrupt, while the decay portions (beyond the maxima) are also similar, triangular shapes. Contrast with the waveforms of the letters A, C, G, H, P in Figure 2.

Figure 2. Waveforms of the letters A, C, G, H and P. It is apparent how different the shapes of these waveforms are, and hence they are more easily distinguished audibly.

It should now be apparent why the names of the letters are easily confused audibly. In everyday communication – particularly face-to-face – this is rarely a problem. Indeed, face-to-face dialogue is not purely auditory; there is a visual component. Even hearing people can and do use a degree of lip-reading. Whereas the sounds of the letters B and V might easily be confused, a listener who is watching the lips of the speaker is quite likely to see the difference. When speech takes place via technology (telephone, radio etc) it does become purely auditory and confusions may arise, hence the need for spelling alphabets.

2.2 Spelling alphabets

With the advent of speech technologies, first the (wired) telephone and then later what was known as radiotelephony, the need for spelling alphabets became apparent. These could be informal, even made up spontaneously by the speaker, but organizations also adopted standardized alphabets. So it was that the International Civil Aviation Organization (ICAO) published a report in 1959 (ICAO, 1959)3 which counted and listed no fewer than 203 different spelling alphabets. That report is a comprehensive read for anyone interested in the history of the development of these alphabets. It also describes how the ICAO alphabet was designed, using experiments based on recordings of speech in noise to measure intelligibility and ambiguity.

ICAO (1959) is entitled 'The evolution and rationale of the ICAO word-spelling alphabet', and traces the development which led to the recommendation of the ICAO alphabet as an optimum list for international communications (p.ii). The paper states that the first International Telecommunication Union alphabet was adopted in 1926 (see Figure 4). It is interesting that these words are all place names. The report states, though, that '[O]perating experience has indicated that the words were unsuitable because they were unusual in everyday language and because they lacked desirable phonetic qualities' (p.9). It goes on to suggest that there was a new era of development in international alphabets with the entrance of the United States into World War II. A number of alphabets were then examined, and many were found wanting. 'The results showed that many of the words in the military lists had a low level of intelligibility, but that most of the deficiencies could be remedied by the judicious selection of words from the commercial codes and those tested by the laboratory.' (p.9) This resulted in the National Defense Research Committee (NDRC) list. This is listed in ICAO (1959), but unfortunately the quality of the published scan of that paper is so low that the list is illegible.

Desirable properties for a spelling alphabet were listed in the Introduction, above, and we can look at them in more detail and extend them here.

1. The words should be audibly as different from each other as possible.
That is the main topic of this paper, so read on. In radiotelephony it is also important that they should not resemble any other standard words and phrases, such as 'Roger', 'Over' and 'Negative' – and certainly not 'Mayday' – nor any of the digits. There is a further requirement that does not seem to be mentioned in any of the papers: the use of a word as a member of a spelling alphabet should not be confused with that word being used in its everyday meaning.
2. The words should be long, but not so long as to slow down the communication any more than necessary.
The essential problem with the normal letter sounds is a lack of redundancy. If a longer sound is used then if part of it is unheard or masked it may still be identified. For instance, in the Nato alphabet that we will be investigating in more detail below, if the initial G sound in 'Golf' is missed or misheard there is no other letter which ends in the 'olf' sound with which it could be confused. Words may even have more than one syllable (e.g. 'Foxtrot'). To choose even longer words introduces even more redundancy, but may also slow down communication.
3. They should be easy to remember, with an obvious mapping to the letter they represent.
This is generally achieved by choosing words which start with the letter that they represent.
4. They should be pronounceable to speakers whose first language is not English.
English is the official language of radio communication and thus any spelling alphabet will be used by people whose native language is not English. While pronunciations may vary (see Accents) the words should not be difficult for non-English-speakers to say. The ICAO Spelling Alphabet (precursor to the Nato Alphabet) was tested with regard to the three main languages used in Nato member countries: English, French and Spanish. ICAO (1959) suggests that for internationalization each word in the alphabet should be 'live' in each of the three working languages, although it is not clear what 'live' means here.
5. They should not have any meaning that might be found offensive.
Further to point (4), the words should not be suggestive of any offensive meanings in any common language.

The choice of phonetically distinguishable words is not easy because of the number of combinations. A Nato memo suggests:

It is known that [the ICAO spelling alphabet] has been prepared only after the most exhaustive tests on a scientific basis by several nations. One of the firmest conclusions reached was that it was not practical to make an isolated change to clear confusion between one pair of letters. To change one word involves reconsideration of the whole alphabet to ensure that the change proposed to clear one confusion does not itself introduce others.

Or, as ICAO (1955, pp.14-15) puts it, rather more graphically:

The problem is not unlike that of pushing a dent out of a child's celluloid ball – even a successful push leaves a small dent in another place.

This paper traces the history of some of the development of spelling alphabets, testing them along the way. It will be seen that the ICAO was responsible for the evolution of the standard alphabet, which was adopted as the standard alphabet for Nato (the so-called Nato Phonetic Alphabet) in 1955.

2.3 Phonetics

Phonetics is concerned with the sounds of words. While there are 26 letters in the Latin alphabet, there are a lot more sounds than that in English words. The basic unit of sound is the phoneme. A simple definition of a phoneme is a unit of a word which, if it were replaced by another phoneme, would change the meaning of the word. For instance, if one replaces the 't' sound at the beginning of the word 'toffee' with a 'k' sound then the word becomes 'coffee'. It is generally accepted that there are 44 phonemes in English.

It is evident that the 26 letters of the alphabet are insufficient to unambiguously represent 44 phonemic sounds. This is partly accounted for by the use of pairs of letters to represent additional sounds. Examples are 'sh', 'ee', and 'th'. There are also other spelling conventions which give clues to pronunciation. However, English is notoriously unphonetic and irregular in its rules4. The Latin alphabet is thus quite inadequate to unambiguously represent the sounds of all (English) words. Furthermore, there are many other phonemes which are found in languages other than English. The International Phonetic Alphabet (IPA) has been devised with the objective of containing one symbol to represent every phoneme in natural languages. There are 107 segmental letters in the IPA, and symbols had to be invented to represent that number. Many of the symbols resemble letters in conventional alphabets (e.g. a, b and c) – and variations thereon (such as upside-down letters ɐ, ə, ʌ). A guide to IPA symbol pronunciation can be found in Appendix A.

In addition to the segmental letters there are suprasegmentals. These can mark stress (e.g. ˈ to mark the primary stress). The symbol ː indicates lengthening of the previous letter sound, and syllable boundaries can also be marked. Stress is considered important in the use of spelling alphabets. For instance ICAO (2001) spells out the expected stress patterns (e.g. AL FAH, HO TELL, JEW LEE ETT). However, the stress markings on the IPA string cannot be considered to be part of the spelling, as such. For one thing they do not affect the sound of the pronunciation. Secondly, their placement cannot be fairly measured using edit distances, as described below.

It is thus possible to translate (the sounds of) any word into its (IPA) phonetic representation. Most of the IPA translations used in this research were provided by the website ToPhonetics.com.

2.4 Edit distances

A collection of characters (such as a word) can be referred to in the abstract as a string. It is possible to calculate the degree of difference between two strings, by measuring the edit distance. The edit distance between two strings is really a count of the number of single-character editing operations (insertions, deletions or substitutions) that would be required to transform one string into the other. Taking the earlier example, the edit distance between toffee and coffee is 1, since it takes just one substitution to make that transformation.

As described below, the basis of this work was measuring the edit distances between phonetic representations of spelling alphabet letters. In this case the measure is the Levenshtein Distance. (Figure 3). This measure was chosen because it works for strings of different lengths.

                    function LevenshteinDistance(char s[1..m], char t[1..n]):
                        // for all i and j, d[i,j] will hold the Levenshtein distance between
                        // the first i characters of s and the first j characters of t
                        declare int d[0..m, 0..n]

                        set each element in d to zero

                        // source prefixes can be transformed into empty string by
                        // dropping all characters
                        for i from 1 to m:
                            d[i, 0] := i

                        // target prefixes can be reached from empty source prefix
                        // by inserting every character
                        for j from 1 to n:
                            d[0, j] := j

                        for j from 1 to n:
                            for i from 1 to m:
                                if s[i] = t[j]:
                                    substitutionCost := 0
                                else:
                                    substitutionCost := 1

                                d[i, j] := minimum(d[i-1, j] + 1,                   // deletion
                                                   d[i, j-1] + 1,                   // insertion
                                                   d[i-1, j-1] + substitutionCost)  // substitution

                        return d[m, n]
Figure 3. Calculating the Levenshtein Distance. This is pseudo-code, not written in any implemented programming language, and is taken directly from Wikipedia.
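
For readers who want to experiment, the pseudo-code of Figure 3 can be sketched in Python roughly as follows. This is only an illustrative version and not the program used to produce the figures in this paper, so its output need not coincide with them; the function name `levenshtein` is an assumption made here.

```python
def levenshtein(s: str, t: str) -> int:
    """Return the Levenshtein distance between the strings s and t."""
    m, n = len(s), len(t)
    # d[i][j] holds the distance between the first i characters of s
    # and the first j characters of t.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i  # drop all i characters of the source prefix
    for j in range(1, n + 1):
        d[0][j] = j  # insert all j characters of the target prefix
    for j in range(1, n + 1):
        for i in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]


print(levenshtein("toffee", "coffee"))
```

Python strings are sequences of Unicode code points, so an IPA symbol such as ʧ counts as one character, whereas a diphthong written with two symbols (eɪ) counts as two.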

It would not be appropriate to include stress marks in the measurement of the edit distance because the stress marks on two different words may be widely separated (on the first syllable of one word, as is the default in English, but on the last syllable of another). Including the edit distance between these marks in the overall edit distance would give a distorted and artificially great overall edit distance.
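
One way to leave the stress marks out is to remove them from the IPA string before measuring. The sketch below is an assumption about how that might be done, not a description of the tool actually used; the names `STRESS_MARKS` and `strip_stress` are hypothetical, and treating the secondary stress mark (ˌ) in the same way as the primary one is also an assumption. The length mark ː is deliberately left in place.

```python
# Assumed set of marks to ignore: primary (U+02C8) and secondary (U+02CC) stress.
STRESS_MARKS = {"\u02c8", "\u02cc"}


def strip_stress(ipa: str) -> str:
    """Return the IPA string with stress marks removed."""
    return "".join(ch for ch in ipa if ch not in STRESS_MARKS)


# Xanthippe, written with a primary stress mark before the second syllable.
print(strip_stress("zæn\u02c8θɪpi"))
```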

A longer explanation of edit distances can be found in Appendix B.

2.5 Accents

The precise sound of any spoken word depends on the accent of the speaker. By extension the translation of an English spelling to an IPA representation will represent an assumption as to the accent of the 'speaker'. The website toPhonetics.com offers two accents: 'American' and 'British'. It is, of course, fallacious to suggest that there is just one American and one British accent5, but given that most of the development of the ICAO and subsequent Nato alphabets took place in the USA, the American accent has been used in most of these experiments.

However, given that the alphabets are meant to be international it is not only the accents of English-speaking Americans and Britons that are to be accommodated. ICAO (1959) refers to the principal non-English-language-speakers to be accommodated as French and Spanish. However, given that English is the official language of radiotelephony, the alphabet ought to be robust to speakers of practically every spoken language.

2.6 X-Words

One problem with English spelling words is that there are very few which start with the letter X. Furthermore, for most of those that do it is sounded as a Z. As we will see, many of the alphabets get around this by using the word X-ray.

Having filled in all this background, we can go on to explain this little set of experiments.

3. Method

The idea behind this study was to take words from spelling alphabets, translate them into phonetic representations, to measure the edit distances between each pair of letters and then to identify where the weaknesses were. Specifically, it was interesting to see whether this analysis would support any suggestion that the Nato alphabet is optimal. Given that the more recent alphabets (including the Nato one) have been developed on the basis of empirical experiments it is also interesting to evaluate the method: do the results of these analyses concur with the results of those experiments?

As suggested above, the online toPhonetics tool was used to convert English spellings of the spelling alphabet letters to IPA. Then the edit distances between each pair of letters were calculated6. Some aspects of the IPA transcription, notably stress marks (see Edit distances), were ignored.
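
The pairwise step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the program behind the figures, and its output is not guaranteed to coincide with them: the names `levenshtein` and `pairwise_distances` are hypothetical, and the three sample transcriptions are taken from the alphabet of Figure 7.

```python
from itertools import combinations


def levenshtein(s: str, t: str) -> int:
    """Plain Levenshtein distance, keeping one row of the table at a time."""
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def pairwise_distances(alphabet: dict) -> dict:
    """Map each unordered pair of letters to the distance between their IPA strings."""
    return {
        (a, b): levenshtein(alphabet[a], alphabet[b])
        for a, b in combinations(sorted(alphabet), 2)
    }


# A small sample: Golf, Nectar and Victor as transcribed in Figure 7.
sample = {"G": "gɑlf", "N": "nɛktər", "V": "vɪktər"}
distances = pairwise_distances(sample)
weakest = min(distances, key=distances.get)
print(distances)
print("closest pair:", weakest)
```

Sorting the letters before pairing them means that each pair is stored once, in alphabetical order, which corresponds to one triangle of the tables shown in the figures.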

3.1 Metrics

The edit distance indicates the degree of similarity between two strings: a short edit distance implies that the strings are more similar, so in this context a greater distance is a desirable property. For instance, if the edit distance is as low as 1, then confusing just one phoneme for another could cause the confusion of the two words.

It would be desirable to come up with a simple metric, perhaps a single number, by which different alphabets could be compared. The mean edit distance between each of the pairs of letters might seem to be such a number. However, this is not really practical because a mean tends to even out outliers. That is to say, for instance, that an alphabet with a large number of short edit distances (perhaps a lot of 1s) might also have a lot of long distances (e.g. 9s), with the result that its overall mean is quite reasonable. In other words, this would obscure the prevalence of short distances. The pairs with distances equal to 9 would be good, hard to confuse, but there might be many instances of the low-scoring pairs being confused.

It is appropriate, therefore, to concentrate on the low-scoring pairs. A simple rule-of-thumb would be to reject any alphabet in which any pairs have a distance of 1. One might also be cautious of any with large numbers of short distances (perhaps 2). The modal average for any letter will also give an indication of the robustness of that particular letter.
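
The point about means hiding weak pairs can be illustrated with a short sketch. The distances below are invented purely for illustration and do not come from any of the alphabets in this paper; the function names `summarise` and `modal_distance` are likewise hypothetical.

```python
from collections import Counter
from statistics import mean


def summarise(distances: dict) -> dict:
    """Summary figures for a dict mapping (letter, letter) pairs to distances."""
    values = list(distances.values())
    lowest, highest = min(values), max(values)
    return {
        "mean": mean(values),
        "shortest": lowest,
        "shortest_count": values.count(lowest),
        "largest": highest,
        "largest_count": values.count(highest),
    }


def modal_distance(distances: dict, letter: str) -> int:
    """Most common distance from one letter to the others (ties: the smallest)."""
    counts = Counter(d for pair, d in distances.items() if letter in pair)
    top = max(counts.values())
    return min(d for d, c in counts.items() if c == top)


# Hypothetical four-letter alphabet: half the pairs are dangerously close.
toy = {
    ("A", "B"): 1, ("A", "C"): 9, ("A", "D"): 9,
    ("B", "C"): 1, ("B", "D"): 9, ("C", "D"): 1,
}
print(summarise(toy))
print(modal_distance(toy, "A"))
```

In this invented case the mean looks respectable even though half of the pairs differ by a single edit, which is why the shortest distance and its number of occurrences are reported alongside the mean.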

4. The Experiments

As noted elsewhere, there has been a plethora of spelling alphabets. It was decided, therefore, to pick a small number of them that seem to have been historically significant, leading up to the adoption of the Nato alphabet, to see how they evolved.

4.1 ITU

As mentioned earlier, the first international alphabet was adopted by the International Telecommunication Union (ITU, see Figure 4). ICAO (1959) states (p.9) that, '[O]perating experience has indicated that the words were unsuitable because they were unusual in everyday language and because they lacked desirable phonetic qualities'. It does not state what those (lacking) phonetic qualities are. This is a shame, because the words are long and multisyllabic (the mean length of the Latin spelling is 7.58 and that of the IPA transcription is 7.77), and hence have a lot of redundancy, so that one might assume they are well distinguished, and this seems to be confirmed in Figure 4. The maximum distance is 10 and as many as 25 pairs have this score (marked in green in the Figure), implying pairs which are most unlikely to be confused. At the same time the minimum distance is as large as 5 and there are only three pairs that close.

Xanthippe was the wife of Socrates. It was probably a poor choice of word since few people would know how to pronounce it (zænˈθɪpi) and it starts with a z sound. (See X-Words).

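    A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
A   -  7  7  8  8 10  9  9  9  8  7  8  8  8  8  8  9  9  8  9  9  7  8  8  8  8  Amsterdam (æmstərdæm)
B   7  -  7  8  8 10  7  8  7  9  7  8  9  7  8  6  8  8  8  8  8  7  8  8  8  7  Baltimore (bɔltəmɔr)
C   7  7  -  6  6  9  9  6  6  8  6  7  7  6  7  7  5  7  7  7  8  8  8  6  8  8  Canada (kænədə)
D   8  8  6  -  6 10  8  7  7  9  7  7  9  7  6  8  7  7  8  7  8  7  8  7  9  8  Danemark (dɛnmɑrk)
E   8  8  6  6  -  9  9  6  6  9  8  8 10  7  8  8  6  7  7  7  8  8  8  8  9  8  Eddiston (ɛdisʌn)
F  10 10  9 10  9  - 10 10 10  9 10 10  8  9 10  9 10 10  8 10  7 10 10  8 10  9  Francisco (frænsɪskoʊ)
G   9  7  9  8  9 10  -  8  9  7  6  7 10  8  8  8  8  7  9  9  9  6  8  9  9  9  Gibraltar (ʤɪbrɔltər)
H   9  8  6  7  6 10  8  -  5  8  8  8 10  7  7  8  6  7  8  6  8  8  7  8  9  8  Hawaii (həwaɪi)
I   9  7  6  7  6 10  9  5  -  8  8  7 10  7  8  8  5  7  8  6  8  8  8  8  9  7  Italie (ɪtəli)
J   8  9  8  9  9  9  7  8  8  -  8  7  9  6  9  6  8  9  9  9  8  8  8  9  8  8  Jerusalem (ʤərusələm)
K   7  7  6  7  8 10  6  8  8  8  -  6  9  7  8  7  6  7  9  8  8  6  8  7  8  8  Kimberley (kɪmbərli)
L   8  8  7  7  8 10  7  8  7  7  6  -  9  8  7  7  8  6  9  8  8  7  8  8  9  8  Liverpool (lɪvərpul)
M   8  9  7  9 10  8 10 10 10  9  9  9  -  8 10  8 10 10  9 10  9 10 10  9  9  9  Madagascar (mædəgæskər)
N   8  7  6  7  7  9  8  7  7  6  7  8  8  -  8  7  7  7  8  7  6  7  8  8  8  7  Newcastle (nukæsəl)
O   8  8  7  6  8 10  8  7  8  9  8  7 10  8  -  8  7  7  7  7  8  6  8  8  9  8  Ontario (ɑntɛrioʊ)
P   8  6  7  8  8  9  8  8  8  6  7  7  8  7  8  -  8  8  9  8  7  8  7  8  8  8  Portugal (pɔrʧəgəl)
Q   9  8  5  7  6 10  8  6  5  8  6  8 10  7  7  8  -  7  9  7  8  8  7  8  8  8  Quebec (kwəbɛk)
R   9  8  7  7  7 10  7  7  7  9  7  6 10  7  7  8  7  -  8  6  8  6  8  8  7  7  Rivoli (rɪvoʊli)
S   8  8  7  8  7  8  9  8  8  9  9  9  9  8  7  9  9  8  -  8  9  8  9  7  9  9  Santiago (sæntiɑgoʊ)
T   9  8  7  7  7 10  9  6  6  9  8  8 10  7  7  8  7  6  8  -  8  7  8  8  6  8  Tokio (toʊkioʊ)
U   9  8  8  8  8  7  9  8  8  8  8  8  9  6  8  7  8  8  9  8  -  8  8  8  8  7  Uraguay (jurægwaɪ)
V   7  7  8  7  8 10  6  8  8  8  6  7 10  7  6  8  8  6  8  7  8  -  8  8  9  8  Victoria (vɪktɔriə)
W   8  8  8  8  8 10  8  7  8  8  8  8 10  8  8  7  7  8  9  8  8  8  -  8  9  8  Washington (wɑʃɪŋtən)
X   8  8  6  7  8  8  9  8  8  9  7  8  9  8  8  8  8  8  7  8  8  8  8  -  9  7  Xanthippe (zænˈθɪpi)
Y   8  8  8  9  9 10  9  9  9  8  8  9  9  8  9  8  8  7  9  6  8  9  9  9  -  9  Yokohama (joʊkəhɑmə)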

Figure 4 The ITU alphabet.

Overall mean = 8.32. Largest distance = 10 (25 occurrences). Shortest distance = 5 (3 occurrences)

Cells coloured green are those with the maximum distance in this table (10 in this case) and the red ones are those with the minimum distance in the table (5). The latter are the ones which might be cause for concern as the pairs most likely to be confused. In this case, though, a minimum distance of 5 is not really a concern; we will see many examples below of alphabets with shorter minima. It is notable, but not surprising, that the words are quite long.


4.2 RAF

It is evident that a lot of the early spelling alphabets were quite ad hoc, not devised with the scientific approach that we will see below went into the development of what became the Nato alphabet. ICAO (1959) lists as many as 250 alphabets. The Royal Air Force (RAF) used at least two, and they were different from those used by other British forces. The alphabet used by the Royal Air Force in 1921-42 was as in Figure 5.

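   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A  - 2 4 3 5 5 3 4 3 4 2 6 5 4 5 2 4 5 4 3 4 2 5 5 4 5  Ace (eɪs)
B  2 - 4 3 5 5 3 4 3 4 2 6 5 4 5 2 4 5 4 3 4 2 5 6 4 5  Beer (bɪə)
C  4 4 - 4 5 5 4 3 4 3 4 5 5 4 5 4 4 5 4 4 4 4 5 6 4 5  Charlie (ʧɑli)
D  3 3 4 - 5 5 3 4 3 2 3 5 5 4 4 3 4 4 4 2 4 3 6 6 4 5  Don (dɒn)
E  5 5 5 5 - 5 5 5 5 5 5 5 5 5 5 5 5 4 4 5 5 5 6 5 4 5  Edward (ɛdwəd)
F  5 5 5 5 5 - 5 4 5 5 5 5 4 5 4 5 5 5 5 5 5 5 6 6 5 5  Freddie (frɛdi)
G  3 3 4 3 5 5 - 4 3 3 3 6 5 4 5 3 4 5 4 3 4 3 6 6 3 5  George (ʤɔʤ)
H  4 4 3 4 5 4 4 - 4 3 4 6 5 4 4 4 4 5 4 4 4 4 6 6 4 5  Harry (hæri)
I  3 3 4 3 5 5 3 4 - 4 2 6 5 4 5 3 4 5 4 2 2 2 6 5 3 5  Ink (ɪŋk)
J  4 4 3 2 5 5 3 3 4 - 4 5 5 4 4 4 4 4 4 3 4 4 6 6 4 5  Johnnie (ʤɒni)
K  2 2 4 3 5 5 3 4 2 4 - 6 4 4 5 2 3 5 4 3 3 2 5 6 4 5  King (kɪŋ)
L  6 6 5 5 5 5 6 6 6 5 6 - 5 5 5 6 5 6 6 6 6 6 5 6 6 5  London (lʌndən)
M  5 5 5 5 5 4 5 5 5 5 4 5 - 4 5 5 5 5 5 5 3 5 6 6 5 5  Monkey (mʌŋki)
N  4 4 4 4 5 5 4 4 4 4 4 5 4 - 5 4 4 5 4 4 4 4 6 5 4 5  Nuts (nʌts)
O  5 5 5 4 5 4 5 4 5 4 5 5 5 5 - 5 4 5 5 4 5 5 5 6 5 5  Orange (ɒrɪnʤ)
P  2 2 4 3 5 5 3 4 3 4 2 6 5 4 5 - 4 5 4 3 4 2 5 6 4 5  Pip (pɪp)
Q  4 4 4 4 5 5 4 4 4 4 3 5 5 4 4 4 - 5 4 4 4 4 5 6 4 4  Queen (kwin)
R  5 5 5 4 4 5 5 5 5 4 5 6 5 5 5 5 5 - 4 4 5 5 6 6 4 4  Robert (rɒbət)
S  4 4 4 4 4 5 4 4 4 4 4 6 5 4 5 4 4 4 - 4 4 4 6 6 3 5  Sugar (ʃʊgə)
T  3 3 4 2 5 5 3 4 2 3 3 6 5 4 4 3 4 4 4 - 3 2 6 5 3 5  Toc (tɒk)
U  4 4 4 4 5 5 4 4 2 4 3 6 3 4 5 4 4 5 4 3 - 3 5 5 3 5  Uncle (ʌŋkl)
V  2 2 4 3 5 5 3 4 2 4 2 6 5 4 5 2 4 5 4 2 3 - 5 5 3 5  Vic (vɪk)
W  5 5 5 6 6 6 6 6 6 6 5 5 6 6 5 5 5 6 6 6 5 5 - 6 6 5  William (wɪljəm)
X  5 6 6 6 5 6 6 6 5 6 6 6 6 5 6 6 6 6 6 5 5 5 6 - 5 5  X-ray (ɛksreɪ)
Y  4 4 4 4 4 5 3 4 3 4 4 6 5 4 5 4 4 4 3 3 3 3 6 5 - 5  Yorker (jɔkə)
Z (column only): zibrə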

Figure 5. An RAF alphabet.

Overall mean = 4.97. Largest distance = 6 (42 occurrences). Shortest distance = 2 (17 occurrences)

Evidently this was quite a poor alphabet. There are no fewer than seventeen pairs with an edit distance of just 2.

It is notable that the words are short: the mean number of letters in the English spellings is 4.90 and their IPA spellings 4.12. Shorter words have lower redundancy so it is no surprise that there are a large number of potential clashes.

4.3 ICAO

According to ICAO (1959), during World War II there was some of the almost-inevitable nationalistic and political wrangling over the adoption of a standard international alphabet. '[T]here still remained several words on which neither the US nor the British side would yield. Therefore, the Generals and Admirals went down taking first a US and then a UK preference to complete the list and get on with the war.' (p. 10). With the end of the war, though, there was the realization of the need for a standard alphabet for use in aviation. The International Civil Aviation Organization (ICAO) was established in 1944 and in 1946 it agreed on an international alphabet, the Combined Services Alphabet. The report is inconsistent regarding dates, but it seems that the agreed alphabet was that shown in Figure 6.

The words in this alphabet are shorter than those of the ITU alphabet (mean IPA string length 4.42 versus 7.77) so there is less redundancy. Consequently the maximum distance of 6 is much less than the 10 for the ITU and the overall mean of 5.06 is much less than the 8.32 for the ITU alphabet. Hence, in terms of confusability, this alphabet is somewhat worse than the ITU one.

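   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A  - 4 5 5 5 5 5 5 3 4 4 5 4 2 4 4 5 4 4 5 3 5 5 6 5 4  Able (eɪbəl)
B  4 - 6 6 6 6 6 6 5 6 6 6 4 4 6 6 6 6 6 6 5 3 5 6 5 5  Baker (beɪkər)
C  5 6 - 5 5 4 4 5 5 5 5 5 5 5 5 5 5 4 5 4 5 6 5 6 5 5  Charlie (ʧɑrli)
D  5 6 5 - 3 4 3 3 5 2 3 3 4 5 5 5 4 5 4 3 5 6 6 6 4 5  Dog (dɔg)
E  5 6 5 3 - 4 4 3 5 3 3 3 4 5 5 4 3 5 5 3 5 6 6 6 4 3  Easy (izi)
F  5 6 4 4 4 - 4 4 5 4 4 4 4 4 5 5 4 4 5 4 4 5 6 4 4 5  Fox (fɑks)
G  5 6 4 3 4 4 - 4 5 3 4 4 4 5 5 5 4 4 5 3 5 6 6 6 4 5  George (ʤɔrʤ)
H  5 6 5 3 3 4 4 - 4 3 3 3 3 5 4 5 4 5 4 3 5 6 6 6 3 5  How (haʊ)
I  3 5 5 5 5 5 5 4 - 4 4 5 3 3 5 3 5 4 4 5 4 4 5 6 5 5  Item (aɪtəm)
J  4 6 5 2 3 4 3 3 4 - 2 3 4 4 5 5 4 5 4 3 5 5 5 6 4 5  Jig (ʤɪg)
K  4 6 5 3 3 4 4 3 4 2 - 3 4 4 5 5 3 5 5 3 4 5 5 6 4 5  King (kɪŋ)
L  5 6 5 3 3 4 4 3 5 3 3 - 4 5 5 5 4 5 5 3 4 5 6 6 4 5  Love (lʌv)
M  4 4 5 4 4 4 4 3 3 4 4 4 - 3 5 5 4 5 5 4 4 4 5 5 3 5  Mike (maɪk)
N  2 4 5 5 5 4 5 5 3 4 4 5 3 - 5 4 5 4 4 5 2 4 5 5 4 5  Nickel (nɪkəl)
O  4 6 5 5 5 5 5 4 5 5 5 5 5 5 - 5 5 5 4 5 5 6 6 6 3 4  Oboe (oʊboʊ)
P  4 6 5 5 4 5 5 5 3 5 5 5 5 4 5 - 4 3 3 5 4 5 6 5 5 4  Peter (pitər)
Q  5 6 5 4 3 4 4 4 5 4 3 4 4 5 5 4 - 5 5 4 5 6 5 6 4 4  Queen (kwin)
R  4 6 4 5 5 4 4 5 4 5 5 5 5 4 5 3 5 - 3 4 4 6 6 5 5 5  Roger (rɑʤər)
S  4 6 5 4 5 5 5 4 4 4 5 5 5 4 4 3 5 3 - 5 4 6 6 5 4 5  Sugar (ʃʊgər)
T  5 6 4 3 3 4 3 3 5 3 3 3 4 5 5 5 4 4 5 - 5 6 6 5 4 5  Tare (tɛr)
U  3 5 5 5 5 4 5 5 4 5 4 4 4 2 5 4 5 4 4 5 - 5 6 5 4 5  Uncle (ʌŋkəl)
V  5 3 6 6 6 5 6 6 4 5 5 5 4 4 6 5 6 6 6 6 5 - 4 6 5 5  Victor (vɪktər)
W  5 5 5 6 6 6 6 6 5 5 5 6 5 5 6 6 5 6 6 6 6 4 - 6 6 5  William (wɪljəm)
X  6 6 6 6 6 4 6 6 6 6 6 6 5 5 6 5 6 5 5 5 5 6 6 - 5 5  X-ray (ɛksreɪ)
Y  5 5 5 4 4 4 4 3 5 4 4 4 3 4 3 5 4 5 4 4 4 5 6 5 - 5  Yoke (joʊk)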

Figure 6. The ICAO alphabet adopted in 1946.

Overall mean = 5.06. Largest distance = 6 (53 occurrences). Shortest distance = 2 (4 occurrences)

Internationalization remained a problem. Spanish-speaking representatives stated that this alphabet was not suitable for Spanish speakers. A separate alphabet was thus agreed for this group. Unfortunately, due to the poor quality scan of ICAO (1959) the Spanish alphabet is illegible.

It is notable that some of the letter names are spelt in an unconventional way. Alfa is an example, and presumably this is to overcome any ambiguity in the pronunciation; would non-native English speakers understand the conventional pronunciation of 'ph' in 'Alpha'?6

In the next few years Prof Paul Vinay was commissioned to develop an alphabet, 'according to logical linguistic principles to be acceptable to international users' (ICAO, 1959, p.11). This resulted in the alphabet in Figure 7. The overall mean for this is 5.26 and the maximum is 8.

   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A  - 5 5 5 4 7 3 6 5 7 5 4 6 6 4 6 5 7 5 6 6 6 5 6 5 4  Alfa (ælfə)
B  5 - 4 3 5 8 5 5 4 7 5 3 5 4 5 6 6 7 4 6 5 4 5 5 5 5  Beta (beɪtə)
C  5 4 - 4 2 7 5 4 4 7 4 4 6 5 4 4 5 5 4 6 5 5 4 6 4 5  Coca (koʊkə)
D  5 3 4 - 5 8 4 5 4 6 4 4 4 3 5 6 6 7 4 6 5 4 5 5 5 4  Delta (dɛltə)
E  4 5 2 5 - 8 4 4 5 7 4 5 6 5 5 4 5 5 5 6 6 6 5 4 5 4  Echo (ɛkoʊ)
F  7 8 7 8 8 - 7 8 8 7 8 8 8 6 7 7 8 8 8 8 8 6 7 5 7 8  Foxtrot (fɑkstrɑt)
G  3 5 5 4 4 7 - 6 5 6 4 5 6 6 5 6 6 7 5 6 6 6 5 6 5 3  Golf (gɑlf)
H  6 5 4 5 4 8 6 - 6 6 5 6 6 5 6 4 5 5 6 6 6 5 6 5 6 6  Hotel (hoʊtɛl)
I  5 4 4 4 5 8 5 6 - 6 5 4 6 5 5 6 6 7 4 6 5 5 4 6 5 5  India (ɪndiə)
J  7 7 7 6 7 7 6 6 6 - 6 7 7 7 7 6 6 7 7 7 6 7 6 7 6 5  Julietta (ʤuliɛtə)
K  5 5 4 4 4 8 4 5 5 6 - 5 6 6 5 5 5 5 5 6 6 5 4 6 5 4  Kilo (kɪloʊ)
L  4 3 4 4 5 8 5 6 4 7 5 - 6 5 5 6 6 6 4 6 5 5 5 6 5 5  Lima (laɪmə)
M  6 5 6 4 6 8 6 6 6 7 6 6 - 5 5 6 6 6 5 4 6 6 6 5 6 6  Metro (mɛtroʊ)
N  6 4 5 3 5 6 6 5 5 7 6 5 5 - 5 6 6 7 5 6 5 2 5 4 5 6  Nectar (nɛktər)
O  4 5 4 5 5 7 5 6 5 7 5 5 5 5 - 6 5 7 4 6 6 5 4 5 5 5  Oscar (ɔskər)
P  6 6 4 6 4 7 6 4 6 6 5 6 6 6 6 - 6 5 6 6 6 6 6 6 6 6  Polka (poʊlkɑ)
Q  5 6 5 6 5 8 6 5 6 6 5 6 6 6 5 6 - 7 6 6 6 6 6 6 6 6  Quebec (kwəbɛk)
R  7 7 5 7 5 8 7 5 7 7 5 6 6 7 7 5 7 - 7 7 7 7 6 7 6 7  Romeo (roʊmioʊ)
S  5 4 4 4 5 8 5 6 4 7 5 4 5 5 4 6 6 7 - 6 5 5 5 6 5 5  Sierra (siɛrə)
T  6 6 6 6 6 8 6 6 6 7 6 6 4 6 6 6 6 7 6 - 6 6 6 6 4 6  Tango (tæŋgoʊ)
U  6 5 5 5 6 8 6 6 5 6 6 5 6 5 6 6 6 7 5 6 - 5 6 6 5 5  Union (junjən)
V  6 4 5 4 6 6 6 5 5 7 5 5 6 2 5 6 6 7 5 6 5 - 4 5 5 6  Victor (vɪktər)
W  5 5 4 5 5 7 5 6 4 6 4 5 6 5 4 6 6 6 5 6 6 4 - 5 3 5  Whiskey (wɪski)
X  6 5 6 5 4 5 6 5 6 7 6 6 5 4 5 6 6 7 6 6 6 5 5 - 6 6  eXtra (ɛkstrə)
Y  5 5 4 5 5 7 5 6 5 6 5 5 6 5 5 6 6 6 5 4 5 5 3 6 - 5  Yankey (jæŋki)
Z  4 5 5 4 4 8 3 6 5 5 4 5 6 6 5 6 6 7 5 6 5 6 5 6 5 -  Zulu (zulu)

Circles: 1 2 1 2 2 1 1 3 1 3 2
Score (%): 87.5 91.2 76.1 84.2 62.8 85.3 81.4 81.3 74.2 84.4 85.6 52.9 67.8 65.5 88.7 82.7 76.9 95.0 80.5 70.3 71.7 80.9 72.7 88.8 75.5
Confusion score: 90 75 182 145 340 190 54 179 174 142 ??? 117 167 178 214 123 119 54 100 60 247 157 142 361 80 176

Figure 7.  Edit distances for all of the pairs of letters in the ICAO 1949 alphabet.

Overall mean = 5.26. Largest distance = 8 (14 occurrences). Shortest distance = 2 (2 occurrences)

Also shown is data from Figure 2 of ICAO (1959), which recorded the number of times a letter was heard when a letter was spoken (against noise). In most instances it should be expected that when (say) Echo was spoken then Echo was heard, but (for instance) in fact in 32 cases the listener thought they had heard Hotel.

The paper also gives an 'articulation count', the percentage of times that the correct letter was heard. Note that due to the poor quality of the scan of the paper some of the numbers in this column are illegible and have been omitted here, and some are hard to read so the best guess as to their value has been given. That a letter should be correctly heard on only 66% or fewer of the occasions on which it was spoken would seem to be a matter for concern, so values lower than that are highlighted in red.

It also gives a 'confusability score', which is the number of times the word is heard when another is spoken.

In the paper a number of cells were circled as causing concern, if their value was above a chosen but arbitrary value, and the corresponding cells are grey in this diagram. A count of the number of such cells, 'Circles' is given here, because the larger this number the more fragile that letter is. Cells with the largest Circles values (3) are highlighted in red.

See Comparison with ICAO Experiments, below for more discussion of the experiments.

Given the motivation that the alphabet should be truly international, it made sense to see whether using the 'British' accent of toPhonetics made any difference. (Figure 8).

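   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A  - 3 5 5 4 7 3 5 5 8 4 3 6 5 3 5 6 6 5 6 6 5 5 6 5 4  Alfa (ælfə)
B  3 - 5 5 4 8 4 5 5 8 3 2 5 5 3 5 6 6 5 5 6 5 5 6 5 4  Beta (bitə)
C  5 5 - 4 2 7 5 4 4 8 4 5 5 4 4 3 5 5 5 5 5 4 4 6 4 5  Coca (kəʊkə)
D  5 5 4 - 5 8 4 5 4 7 4 5 4 2 5 3 6 7 5 5 5 3 5 5 5 4  Delta (dɛltə)
E  4 4 2 5 - 8 4 4 5 8 4 4 6 4 3 5 5 5 6 6 6 5 5 4 5 4  Echo (ɛkəʊ)
F  7 8 7 8 8 - 7 8 8 8 8 8 8 7 7 6 8 8 8 8 8 7 7 5 7 8  Foxtrot (fɒkstrɒt)
G  3 4 5 4 4 7 - 6 5 8 4 4 6 5 4 3 6 7 6 6 6 5 5 6 5 3  Golf (gɒlf)
H  5 5 4 5 4 8 6 - 6 7 5 5 6 5 5 6 5 5 6 6 6 5 6 5 6 6  Hotel (həʊtɛl)
I  5 5 4 4 5 8 5 6 - 7 5 5 5 4 5 4 5 7 5 5 5 4 5 6 5 5  India (ɪndɪə)
J  8 8 8 7 8 8 8 7 7 - 8 8 8 8 8 8 8 8 8 8 8 8 7 8 7 7  Julietta (dʒulietə)
K  4 3 4 4 4 8 4 5 5 8 - 3 6 5 4 4 5 5 5 6 6 5 5 6 5 4  Kilo (kiləʊ)
L  3 2 5 5 4 8 4 5 5 8 3 - 5 5 3 5 6 6 5 6 6 5 5 6 5 4  Lima (limə)
M  6 5 5 4 6 8 6 6 5 8 6 5 - 4 6 5 6 6 5 4 5 5 6 4 6 6  Metro (mɛtrəʊ)
N  5 5 4 2 4 7 5 5 4 8 5 5 4 - 4 4 6 7 5 5 5 2 5 4 5 5  Nectar (nɛktə)
O  3 3 4 5 3 7 4 5 5 8 4 3 6 4 - 4 5 6 5 6 6 4 4 5 5 4  Oscar (ɒskə)
P  5 5 3 3 5 6 3 6 4 8 4 5 5 4 4 - 6 7 5 5 5 4 4 6 4 4  Polka (pɒlkə)
Q  6 6 5 6 5 8 6 5 5 8 5 6 6 6 5 6 - 7 6 6 6 6 5 6 6 6  Quebec (kwɪbɛk)
R  6 6 5 7 5 8 7 5 7 8 5 6 6 7 6 7 7 - 6 7 7 7 7 6 7 7  Romeo (rəʊmɪəʊ)
S  5 5 5 5 6 8 6 6 5 8 5 5 5 5 5 5 6 6 - 6 6 4 5 4 6 6  Sierra (sɪeərə)
T  6 5 5 5 6 8 6 6 5 8 6 6 4 5 6 5 6 7 6 - 5 5 6 6 4 6  Tango (tæŋgəʊ)
U  6 6 5 5 6 8 6 6 5 8 6 6 5 5 6 5 6 7 6 5 - 5 6 6 5 5  Union (junjən)
V  5 5 4 3 5 7 5 5 4 8 5 5 5 2 4 4 6 7 4 5 5 - 4 5 5 5  Victor (vɪktə)
W  5 5 4 5 5 7 5 6 5 7 5 5 6 5 4 4 5 7 5 6 6 4 - 5 3 5  Whiskey (wɪski)
X  6 6 6 5 4 5 6 5 6 8 6 6 4 4 5 6 6 6 4 6 6 5 5 - 6 6  eXtra (ɛkstrə)
Y  5 5 4 5 5 7 5 6 5 7 5 5 6 5 5 4 6 7 6 4 5 5 3 6 - 5  Yankey (jæŋki)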

Figure 8. Edit distances of the ICAO alphabet (Figure 7) but when pronounced with a 'British' accent.

Overall mean = 5.22. Largest distance = 8 (33 occurrences). Shortest distance = 2 (4 occurrences)

There are no substantial differences from the 'American' version; the change of accent has not had a large effect. It is interesting, though, that Julietta shows a higher degree of robustness in the British accent; it has a maximum distance (8) from 19 other letters. The IPA representation is the same as for the American accent, so it is the pronunciation of the other letters which varies.
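The summary line given under each figure (overall mean, largest and shortest distance, each with its number of occurrences) can be sketched along the following lines. This is a minimal sketch, assuming the standard unit-cost Levenshtein distance over Unicode code points and counting each unordered pair once. The function names are illustrative, and since the exact tool behind the figures is not specified here, values computed this way need not match the figures exactly. The four British transcriptions are copied from Figure 8.

```python
from collections import Counter
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def summarise(ipa: dict) -> dict:
    """Mean, largest and shortest pairwise distance, with occurrence counts."""
    dists = [levenshtein(ipa[x], ipa[y]) for x, y in combinations(ipa, 2)]
    counts = Counter(dists)
    return {
        "mean": sum(dists) / len(dists),
        "largest": (max(dists), counts[max(dists)]),
        "shortest": (min(dists), counts[min(dists)]),
    }


# British transcriptions taken from Figure 8 (a small subset, for illustration only).
british = {"Delta": "dɛltə", "Golf": "gɒlf", "Nectar": "nɛktə", "Victor": "vɪktə"}
print(summarise(british))
```

Running the same function over the full set of 26 transcriptions would give the whole-alphabet summary; only the subset shown here has been checked by hand.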

Hereafter in this paper the American accent (or toPhonetics' idea thereof) is used in all the examples.

Comparison with ICAO Experiments

Extensive experiments were carried out to compare the ICAO 1946 alphabet with the ICAO alphabet, which concluded that the ICAO was superior. Nevertheless, efforts were expended to attempt to improve the ICAO alphabet. One of the objectives was to eliminate the 'confusable' words.

Figure 2 of ICAO (1959) shows how each letter word was spoken a number of times – with noise – and the letter that the listener thought they had heard was counted. There are two problems with this figure. One is that it is not stated how many times each letter was spoken. The second problem is the poor quality of the scan of the paper, so that some of the numbers in this matrix are illegible. In other words, were it not for the latter problem it would be possible to count the former number. An 'articulation score' is given, which is the percentage of times that the word was correctly identified, and a 'confusability score', which is the number of times the word is heard when another is spoken. The ideal is thus a word with a high articulation score and a low confusability score.
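The two scores described above can be derived from a spoken-by-heard count matrix, as the following sketch suggests. It assumes the matrix is held as a nested dictionary, `confusions[spoken][heard]`; the function names and all of the counts are invented for illustration and are not taken from ICAO (1959).

```python
def articulation_scores(confusions: dict) -> dict:
    """Percentage of presentations of each word that were heard correctly."""
    return {
        spoken: 100.0 * row.get(spoken, 0) / sum(row.values())
        for spoken, row in confusions.items()
    }


def confusability_scores(confusions: dict) -> dict:
    """Number of times each word was heard when a different word was spoken."""
    scores = {word: 0 for word in confusions}
    for spoken, row in confusions.items():
        for heard, count in row.items():
            if heard != spoken:
                scores[heard] = scores.get(heard, 0) + count
    return scores


# Hypothetical counts: confusions[spoken][heard]. These are NOT the ICAO data.
confusions = {
    "Echo": {"Echo": 70, "Hotel": 25, "Coca": 5},
    "Hotel": {"Hotel": 90, "Echo": 4, "Coca": 6},
    "Coca": {"Coca": 95, "Echo": 5},
}

print(articulation_scores(confusions))
print(confusability_scores(confusions))
```

With a complete matrix the row sums would also give the number of times each letter was spoken, which is the quantity that cannot be recovered from the illegible scan.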

Note that unlike the figures in this paper, this matrix is not symmetric. This is because it distinguishes between times when (for instance) the word Echo was spoken, but Hotel heard, and the number of times that Hotel was spoken and Echo heard. This contrasts with the phonetic scores used in this paper, where the edit distance between Echo and Hotel is the same as that between Hotel and Echo.
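The symmetry of the edit distance can be checked directly. The sketch below assumes the standard unit-cost Levenshtein distance, in which insertions and deletions cost the same, so swapping the two words leaves the total unchanged; the helper name `is_symmetric` is illustrative. The transcriptions are the American ones used in Figure 9.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def is_symmetric(words: list) -> bool:
    """True if d(x, y) == d(y, x) for every pair of the given transcriptions."""
    return all(levenshtein(x, y) == levenshtein(y, x)
               for x, y in combinations(words, 2))


# Echo, Hotel, Kilo, Lima (American transcriptions, as in Figure 9).
print(is_symmetric(["ɛkoʊ", "hoʊtɛl", "kɪloʊ", "laɪmə"]))
```

A listening experiment has no such guarantee, which is why the ICAO matrix needs both a spoken/heard cell and its heard/spoken counterpart.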

A number of cells in the table are circled to indicate clashes of concern, although the value chosen as a matter for concern is said to be 'arbitrary'. The corresponding cells in Figure 7 are highlighted in grey and the number of grey cells for each letter is given ('Circles'). Also transferred are the articulation score and the confusability score.

There are a few points to note about this experiment and these results. Firstly, it is probably worthwhile quoting what the paper says about the method.

Speakers representing the NATO nationalities spoke a number of lists of random three-letter code groups using the two alphabets to be investigated, and tape recordings were made of these lists. Two sets of lists were prepared, one for training purposes to acquaint the listeners with the words of each alphabet, noise interference, and foreign dialects of the speakers, while another set was made for the actual test condition. After approximately twelve hours of listening practice to ensure complete familiarity with both alphabets, the experimental subjects, foreign and American, listened to the test lists under three prescribed conditions of noise interference. Two types of noise generators were employed to introduce interference in the listening lines, and the speech level was attenuated to achieve progressively more difficult reception conditions. The resultant scores were compared and analyzed for differences.

There are a number of points to notice in this description of the method, often relating to omissions and unclear descriptions. Notably:

How many Nato nationalities were represented in the speakers?
Presumably native speakers of English (American and/or British?), French and Spanish. Equal numbers of each?
How many participants took part?
How many sets of letters did they hear in the study?
Twelve hours of 'practice' is mentioned, but was that followed by a similar duration for the second list, the experimental condition? How were those 12 hours distributed? Presumably there were rest periods – otherwise there would surely have been a fatigue effect.
Three different 'conditions of noise interference' are mentioned, as well as two types of noise generator – and attenuation of the speech level.
In other words, the conditions were (very) different in different cases. There were potentially 6 different noise conditions and an unstated number of levels of speech – plus the number of different accents of the speakers. Presumably this was intended to simulate the variety of difficult conditions that might be encountered in the field, but it does raise a lot of questions. Presumably the results under all of the conditions have been pooled. Given the above assumptions that may be reasonable, but it might have been informative to see separate analyses of the different conditions; not least, this might have given some justification of the choice of conditions used in the experiments. Did every participant hear examples under the same sets of conditions?
A 'number of lists of random three letter code groups'
What number (as above)? Although the lists were random, did each participant hear the same lists? Why three-letter groups? (Might there have been proximity effects, e.g. the word Alfa is clearer when heard after Foxtrot than after Beta?)
Minimal statistics are presented
This exacerbates some of the above criticisms. For instance, not knowing how many times letters were tested it is not possible to make comparisons. One comparison it was possible to make is to calculate the mean Articulation Score, but this does not appear in the paper. There is mention that, 'The performance is statistically significant at a confidence level of one-tenth of one per cent'. (op cit. p.12) However, no details are provided as to how this was measured and hence what 0.1% means.

Given that our results seem at variance with those in the paper, the above list might be seen as criticism along the lines of 'their method was poor, so ours was better', but that is not the intention. On the contrary, given the divergence in results, it would be good to know more about their method, so that we can see in what ways what ICAO was measuring differed from what we have been measuring, and hence see (or at least hypothesize) why the results are different.

Putting these questions aside, the pairs identified as problematic are shown in Table 1, along with their edit distances.

     Letter spoken   Letter heard   Letter spoken   Letter heard   Edit distance
 1   Echo            Hotel                                          4
 2   Foxtrot         Oscar                                          7
 3   Hotel           Coca                                           4
 4   Julietta        Union          Union           Julietta        6
 5   Julietta        Zulu           Zulu            Julietta        5
 6   Lima            Union          Union           Lima            5
 7   Metro           Echo                                           6
 8                   Nectar                                         5
 9                   eXtra                                          5
10   Nectar          Lima                                           5
11                   eXtra                                          4
12   Union           Zulu                                           5
13   Victor          eXtra                                          5
14   eXtra           Echo                                           4

Table 1.  Letter pairs which were circled in Figure 2 of ICAO (1959), along with their edit distances. Notice that some pairs are symmetrical (e.g. both Julietta/Union and Union/Julietta were circled). The letters which were subsequently replaced are highlighted in grey.

It was decided to replace Coca, Metro and eXtra with Charlie, Mike and X-ray. The latter three had shown 'above-average' performance in earlier tests. Furthermore, in the opinion of the project's foreign-language consultant they were 'phonemically adapted to NATO users' (ICAO, 1959, p.14).

The report says of the first three that they showed below-average performance and, 'in the opinion of the project's foreign language consultant that they were phonemically adapted to Nato users' (p.14). No explanation as to why the latter three were selected for substitution is given, but they were replaced, as in Table 1.

It is notable in this figure that only one of the letters with a short edit distance is in this list. That is to say that Coca has an edit distance of 2 with Echo. Otherwise none of the identified letters seems to have any problems of small edit distances. There are no cells in Figure 7 which ought to be both grey and red.

This would seem to suggest that the phonetic edit distance method proposed in this paper is useless, but let's carry on anyway.

It may also have been thought that eXtra was not suitable since it breaks the convention of starting with the indicated letter (i.e. 'E', not 'X').

Noting the apparent problems with these letters, the next alphabet tested was as in Figure 9. The substitutions were as in Table 2.

ICAO 1949    ICAO 1952

Table 2  Substitutions made in refining the ICAO alphabet.

The edit distances for the revised alphabet are shown in Figure 9. The first thing that is evident is that the minimum score is still 2, although the possible confusion between Coca and Echo has been eliminated by replacing Coca with Charlie. The replacement of Foxtrot by Football seems less successful, the modal score of the former being 7 and the latter 5. Conversely, the modal score for the new word Uniform is 7, rather better than Union (5). The modal scores for Zulu and Zebra are the same (4), although Zebra has a minimum score of 2, with Sierra.

AAlfaælfə 5443435464335434645 754544
BBravobrɑvoʊ5 455555565555555554755554
CCharlieʧɑrli44 44535454445435545754544
DDeltadɛltə454 4434453443445645644444
EEchoɛkoʊ3544 433463434434445 754344
FFootballfʊtbɔl45544 54565555554655755555
GGolfgɑlf353335 5463435425645754544
HHotelhoʊtɛl5554345 564554554455 745555
IIndiaɪndiə45444545 54444445645654544
JJuliettdʒuliet665566665 6666666566666666
KKilokɪloʊ4543353446 435444445743544
LLimalaɪmə35444545464 25445545644544
MMikemaɪk354435354632 4335645733444
NNectarnɛktər5553455446554 455645624445
OOscarɔskər45444545464434 44645643344
PPapapɑpə353435254644354 4645754544
QQuebeckwəbɛk4555445456455544 655755555
RRomeoroʊmioʊ65564664654566666 66766666
SSierrasiɛrə454445454644444456 5654442
TTangotæŋgoʊ5455555556555555565 755535
UUniformjunəfɔrm77767777667676677767 67766
VVictorvɪktər555455545644324556556 3545
WWhiskeywɪski4544454546343434564573 434
XX-rayɛksreɪ55543555565544355645754 54
YYankeejæŋki454445454644444456436435 4
score (%)
94.6 97.9 97.7 97.8 81.6 96.8 98.5 88.5 97. 95.1 95. 96. 92.7 90.7 86.9 96.6 92.4 96.6 99.4 86.2 97. 92.6 97.7 87

Figure 9 Edit distances for the revised ICAO alphabet.

Overall mean = 4.22. Largest distance = 7 (15 occurrences). Shortest distance = 2 (4 occurrences)

Again data is also shown from the ICAO Report (Figure 3). Notice that no letter has more than one greyed cell and none of the articulation scores is below 66.6, indeed the smallest is 81.6.

There were four occurrences of the minimum score (2):

Apparently the British researchers also noted problems due to the similarities between Nectar and Victor. The next – and final – iteration substituted Foxtrot, November and Zulu, as in Figure 10.

AAlfaælfə 6554536575446435756765655
BBravobrɑvoʊ6 566666676666666664866665
CCharlieʧɑrli55 55646555556546656864645
DDeltadɛltə565 5545464453556746745554
EEchoɛkoʊ4655 544574545545556865455
FFootballfʊtbɔl56655 64676666665766866666
GGolfgɑlf364446 6574546536756865655
HHotelhoʊtɛl6665446 675665665566856666
IIndiaɪndiə56545656 65455556746754654
JJuliettdʒuliet775677776 7777777677776767
KKilokɪloʊ5654464557 546555556854655
LLimalaɪmə46545656475 35556646755654
MMikemaɪk465546465743 5446756844545
NNectarnɛktər6663566557655 566756725555
OOscarɔskər46555656575545 45746754455
PPapapɑpə364546365755464 5756765655
QQuebeckwəbɛk5666556567566655 766866666
RRomeoroʊmioʊ76675775765677777 77876767
SSierrasiɛrə565456564754554567 6755552
TTangotæŋgoʊ6466666667666666676 866646
UUniformjunəfɔrm78878888778787778878 78877
VVictorvɪktər666466655755425667567 4655
WWhiskeywɪski5645565646454545665684 535
XX-rayɛksreɪ66654666676655466756865 65
YYankeejæŋki564556565655455566547536 5

Figure 10. Edit distances of the final ICAO/Nato alphabet.

Overall mean = 5.11. Largest distance = 8 (13 occurrences). Shortest distance = 2 (2 occurrences)

Note that the mean number of characters in the words is 5.27 for the English spelling and 5.54 for the IPA transcription.
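The mean IPA length quoted above can be checked with a few lines of code. This sketch assumes that length is counted in Unicode code points, so that the ligature ʧ counts as one character and the digraph dʒ as two; the transcriptions are copied from the row labels of Figure 11, and the names `NATO_IPA` and `mean_length` are illustrative.

```python
# IPA transcriptions of the Nato alphabet, copied from the row labels of Figure 11.
NATO_IPA = {
    "Alfa": "ælfə", "Bravo": "brɑvoʊ", "Charlie": "ʧɑrli", "Delta": "dɛltə",
    "Echo": "ɛkoʊ", "Foxtrot": "fɑkstrɑt", "Golf": "gɑlf", "Hotel": "hoʊtɛl",
    "India": "ɪndiə", "Juliett": "dʒuliet", "Kilo": "kɪloʊ", "Lima": "laɪmə",
    "Mike": "maɪk", "November": "noʊvɛmbər", "Oscar": "ɔskər", "Papa": "pɑpə",
    "Quebec": "kwəbɛk", "Romeo": "roʊmioʊ", "Sierra": "siɛrə", "Tango": "tæŋgoʊ",
    "Uniform": "junəfɔrm", "Victor": "vɪktər", "Whiskey": "wɪski",
    "X-ray": "ɛksreɪ", "Yankee": "jæŋki", "Zulu": "zulu",
}


def mean_length(words) -> float:
    """Mean number of code points per word."""
    words = list(words)
    return sum(len(w) for w in words) / len(words)


print(round(mean_length(NATO_IPA.values()), 2))
```

Counting code points in this way gives 144 characters over 26 words, which rounds to the 5.54 stated in the text.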

To what extent is this an improvement? The longest distance is now 8, and whereas Nectar was one of the weakest letters, its replacement, November, is the strongest. The shortest distance is still 2, and there are three instances of it (Golf and Papa, Golf and Zulu, Lima and Mike). The impression is of an attitude that this alphabet was 'good enough', and that it was not broken enough to require further fixing. Despite apparent reservations, Nato also adopted this alphabet in 1955.

Whereas we suggested above that we might as well give up, our results being so different from those in the ICAO report, a more realistic conclusion might be that there is little point in trying to compare our results with those in ICAO (1959). It is evident that the two methods were measuring different things – because they have yielded very different results. The problem is that, because of the paucity of detail in the ICAO paper, it is not clear what was being measured. Let us assume that the ICAO measures were valuable, and hence the Nato Alphabet is a good one, but that there is scope for supplementing such methods with the edit distance method presented herein, which might lead to even better alphabets. (See below.)

5. Numerals

Evidently numbers are different from letters. However, they may be spoken in the same context as letters, for instance in a callsign or car registration. It is therefore important that they should also be distinguished from each other as well as from the letters. There is no suggestion of using anything other than the normal number names, but there is a prescription as to how they should be pronounced. In particular, the number 9 should not be pronounced as 'nine' but rather as 'niner'. It is useful, therefore, to also test the phonetic representation of the numbers with the letters.

It is notable that many of the research papers on this topic, such as ICAO (1959), do not mention the numbers. Given the adoption by Nato and the results below, this may be a matter of concern.

Edit distances for the Nato alphabet and the numerals are given in Figure 11.

ælfəbrɑvoʊ ʧɑrli dɛltə ɛkoʊ fɑkstrɑt gɑlf hoʊtɛl ɪndiə dʒulietkɪloʊ laɪmə maɪk noʊv
ɔskər pɑpə kwəbɛk roʊmioʊ siɛrə tæŋgoʊ junə
vɪktər wɪski ɛksreɪ jæŋki zuluzɪroʊ wʌn tu tri foʊərfaɪfsɪkssɛvəneɪt naɪnər
AAlfaælfə 65547365754494357567656545444444446
BBravobrɑvoʊ 6 5667666766676666648666665665666666
CCharlieʧɑrli 55 557465555595466568646454555555556
DDeltadɛltə 565 58454644595567467455545555555455
EEchoɛkoʊ 4655 8445745475455568654545444344546
FFoxtrotfɑkstrɑt 77788 788888797788888677788888776887
GGolfgɑlf 364447 65745495367568656535444534546
HHotelhoʊtɛl 6665486 6756666655668566665666466666
IIndiaɪndiə 56545856 654585567467546555455554545
JJuliettdʒuliet775678776 77797776777767667777777777
KKilokɪloʊ 5654484557 5475555568546542555544546
LLimalaɪmə 46545856475 395566467556555555535553
MMikemaɪk 465547465743 94467568445444444523544
NNovembernoʊvɛmbər 9799799689799 9987978999997899799898
OOscarɔskər 46555756575549 457467544555555354456
PPapapɑpə 364547365755494 57567656545444444446
QQuebeckwəbɛk 5666586567566855 7668666666666565566
RRomeoroʊmioʊ 76675875765677777 778767674776577777
SSierrasiɛrə 565458564754594567 67555555554554355
TTangotæŋgoʊ 6466686667666766676 8666466655666656
UUniformjunəfɔrm 78878888778788778878 788778778788786
VVictorvɪktər 666466655755495667567 46565666654654
WWhiskeywɪski 5645575646454945665684 5354455543546
XX-rayɛksreɪ 66654766676659466756865 666666564566
YYankeejæŋki 564557565655495566547536 55555555556
ZZuluzulu4654483656454954675676565 4434544546
0Zerozɪroʊ 55455855572547556456854654 555544546
1Onewʌn 465548464755485467567646545 33544535
2Twotu 4655484657554954675576565353 2544536
3Threetri 45554846575549546645865654532 544536
4Fourfoʊər465537545755573455567655555555 45456
5Fivefaɪf4655473657432954675685465444444 4544
6Sixsɪks46554646474539445746843454444454 436
7Sevensɛvən465458565755584457367655555555454 55
8Eighteɪt 4655484647454954675585465443335435 6

Figure 11. Edit distances for the Nato alphabet including the numerals.

Overall mean = 5.29. Largest distance = 9 (21 occurrences). Shortest distance = 2 (3 occurrences)

The minimum distance is again 2. Nato (1955) notes a potential problem with figures, that Zero and Sierra sound quite similar8. This is not evident from the above analysis, where the distance between the spellings is 5. The memo suggests the solution to this is the use of the 'proword' 'Figures' before the use of numerals. The apparent similarity of the numbers Two and Three, though, might be cause for concern. (Note that the suggested pronunciation of the latter is as Tree). Clearly to confuse two numbers could have catastrophic consequences, and a 'Figures' prefix would not prevent this.

Including the digits introduces more pairs with a distance of just 2, as in Table 3 and Figure 12, below.

    Words (with pronunciation for digits)
1   2 (Tu)       3 (Tree)
2   5 (Fife)     Mike
3   0 (Zeero)    Kilo

Table 3  Shortest distances for the Nato alphabet including digits. There are three instances of the (undesirable) minimum distance of 2 (compared with two instances when digits are not included).
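A search for the closest pairs in a mixed set of letters and digits might be sketched as follows. It assumes the standard unit-cost Levenshtein distance over code points and uses only a small subset of the transcriptions in Figure 11 (the three pairs of Table 3 plus Sierra, which the Nato memo flags against Zero); the function name `closest_pairs` is illustrative.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def closest_pairs(ipa: dict):
    """Return the minimum pairwise distance and every pair that attains it."""
    dists = {(x, y): levenshtein(ipa[x], ipa[y]) for x, y in combinations(ipa, 2)}
    shortest = min(dists.values())
    return shortest, [pair for pair, d in dists.items() if d == shortest]


# Subset of Figure 11: three letters and four digits (prescribed pronunciations).
subset = {
    "Kilo": "kɪloʊ", "Mike": "maɪk", "Sierra": "siɛrə",
    "Zero": "zɪroʊ", "Two": "tu", "Three": "tri", "Five": "faɪf",
}

shortest, pairs = closest_pairs(subset)
print(shortest, pairs)
```

On this subset the three pairs of Table 3 come out as the closest, while Sierra and Zero do not, in line with the observation that their similarity is not evident from the edit distance.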

Given that there are so many potential confusions with the digits, it is worthwhile to look at the digits alone, and that is what we see in Figure 12. It is evident that the -er suffix on Nine is effective (mode=5, the maximum distance). A cause for concern, though, is the short distance of 1 between Two and Three ('Tree'). It is interesting to see whether the suggested pronunciations are effective, so Figure 13 shows the distances for the digits with their normal pronunciation (at least as 'pronounced' by toPhonetics.com). Nine is then less distinct: its mode=4, and there is a distance of just 2 between it and Five. The official pronunciations would appear to be effective.

Letter 0 123456789
0Zerozɪroʊ 555544546

Figure 12  Distances for the digits, using the Nato recommended pronunciation. The addition of the -er suffix on 9 is evidently effective, but the distance of just 1 between Two and Three ('Tree') is a cause for concern.

Overall mean = 4.47. Largest distance = 5 (6 occurrences). Shortest distance = 1 (1 occurrence)

AZerozɪroʊ 555444544
BOnewʌn5 33344534
CTwotu53 3344534
DThreeθri533 344534
EFourfɔr4333 34534
FFivefaɪv44443 4442
GSixsɪks444444 434
HSevensɛvən5555544 55
IEighteɪt43333435 4

Figure 13  Edit distances for the digits, using their conventional pronunciation.

Overall mean = 4.13. Largest distance = 5 (10 occurrences). Shortest distance = 2 (1 occurrence)

The numbers seem to represent something of a problem. As argued above, it is most important that they should not be confused – certainly with each other, but also not with the letters. Unlike the letters, though, it does not seem practical to use different (but more phonetically distinct) words for the numbers.

6. A better alphabet?

As noted earlier, it seems as if Nato felt that their alphabet was 'good enough', perhaps an attitude of, 'If it ain't broke don't fix it'. Yet it is apparent that there could be scope for improvement; a better alphabet might be devised. Let us say it here, though: it is most unlikely that the Nato alphabet will ever be replaced, due to inertia. We can make an analogy with the qwerty keyboard layout. The origins of the layout are disputed (Kay, 2013) and it is also controversial as to how efficient it is, but there is some evidence that it is less efficient than alternative layouts, such as the Dvorak layout (e.g. Neill, 1980). However, there is general agreement that qwerty is unlikely to ever be superseded. This is because people know the layout; they have invested in learning it. To replace the layout with any other, even one that has been demonstrated – with practice – to be much more efficient, is simply not going to happen, because of that need to practise and adapt.

So it is with the Nato alphabet. If a much better, more robust alphabet were devised, operators would have to unlearn the old one. Can you imagine a pilot trying to remember whether it is November or Nectar? Think of the potential confusion during any transition from the Nato alphabet to some new one.

Having said that, there is no reason not to play with alternatives here. That is surely the main advantage of the method advocated herein. There is no cost in trying out new alphabets – in marked contrast to the cost and commitment required for empirical testing.

It is evident in Figure 11 that the most serious potential confusions are between digits, two and three and between Mike and five. As concluded above, though, it seems unlikely that any other words could be used, so in this experiment (following the example of much of the literature) we will not include the numbers.

Evidently (Figure 10) the weakest letters are Golf and Mike, which both have shortest distances of 2. We start by finding potential replacements for them, bearing in mind the desirable characteristics listed above.

6.1 G

Possible candidates investigated were

Words with a distance of 2 from any of the other words were rejected, namely those following, along with the words to which they were too close

The distances for the remaining words are shown in Figure 14. As noted earlier, the easiest way to create a larger difference is to choose longer words, with more redundancy, so it is no surprise that Galaxy should show good distinctiveness. Gland shows the next best figures. However, with regard to the semantics of the word, some people might be uneasy: they might think that such a biological word is inappropriate, or that it might have obscene connotations.

                           gæləksi   goʊt   glænd
A   Alfa       ælfə        5         4      4
B   Bravo      brɑvoʊ      7         6      6
C   Charlie    ʧɑrli       6         5      5
D   Delta      dɛltə       5         4      5
E   Echo       ɛkoʊ        7         3      5
F   Foxtrot    fɑkstrɑt    7         8      8
H   Hotel      hoʊtɛl      7         3      6
I   India      ɪndiə       6         5      5
J   Juliett    dʒuliet     7         7      7
K   Kilo       kɪloʊ       6         5      5
L   Lima       laɪmə       6         5      5
M   Mike       maɪk        7         4      5
N   November   noʊvɛmbər   9         7      8
O   Oscar      ɔskər       6         5      5
P   Papa       pɑpə        6         4      5
Q   Quebec     kwəbɛk      6         6      6
R   Romeo      roʊmioʊ     7         5      7
S   Sierra     siɛrə       6         5      5
T   Tango      tæŋgoʊ      6         6      5
U   Uniform    junəfɔrm    7         8      7
V   Victor     vɪktər      6         5      6
W   Whiskey    wɪski       7         5      5
X   X-ray      ɛksreɪ      7         6      6
Y   Yankee     jæŋki       6         5      4
Z   Zulu       zulu        6         4      3
Mode                       6         4      5

Figure 14  Distances for the candidate alternative G-words.
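The screening step used for the candidate words (reject any candidate that comes too close to an existing word, then compare the rest) might be sketched as below. It assumes the standard unit-cost Levenshtein distance and a rejection threshold expressed as a minimum allowed distance; the names `screen` and `min_allowed` are illustrative, and the existing words are only a four-word subset of Figure 14, so the output is not a reproduction of that figure.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def screen(candidates: dict, alphabet: dict, min_allowed: int = 3) -> dict:
    """For each candidate, report its closest existing words and whether it passes."""
    report = {}
    for name, ipa in candidates.items():
        dists = {word: levenshtein(ipa, other) for word, other in alphabet.items()}
        shortest = min(dists.values())
        report[name] = {
            "min": shortest,
            "closest": [w for w, d in dists.items() if d == shortest],
            "mode": Counter(dists.values()).most_common(1)[0][0],
            "accepted": shortest >= min_allowed,
        }
    return report


# Existing words: a subset of the Nato alphabet, transcriptions as in Figure 14.
existing = {"Alfa": "ælfə", "Echo": "ɛkoʊ", "Mike": "maɪk", "Papa": "pɑpə"}
candidates = {"Galaxy": "gæləksi", "Goat": "goʊt", "Gland": "glænd"}

for name, row in screen(candidates, existing).items():
    print(name, row)
```

Raising `min_allowed` makes the screen stricter; with a value of 4, for instance, Goat would be rejected on this subset because of its distance of 3 from Echo.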

Considering internationalization, French and Spanish translations are given in Table 4.

English   French    Spanish
Galaxy    galaxie   galaxia
Goat      chèvre    cabra
Gland     glande    glándula

Table 4  Candidate G-words in the three principal languages.

The fact that the French and Spanish words for Galaxy are quite similar, and that it would not have negative connotations in those languages, suggests that it is the best candidate.

6.2 M

Candidate M-words were:

Mean 5.52 7.60 5.16 5.60
Mode 5 8 4 6

Figure 15  Edit distances for the candidate alternative M-words.

English   French    Spanish
Mercury   mercure   mercurio
Mole      taupe     topo
Moon      lune      luna
Mud       boue      lodo

Table 5.  Translations of the candidate M-words. Note that although the French translation of the word for a mole animal is taupe, there is a word le môle, referring to a mole which is a breakwater.

Once again the longest suggestion, Mercury, shows the best distinguishability and is likely to be more familiar to non-native English speakers, but as the longest it might be thought to slow down communication.

What happens if we make both of these substitutions? Does another area of the child's ball pop out? Not according to Figure 16. Note that there are suddenly many more red cells in this table, marking more instances of the minimum edit distance, but that distance is 3, not 2 as in Figure 10.

ælfəbrɑvoʊ ʧɑrli dɛltə ɛkoʊ fɑkstrɑt gæləksihoʊtɛl ɪndiə dʒulietkɪloʊ laɪmə mɜrkjərinoʊv
ɔskər pɑpə kwəbɛk roʊm
siɛrə tæŋgoʊjunə
vɪktər wɪski ɛksreɪ jæŋki zulu
AAlfaælfə 6554756575489435756765654
BBravobrɑvoʊ 6 566776676687666664866666
CCharlieʧɑrli 55 55766555579546656864645
DDeltadɛltə 565 5855464489556746745554
EEchoɛkoʊ 4655 874574587545556865454
FFoxtrotfɑkstrɑt 77788 78888879778888867778
GGalaxygæləksi576577 7676679666766767766
HHotelhoʊtɛl 6665487 675686665566856666
IIndiaɪndiə 56545866 65488556746754655
JJuliettdʒuliet775678776 7789777677776766
KKilokɪloʊ 5654486557 587555556854654
LLimalaɪmə 46545866475 79556646755655
MMercurymɜrkjəri887887788887 9787878777778
NNovembernoʊvɛmbər 9799799689799 998797899999
OOscarɔskər 46555766575579 45746754455
PPapapɑpə 364547665755894 5756765654
QQuebeckwəbɛk 5666586567567855 766866666
RRomeoroʊmioʊ 76675875765687777 77876767
SSierrasiɛrə 565458664754794567 6755555
TTangotæŋgoʊ 6466686667668766676 866646
UUniformjunəfɔrm 78878878778778778878 78877
VVictorvɪktər 666466655755795667567 4656
WWhiskeywɪski 5645577646457945665684 535
XX-rayɛksreɪ 66654776676679466756865 66
YYankeejæŋki 564557665655795566547536 5

Figure 16  Edit distances for the suggested revised alphabet.

Overall mean = 5.37. Largest distance = 9 (16 occurrences). Shortest distance = 3 (2 occurrences)

Although we said we would not worry about the digits in this section, what happens if we do include them? As we see in Figure 17, they do not clash with the new letters. There are two pairs with an edit distance of 2: Kilo and Zero (as in Table 3), and the two numbers 2 and 3, about which, as discussed above, there does not seem to be much that we can do.

ælfəbrɑvoʊ ʧɑrli dɛltə ɛkoʊ fɑkstrɑt gæləksi hoʊtɛl ɪndiə dʒulietkɪloʊ laɪmə mɜrkjəri noʊv
ɔskər pɑpə kwəbɛk roʊm
siɛrə tæŋgoʊ junə
vɪktər wɪski ɛksreɪ jæŋki zuluzɪroʊ wʌn tu tri foʊərfaɪfsɪkssɛvəneɪt naɪnər
AAlfaælfə 65547565754894357567656545444444446
BBravobrɑvoʊ 6 5667766766876666648666665665666666
CCharlieʧɑrli 55 557665555795466568646454555555556
DDeltadɛltə 565 58554644895567467455545555555455
EEchoɛkoʊ 4655 8745745875455568654545444344546
FFoxtrotfɑkstrɑt 77788 788888797788888677788888776887
GGalaxygæləksi576577 76766796667667677667777677676
HHotelhoʊtɛl 6665487 6756866655668566665666466666
IIndiaɪndiə 56545866 654885567467546555455554545
JJuliettdʒuliet775678776 77897776777767667777777777
KKilokɪloʊ 5654486557 5875555568546542555544546
LLimalaɪmə 46545866475 795566467556555555535553
MMercurymɜrkjəri887887788887 97878787777787888788888
NNovembernoʊvɛmbər 9799799689799 9987978999997899799898
OOscarɔskər 46555766575579 457467544555555354456
PPapapɑpə 364547665755894 57567656545444444446
QQuebeckwəbɛk 5666586567567855 7668666666666565566
RRomeoroʊmioʊ 76675875765687777 778767674776577777
SSierrasiɛrə 565458664754794567 67555555554554355
TTangotæŋgoʊ 6466686667668766676 8666466655666656
UUniformjunəfɔrm 78878878778778778878 788778778788786
VVictorvɪktər 666466655755795667567 46565666654654
WWhiskeywɪski 5645577646457945665684 5354455543546
XX-rayɛksreɪ 66654776676679466756865 666666564566
YYankeejæŋki 564557665655795566547536 55555555556
ZZuluzulu4654486656458954675676565 4434544546
0Zerozɪroʊ 55455875572577556456854654 555544546
1Onewʌn 465548764755885467567646545 33544535
2Twotu 4655487657558954675576565353 2544536
3Threetri 45554876575589546645865654532 544536
4Fourfoʊər465537645755773455567655555555 45456
5Fivefaɪf4655477657438954675685465444444 4544
6Sixsɪks46554676474589445746843454444454 436
7Sevensɛvən465458665755884457367655555555454 55
8Eighteɪt 4655487647458954675585465443335435 6

Figure 17.  Edit distances for the proposed alphabet, including the numbers.

Overall mean = 5.29. Largest distance = 9 (21 occurrences). Shortest distance = 2 (2 occurrences)

We have devised an alternative to the Nato alphabet that appears to be better, in that confusion between pairs of letters is less likely. For any practical implementation it would be appropriate to undertake empirical experiments to test reception in noise, as was done in the development of the ICAO alphabets. However, it is important to repeat that it is most unlikely that the Nato alphabet will be replaced, even with a phonetically superior alternative.

7. Is it worth the bother?

A fair question to ask – and to address with the edit distance method – is whether it is worth the bother. Are any of these alphabets better than just saying the letters in the normal way? Figure 18 tests this, showing the edit distances between the normal pronunciations of the letters, Ay, Bee, Sea, etc9.

bisidiiɛfʤieɪtʃʤeɪkeɪ ɛlɛmɛnpikjuɑrɛstijuvi dʌb
AA 2222222133222223222227332
BBbi2 112214233222213221217331
CCsi21 12214233222213221217331
DDdi211 2214233222213221216331
EEi2222 224233222223222227332
FFɛf22222 24233111223212227232
GGʤi211122 4223222213221217331
HHeɪtʃ2444444 322444444444447434
II12222223 33222223222227332
JJʤeɪ333333223 1333333333337323
KKkeɪ3333333231 333332333337323
LLɛl22222124233 11223212227232
MMɛm222221242331 1223212227232
NNɛn2222212423311 223212227232
OO22222224233222 23222227332
PPpi211122142332222 3221217331
QQkju3333333433233333 333337333
RRɑr22222224233222223 22227332
SSɛs222221242331112232 2227232
TTti2111221423322221322 217331
UUju22222224233222223222 27332
VVvi211122142332222132212 7331
WWdʌbəlju7776777777777777777777 777
XXɛks33333234333222333323337 33
YYwaɪ333333333223333333333373 3

Figure 18. Edit distances for the conventional pronunciation of letters.

Overall mean = 3.55. Largest distance = 7 (24 occurrences). Shortest distance = 1 (40 occurrences)
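The large number of minimum-distance pairs is easy to see from the letter names themselves: many differ only in a single consonant. The sketch below counts the pairs at distance 1 for two such families, using the transcriptions from Figure 18 and assuming the standard unit-cost Levenshtein distance; the helper name `pairs_at` is illustrative, and only 12 of the 26 letters are included.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def pairs_at(ipa: dict, distance: int) -> list:
    """All unordered pairs of letters whose names are exactly `distance` apart."""
    return [(x, y) for x, y in combinations(ipa, 2)
            if levenshtein(ipa[x], ipa[y]) == distance]


# Conventional letter names from Figure 18: the "-ee" family and the "e-" family.
letter_names = {
    "B": "bi", "C": "si", "D": "di", "P": "pi", "T": "ti", "V": "vi",
    "G": "\u02a4i",  # ʤi, with the ligature ʤ as a single code point
    "F": "ɛf", "L": "ɛl", "M": "ɛm", "N": "ɛn", "S": "ɛs",
}

print(len(pairs_at(letter_names, 1)))
```

Every pair within a family is at distance 1 (21 pairs among the seven "-ee" names and 10 among the five "e-" names), which already accounts for most of the occurrences reported for the whole alphabet.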

According to this analysis, the conventional pronunciation is much worse. There are 40 pairs with an edit distance of just 1 – which would be easily confused. The only letter with a large average distance is W (dʌbəlju), which is not a surprise as it is a longer, multi-syllable word.

8. Summary and Discussion

Statistics for all the above alphabets are summarized in Table 6. Let us assume that any alphabet with any occurrence of edit distances of 1 should be eliminated; that excludes RAF, ICAO 1946 and ICAO 1949 (as well as the conventional pronunciations). Of the remainder it is ITU which seems best:

It is noticeable that the mean length of its IPA spellings is also the greatest, and perhaps this is all that is needed – the greatest number of phonemes and hence the greatest redundancy. Recall, though, that this alphabet was rejected by the ICAO because 'the words were unsuitable because they were unusual in everyday language and because they lacked desirable phonetic qualities' (ICAO, 1959, p.9). Given that the words are all the names of cities and countries it is not clear why they were considered 'unusual'. It might be that the names of cities in different countries are pronounced differently in different languages. For instance, an Italian would refer to their home country as 'Italia' and would find it difficult to revert to the English pronunciation. It is not, however, clear what 'phonetic qualities' are missing and in fact our analysis suggests the contrary.

  ITU ICAO1956/
ICAO1959 RAF ICAO 1949 ICAO 1946 Conven
Mean distance                           8.32      5.11     5.37     4.22     4.97     5.26     5.06     3.55
Largest distance (No of occurrences)    10 (25)   8 (13)   9 (16)   7 (15)   6 (42)   8 (14)   6 (53)   7 (24)
Shortest distance (No of occurrences)   5 (3)     2 (2)    3 (2)    2 (4)    2 (17)   2 (2)    2 (4)    1 (40)
Mean IPA length                         7.77      5.54     5.81     5.38     4.12     5.35     4.42     2.38

Table 6.  Summary data for the main alphabets investigated.

These results would suggest that the ICAO/Nato alphabet is a good one. They would suggest that Nato/Galaxy/Mercury is a better one, but then, as argued previously, it is never likely to be adopted.

9. Discussion and Conclusions

The history of the development and evolution of the Nato spelling alphabet, driven by the need for unambiguous verbal communication, is a rich one, which has only been sketched herein.

The main motivation behind this paper was to see whether measuring edit distances of phonetic spellings was a way of assessing the suitability of different spelling alphabets. On that level the results are mixed. The ICAO/Nato alphabet was derived on the basis of experiments into the auditory distinctiveness of its words, but the results thereof do not agree very closely with this analysis (see in particular Figure 9). So, it might be concluded that the method described herein is of little value. On the other hand, perhaps it would be a useful quick-and-dirty method for pre-testing potential alphabets. It is certainly much cheaper in time and money than making recordings (with different levels of noise) and testing them with a large number of participants.

If anyone were to attempt to devise another alphabet it would make sense to employ both methods. That is to say, a preliminary, low-cost evaluation could be carried out using edit distance, followed by further empirical tests in which participants listen to samples in noise. If this were to be done, though, the method ought to be rigorous, clearly documented and include an appropriate statistical analysis.

It appears that a very simple rule-of-thumb for selecting an alphabet is just to measure the lengths of the words.

At the same time, it seems unlikely that anyone will ever need or want to devise a new international spelling alphabet. If one were needed this method might be used in its development. Having devised a candidate it would presumably be necessary to carry out empirical tests.

One problem of devising an international alphabet is that it will be used by speakers of different native languages, who will inevitably speak with an accent. An advantage of the edit distance method is that it could be used to test different accents. Although toPhonetics.com only generates two accents, American and British, a skilled phonetician could hear the words spoken by speakers with different accents and transcribe them into IPA. Then the edit distances could be measured.


Appendix A: The International Phonetic Alphabet (IPA)

The IPA consists of symbols that can represent each sound (phoneme) in spoken languages. Not every language includes every possible phoneme, so in the examples in this paper only a subset of the IPA symbols is used. The figure below (borrowed from dictionary.com) is a guide to the pronunciation of the IPA symbols used.

Consonants                              Vowels
/b/   boy, baby, rob                    /æ/     apple, can, hat
/d/   do, ladder, bed                   /eɪ/    aid, hate, day
/dʒ/  jump, budget, age                 /ɑ/     arm, father, aha
/f/   food, offer, safe                 /ɛər/   air, careful, wear
/g/   get, bigger, dog                  /ɔ/     all, or, talk, lost, saw
/h/   happy, ahead                      /aʊər/  hour
/k/   can, speaker, stick
/l/   let, follow, still                /ɛ/     ever, head, get
/m/   make, summer, time                /i/     eat, see, need
/n/   no, dinner, thin                  /ɪər/   ear, hero, beer
/ŋ/   singer, think, long               /ər/    teacher, afterward, murderer
/p/   put, apple, cup                   /ɜr/    early, bird, stirring
/r/   run, marry, far, store
/s/   sit, city, passing, face          /ɪ/     it, big, finishes
/ʃ/   she, station, push                /aɪ/    I, ice, hide, deny
/t/   top, better, cat                  /aɪər/  fire, tired
/tʃ/  church, watching, nature, witch
/θ/   thirsty, nothing, math            /ɒ/     odd, hot, waffle
/ð/   this, mother, breathe             /oʊ/    owe, road, below
/v/   very, seven, love                 /u/     ooze, food, soup, sue
/w/   wear, away                        /ʊ/     good, book, put
/ʰw/  where, somewhat
/y/   yes, onion                        /ɔɪ/    oil, choice, toy
/z/   zoo, easy, buzz                   /aʊ/    out, loud, how
/ʒ/   measure, television, beige
                                        /ʌ/     up, mother, mud
                                        /ə/     about, animal, problem, circus

Figure A. A guide to IPA pronunciation.

Appendix B: Edit distances

As mentioned above, the edit distance is essentially the number of edit operations required to transform one string into another. In the context of this paper, the greater the edit distance the better: the more dissimilar the two words, the greater their phonetic difference and the lower the likelihood that one could be confused with the other when spoken.

An edit operation is either

  • deletion
  • insertion, or
  • substitution


coffee & toffee

Edit distance = 1
The example in the paper was coffee and toffee, which have an edit distance of 1: the substitution of c with t.

kitten & sitting

Edit distance = 3
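This is calculated as follows:

kitten  sitting  Operation     Cost  Total  New 'word'
k       s        substitution  1     1      sitten
i       i        no operation  0     1      sitten
t       t        no operation  0     1      sitten
t       t        no operation  0     1      sitten
e       i        substitution  1     2      sittin
n       n        no operation  0     2      sittin
        g        insertion     1     3      sitting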

This is reminiscent of the word-ladder game, in which you transform one word into another by making substitutions, although the added constraint in the game is that the intermediate steps must be recognized words.


Figure B.  Steps in a word-ladder game. The edit distance between BOOT and SHOE is 3, but all the operations have to be substitutions. Recall there is the added constraint in the game that the intermediate strings must also be recognized words (unlike, for instance, sitten, above).

Be aware that the examples in this appendix are all words spelled in the conventional Latin alphabet, but the paper is about edit distances for strings of IPA symbols.
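The calculation described in this appendix can be sketched in C as follows. This is a minimal illustration of the standard dynamic-programming (Levenshtein) formulation, not the author's program; the function names `edit_distance` and `min3` are invented here. It assumes that one byte is one symbol, which is true of the Latin-alphabet examples in this appendix but not of UTF-8 encoded IPA strings, which would first have to be split into symbols.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Smallest of three values. */
static size_t min3(size_t x, size_t y, size_t z)
{
    size_t m = x < y ? x : y;
    return m < z ? m : z;
}

/*
 * Edit distance between two byte strings, counting each deletion,
 * insertion or substitution as one operation.
 * Returns (size_t)-1 if memory cannot be allocated.
 * Assumption: one byte is one symbol.
 */
size_t edit_distance(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a), lb = strlen(b);

    /* Two rows of the distance table: the previous and the current one. */
    size_t *prev = malloc((lb + 1) * sizeof *prev);
    size_t *curr = malloc((lb + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return (size_t)-1;
    }

    /* Building the first j symbols of b from nothing takes j insertions. */
    for (size_t j = 0; j <= lb; j++)
        prev[j] = j;

    for (size_t i = 1; i <= la; i++) {
        curr[0] = i;  /* reducing the first i symbols of a to nothing takes i deletions */
        for (size_t j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = min3(prev[j] + 1,          /* deletion */
                           curr[j - 1] + 1,      /* insertion */
                           prev[j - 1] + cost);  /* substitution, or no operation */
        }
        /* The row just filled becomes the previous row for the next symbol of a. */
        size_t *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    size_t d = prev[lb];
    free(prev);
    free(curr);
    return d;
}
```

With this sketch, `edit_distance("coffee", "toffee")` evaluates to 1, `edit_distance("kitten", "sitting")` to 3 and `edit_distance("boot", "shoe")` to 3, in agreement with the examples above.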


References

ICAO (1959) The Evolution and Rationale of the ICAO (International Civil Aviation Organization) Word-Spelling Alphabet, July (PDF). Archived (PDF) from the original on 10 March 2016. Retrieved 1 November 2017.

ICAO (2001) Annex 10 to the Convention on International Civil Aviation: Aeronautical Telecommunications; Volume II Communication Procedures including those with PANS status (6th ed.). International Civil Aviation Organization. October 2001. Archived (PDF) from the original on 31 March 2019. Retrieved 23 January 2019.

Kay, N. M. (2013) Rerun the tape of history and QWERTY always wins. Research Policy, 42(6-7), pp. 1175-1185.

Nato (1955) SGM-675-55: Phonetic Alphabet for NATO Use. Archived (PDF) from the original on 12 April 2018.

Neill, S. B. (1980) Dvorak vs. Qwerty: Will Tradition Win Again? The Phi Delta Kappan, 61(10), June 1980, pp. 671-673.

Schmidt-Nielsen, A. (1987) Intelligibility of ICAO Spelling Alphabet Words and Digits Using Severely Degraded Speech Communication Systems. Part I Narrowband Digital Speech, Naval Research Laboratory, NRL Report 9035.


*  This is version 2.0 of this paper. I released version 1 (unnumbered) on 9 April 2021, but then I had some additional thoughts. I added the appendices, added to the section on Comparison with ICAO Experiments and updated the Discussion and Conclusions.
13 April 2021

1  'Spelling alphabets' are commonly referred to as 'phonetic alphabets'. Indeed, the principal one that we discuss in this paper is officially called the Nato Phonetic Alphabet. However, that name can be considered inaccurate in that phonetics are not really involved, and particularly in the context of a paper like this in which true phonetics are important, it would be confusing to use the term.

2  Nato is an acronym, a word made of the initial letters of a number of component words, in this case North Atlantic Treaty Organization. Many writers feel obliged to spell out such acronyms using capital letters. I do not. As long as the word formed is pronounceable and is used in that way, then it should be treated as any other name, regardless of its etymology. After all, who would write RADAR (RAdio Detection And Ranging)? This is not the same as an initialism, a word constructed from initial letters, but one which is not pronounceable, is spoken letter-by-letter and should be written in capitals (e.g. USA).

3  ICAO (1959) is a military technical report from the Armed Services Technical Information Agency, released under a Freedom of Information request in 2011. The version released is a poor quality scan which is illegible in parts, and thus some of the information from it reported herein may be inaccurate.

4  George Bernard Shaw made this point by suggesting that the word 'fish' might be spelt 'ghoti', if you take 'gh' from 'enough', 'o' from 'women' and 'ti' from 'station'. Another example is the difference in pronunciation between the words 'bough', 'borough' and 'through'.

5  British people, particularly those from Wales, Scotland and Northern Ireland, often balk at the fact that some people from other countries seem to assume that Britain and England are identical. Perhaps the identification of a 'British accent' is one attempt to redress this error, but sadly this is misguided. There is no British accent. Certainly the way an Aberdonian speaks is very different from the way a Londoner speaks. Yet it would not be correct to label an accent as English either. A Geordie accent is very different from, say, a Birmingham one. There has been the concept of Received Pronunciation. It is not very clear from whom this accent was received, but it is probably best identified as the accent of the middle classes from the south-east of England, and the one that you would have expected to hear on the BBC in the 1950s and 1960s; 'regional' accents were certainly not broadcast.

6  The spelling of Whiskey follows the American and Irish convention; a Scot drinks Whisky.

7  The software to measure the edit distances was written in C and the source code for this is available on-line.

8  The number 0 is a relatively recent addition to mathematics. It is unusual in that it is known by various different names: zero, nought and oh, for instance. The word Zero is phonetically distinct, which made it a good candidate for the Nato alphabet, but the name is not a neologism, as might be assumed: the first-known occurrence of it in English dates back to 1598, according to the OED.

9  Consistent with the rest of this paper, the American pronunciation of the letter Z ('Zee' or /zi/) has been used, in preference to the British 'Zed'.