The Arbdash Spelling System

Alan Beale
15 July 2006

Introduction

Arbdash is a spelling system designed for use with both British and American English.  The name is derived from my system Arbdot, with which it shares a few features.  The main design points for Arbdash are as follows:

  1. Usually, when the American and British pronunciations for a word differ, the differences will only affect the vowel diacritics, rather than the base letters.  This allows reasonably phonemic spelling that can still be read across the pond.
  2. Arbdash is imprecise about the spelling of unstressed vowels.  In particular, the schwa sound may be spelled with any of the five vowel letters.  This makes it more likely that related words will have similar spellings.  Except for its imprecision about unstressed short vowels, Arbdash is phonemically quite precise.
  3. The diacritics of Arbdash are systematically arranged so that, even though there are a lot of them, the organizing principles are easy to remember.
  4. Though Arbdash uses diacritics not in the Latin-1 character set, all the characters it uses are in the Unicode Latin Extended-A block, and are present in the most popular modern computer fonts.
  5. Due to legibility issues, Arbdash avoids using diacritics on the letter 'i' to the extent possible.

I do not expect Arbdash to be a great hit with spelling reformers, much less with the general public.  Here are some criticisms that I expect to be made of it.  Most of them are true.

  1. Arbdash uses diacritics.  The horror!
  2. Not only does Arbdash use diacritics, but there is no readily available keyboard map supporting the characters it uses.  Because writing in Arbdash requires Unicode, it is a problem for E-mail.
  3. Arbdash does not look very much like today's English.  It is ugly.
  4. Because Arbdash is not entirely phonemic, correct spelling is not trivial.  Some amount of memorization is required.
  5. Arbdash does not always make the spellings of words shorter.  I do not regard this as a sin, but many do.

Practically speaking, the biggest problem for Arbdash is its non-Latin-1 character set.  Because of this, and because opinions differ on what makes an alternative spelling ugly, there are several alternate forms of Arbdash, which use slightly different sets of diacritics.  If you do not appreciate Arbdash as presented here, perhaps one of the variants may suit your taste better.

Consonants

Here are the consonant symbols and digraphs of Arbdash with examples.  The examples show more diacritics than absolutely necessary for clarity.  The example words have been chosen to be recognizable to readers new to Arbdash.

Symbol Sampa Examples
b /b/ bábÿ
cħ /tS/ cħùrcħ
d /d/ dêdikáted
f /f/ fîftÿ
g /g/ Grêgorÿ
h /h/ hó-hûm, hôthows
j /dZ/ jûjment
jħ /Z/ vîjħon, plêjħur
k /k/ körk
kħ /x/ lôkħ
l /l/ lábel
m /m/ mêmó
n /n/ nûn
nğ /N/ sînğinğ
ng /ng/, /Ng/ fînger, ingrédÿent
nk /nk/, /Nk/ ûnkel, unklôg
p /p/ pûpÿ
r /r/ retùrn, bärter, bróker
s /s/ sîster
sħ /S/ sħêlfîsħ
t /t/ tärt
tþ /T/ tþînk, brêtþ
tħ /D/ tħât, fätħer
v /v/ vâlv
w /w/ wîked, twîcħ
wħ /W/, /hw/, /w/ wħîsper
y /j/ yês, kânyon
z /z/ zêlus, lázÿ

Note that the letter 'd' is always used for past tenses, and 'z' for plurals, as in «rîpd» and «kâtz», even when the pronunciation is /t/ or /s/.

Vowels

The vowel symbols of Arbdash are arranged into a number of groups of similar letters, as follows.

Vowels without diacritics are unstressed short vowels, possibly pronounced as a schwa or indistinct i.

a /@/, /I/, /{/ alowd, bâgaj
e /@/, /I/, /E/ rîvet, selêkt, egzâmin
i /@/, /I/ dêvil, sânitÿ
o /@/, /Q/, /A:/ lêmon, bombärd
u /@/, /V/ kâmpus, prôdukt

Vowels with a circumflex accent are stressed short vowels:

â /{/ sâd
ê /E/ tên
î /I/ bît
ô /Q/, /A:/ dôt
û /V/ rûg

A good list of mnemonic words for the stressed short vowels is: «bâg», «bêg», «bîg», «bôg», «bûg».

Vowels with an acute accent are long vowels, stressed or unstressed:

á /eI/, /EI/ táken
é /i:/ rérun
ó /oU/ spóken
ú /u:/, /u/, /U/ (before r), /U@/ rúler, rîcħúal, júrÿ

(The long i is treated as a diphthong, represented by the symbol 'iy' due to legibility issues with the symbol 'í'.  Arbdash allows the use of 'ǐ' as a shorter form of 'iy', but this character is not present in most computer fonts, and so 'iy' tends to work out better.  (Note that the diacritic is a caron, not the breve of ğ and ŭ.)  Also, see the "special u" table below for long u as in <cue> rather than <clue>.)

A good list of mnemonic words for the long vowels is: «mán», «mén», «miyn», «món», «mún».

Vowels with a grave accent are vowels which only occur before the letter r, and are generally stressed.

à /e@/, /e/ stàrinğ
è /I@/, /i/ vôluntèr
ù /3/ fùr, mùrder

A good list of mnemonic words for these vowels is: «stàrinğ», «stèrinğ», «stùrinğ».

Vowels with an umlaut are alternate vowel forms.  ä and ö are generally stressed, while ü and ÿ are never stressed.

ä /A:/ fätħer, bär
ö /O:/ börd, wöter, ölsó, löndrÿ
ü /j@/, /jU/ sîmüláted, mùrkürÿ
ÿ /i/, /I/ hâpÿ, vàrÿinğ

Good mnemonics for 'ä' and 'ö' are «fär» and «för» or «dräma» and «tröma».  (Neither 'ü' nor 'ÿ' fits the pattern of the other two vowels.)

Two special forms of 'u' are defined for additional sounds associated with this letter:

ŭ /U/ pŭsħ
ű /ju:/, /ju/, /jU/ (before r), /jU@/ pűnÿ, ânűal, pűritÿ

Finally, the English diphthongs (other than long a and long o) are represented by an unaccented vowel followed by a semi-vowel.

iy (or ǐ) /aI/ fliy, diviyd, apliyans, akwiyr, driyer
ow /aU/ how, kowntÿ, alowans, sowr, tower
oy /OI/ boy, avoyd, toyl, royal, employer

A good list of mnemonics for these vowels is «pliy», «plow», «ploy».

(I should also note that two vowels with a tilde diacritic, 'ã' and 'õ', can be used for the nasal vowels in a few words borrowed from French, such as «kõsÿàrjħ» and «málãjħ».)
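
The way the diacritics sort the vowels into groups can be illustrated with a short Python sketch.  It is only an illustration and not part of Arbdash itself: the function name `vowel_group` and the group labels are made up for this example, and the table covers only the vowel symbols listed in this section.  It uses Unicode decomposition to separate each letter from its diacritic.

```python
import unicodedata

# Combining mark -> (base letters it is used on in the tables above, label).
# The labels are informal summaries, not official Arbdash terminology.
GROUPS = {
    "": ("aeiou", "unstressed short vowel"),
    "\u0302": ("aeiou", "stressed short vowel"),      # circumflex
    "\u0301": ("aeou", "long vowel"),                 # acute
    "\u0300": ("aeu", "vowel occurring before r"),    # grave
    "\u0308": ("aouy", "alternate vowel form"),       # umlaut
    "\u0306": ("u", "special u, as in push"),         # breve
    "\u030b": ("u", "special u, as in cute"),         # double acute
    "\u0303": ("ao", "nasal vowel"),                  # tilde
    "\u030c": ("i", "long i, short form of 'iy'"),    # caron
}


def vowel_group(letter: str) -> str:
    """Return the group label for a single lowercase Arbdash vowel symbol."""
    decomposed = unicodedata.normalize("NFD", letter)
    base, mark = decomposed[0], decomposed[1:]
    bases, label = GROUPS.get(mark, ("", ""))
    if base not in bases:
        raise ValueError(f"{letter!r} is not an Arbdash vowel symbol")
    return label
```

Consonant symbols such as 'ğ', and the plain consonant 'y', are rejected with a `ValueError`, since only the vowel tables are modelled here.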

Simplifications

To lessen the number of diacritics required, Arbdash allows three simplifications.  The simplifications are optional - the extra diacritics may always be written, and doing so may be preferable for tutorials and the like.  The three simplification rules are as follows:

  1. 'ÿ' may be replaced by 'y' at the end of a word, or before a consonant, as in «hóly» or «krázynes».
  2. The circumflex can be left off the vowel in a one-syllable word, or off the first vowel in a multi-syllable word where the second vowel is guaranteed not to be stressed (that is, where it is unaccented, ü or ÿ).  Examples: «man», «metal», «regülar», «luky».
  3. The breve can be left off 'ğ' at the end of a word, or in an inflection, as in «bang», «ringz» and «hanging».  It is still required in other derived words, such as «sinğer» and «kinğdom».
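
As a rough illustration, rules 1 and 2 can be sketched in Python.  This is a sketch under stated assumptions, not a definitive implementation: the function name `simplify` is made up for this example, the input is assumed to be a single lowercase word written with full diacritics, and syllables are counted by treating 'iy', 'ow' and 'oy' as single vowels.  Rule 3 is left out, because applying it requires knowing whether a word is an inflection, which cannot be told from the spelling alone.

```python
import re
import unicodedata

ACCENTED = "âêîôûáéóúàèùäöüÿŭűãõǐ"
# A vowel nucleus: one of the diphthong digraphs, or a single vowel letter.
NUCLEUS = re.compile("iy|ow|oy|[aeiou" + ACCENTED + "]")
CIRCUMFLEX = {"â": "a", "ê": "e", "î": "i", "ô": "o", "û": "u"}
# Second vowels that rule 2 treats as guaranteed not to be stressed.
UNSTRESSED = set("aeiou") | {"ü", "ÿ"}


def simplify(word: str) -> str:
    """Apply simplification rules 2 and 1 (in that order) to one word."""
    word = unicodedata.normalize("NFC", word)

    # Rule 2: drop the circumflex from the first vowel if the word has one
    # syllable, or if the second vowel cannot be stressed.
    nuclei = list(NUCLEUS.finditer(word))
    if nuclei and nuclei[0].group() in CIRCUMFLEX:
        if len(nuclei) == 1 or nuclei[1].group() in UNSTRESSED:
            first = nuclei[0]
            plain = CIRCUMFLEX[first.group()]
            word = word[:first.start()] + plain + word[first.end():]

    # Rule 1: replace ÿ with y at the end of the word or before a consonant,
    # i.e. whenever it is not followed by a vowel letter.
    return re.sub("ÿ(?![aeiou" + ACCENTED + "])", "y", word)
```

Rule 2 is applied before rule 1 so that a following 'ÿ' is still recognizable as an unstressed vowel when the circumflex is considered.  With this sketch, `simplify("lûkÿ")` gives «luky» and `simplify("krázÿnes")` gives «krázynes», while «vàrÿinğ» is left unchanged because its 'ÿ' stands before a vowel.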

An Example

Here is an extended passage in Arbdash, Bob Boden's "The Late Arrival" (with a few changes to make sure all the sounds are represented).  You probably will not find it hard to read.  The braces indicate words whose pronunciation is different in British and American English.

Bob stróld up tħe {patþ|pätþ}.  Hé woz lát and woz konsîdering hiz alibiy.  Hé lisend tu tħe utħer kärz in tħe strét, heding hóm. Tħe {dög|dog}, witħ pö rázd, gréted him at tħe dör. Moly kŭd ölwáz tel wħen hé ariyvd.  Sħé woz a loyal krécħur, and tħe sensitîvity ov tħat nóz woz not a mitþ.  Tħe hows woz kwiyet.  Hé lŭkd down at tħe viynil tiyl.  It kŭd stand sum polisħ but hé wŭd rezîst tħe impuls tu komplán. 

Hé sö hiz fätħer in tħe gärden nèr tħe fens but diden't botħer tu köl him.   Hé woz obvÿusly prepàring tu mó tħe lön. Hé kŭd sé tħe nábor'z kow in tħe mêdó and tħe dens wŭdz bÿônd.  It sémd a trivÿal mater but hé ölwáz lŭkd tu sé if tħàr woz enytþîng on tħe kicħen tábel.  Hiz wiyf miyt riyt a nót.  Tħàr woz a {|} novel tħàr, probably hùrz.  It had a pekűlÿar simbol on tħe kuver. Tħe tiytel sed sumtþing abowt {núronz|nűronz}.  Hé diden't nó wħot tħat woz abowt but simply diden't kàr.

Hé desiyded tu remúv tħe lábel now from tħe pakaj hé woz karÿing. Hé hópd tħat Mùrtel wŭd not {harâs|haras} him abowt tħe {köst|kost} ov tħe {|} fùr kót hé had böt hùr.  Hé had a fàrly gŭd repliy.  At lést it woz pád for.  And tħe fùrst ov tħe yèr woz ölmóst hèr and tħe àr woz kóld. 

Hé fownd tħe páper pärtly ópen, óver a cħàr.  It had ránd in tħe mörning.   Tħe páper {öfen|ofen} got wet.  Hé tùrnd tħe rádÿó on but diden't kàr for tħe sinğer.

Hé red abowt tħe siyklón wħicħ had ölmóst hit tħe sivik senter.  Tħe skiy had lŭkd tþretening.  Hé gáv a siy.  Tħàr wŭd hav ben tħe devil tu pá if tħe störm had ben wùrs.  Tħe nolej mád him wins.  Hé tþöt, "Wé är só helples in tħe fás ov bad wetħer."  Hé gáv sum tþöt tu sólar power, but klung tu hiz űjħúal rigor, wħicħ ment diskûsing tħis witħ hiz wiyf beför máking a desîjħon.  Sħé ekspêkted tħis konsîderásħon.  Hé had ben tþinking abowt emùrjensy power and had an iydéa.  Sum mejħur ov aksħon miyt bé prúdent.
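
The brace notation used in the passage is easy to process mechanically.  Here is a minimal Python sketch that picks one alternative from each pair of braces.  The function name `select_dialect` is made up for this example, and it assumes that the first alternative is the American form and the second the British one; that ordering is inferred from pairs such as {patþ|pätþ} and is not stated explicitly above.

```python
import re

# Matches {first|second}; either alternative may be empty.
BRACES = re.compile(r"\{([^{}|]*)\|([^{}|]*)\}")


def select_dialect(text: str, british: bool = False) -> str:
    """Replace each {first|second} pair with one of its two alternatives.

    Assumption: the first alternative is American, the second British.
    """
    group = 2 if british else 1
    return BRACES.sub(lambda match: match.group(group), text)
```

For example, `select_dialect("Bob stróld up tħe {patþ|pätþ}.")` returns the sentence with «patþ», and passing `british=True` returns it with «pätþ».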

Design Notes

Here are some notes on why Arbdash looks like it does.

The key problem is that of the vowel u, which has seven different sounds: those of <circus>, <cut>, <murder>, <push>, <regular>, <rude> and <cute>.  Unfortunately, there are only five forms of u in the Latin-1 character set.  This list falls naturally into two related subsets, the first three vowels, and the last four, but this doesn't help much - there is no unused letter in the English alphabet suitable for use as a sixth vowel.  (And the independence of the two subsets is not absolute - there are related words like <produce> and <production> which cross the boundary.)

I considered using 'w' as a sixth vowel - but then logic would demand that it represent the set of vowels related to its consonant meaning, which is to say the <rude> set.  This means you need three extra forms of w, and while they exist in Unicode, the net effect is far from familiar.  Spellings like «pẅsħ», «regwlar», «rẁd» and «kẃt» have little appeal if one hopes to attract the already literate.

Another possibility is to use 'ư' from the Vietnamese alphabet for the first series of sounds.  This is surely more readable than the vowel 'w', but this character, especially in its accented forms, is in relatively few computer fonts, except for those designed specifically for Vietnamese.  And in many fonts the hook on the u is hard to make out, as you may have already noticed.

Eventually I was led to 'ŭ' and 'ű', from Esperanto and Hungarian respectively. The breve is familiar to English speakers due to its use in dictionaries, and 'ű' seemed a natural for the long form of 'ü'. But this meant that Arbdash would require Unicode for its representation.  Once this is accepted, there is no reason not to continue.  This led to the adoption of 'nğ' and 'sħ' to solve two other annoying ambiguities of present English spelling.

Another point to note about Arbdash is its use of the circumflex for stressed short vowels.  I haven't previously seen a system with this convention: the Bobdot/Arbdot convention of using the dieresis (as in Arbdot «ofënsiv», «abïlitè») for this purpose seems most common.  The Arbdash arrangement derived from two considerations.  I wanted 'ü' for the sound in <regular>, to correspond to 'ű' for the long u of <cute>.  The grave accent was another possibility, but its use with i produces some very hard to read words, such as «benefìt».  That left the circumflex which, though not ideal, still has decent readability when used with the 'i' («benefît» is not wonderful, but it's better than using ì).

Another unusual point about Arbdash is its use of the digraph 'iy' for long i.  I was led to this again by legibility issues with i and diacritics: «fiynal» is much more readable than «fínal».  This directly violates the principle of having the British «misiyl» and the American «misil» differ only in the diacritics.  I've attempted to smooth this over by regarding 'iy' as just a variant of 'í' to improve readability, in the same way that in German 'ü' and 'ue' are regarded as the same thing.  The digraph 'iy' was chosen, rather than the perhaps more natural (or at least more common) 'ie' by analogy with the other diphthongs 'ow' and 'oy'.  Using 'iy' exploits the relationship between 'i' and 'y', and I certainly find the spelling «piyonèr» to be considerably more familiar-looking than the alternative «pieonèr».

After completing the design of Arbdash, I discovered the letter ǐ, which is a reasonably readable potential representation for long i.  (Compare "fǐnal" and "fínal" for legibility - in the case of ǐ you can at least be sure there is a diacritic there.)  Unfortunately, ǐ is not in the Unicode Latin Extended-A block, and is present in only a few computer fonts.
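
The character set claims made here can be checked with a few lines of Python.  The sketch below reports which characters of a text fall outside Latin-1, and whether each of them lies in the Unicode block officially named Latin Extended-A (U+0100 to U+017F).  The function names are made up for this example.

```python
import unicodedata


def block_of(ch: str) -> str:
    """Classify one character by the code point range it falls in."""
    code = ord(ch)
    if code <= 0xFF:
        return "Latin-1"
    if 0x0100 <= code <= 0x017F:
        return "Latin Extended-A"
    return "other"


def outside_latin1(text: str) -> dict:
    """Map each character of the text that is not in Latin-1 to its block."""
    # Normalize first so each accented letter is a single code point.
    text = unicodedata.normalize("NFC", text)
    return {ch: block_of(ch) for ch in sorted(set(text)) if ord(ch) > 0xFF}
```

Applied to «sħêlfîsħ», this reports only 'ħ', in Latin Extended-A.  Applied to «fǐnal», it reports 'ǐ' as "other", which is the problem described above.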

The first version of Arbdash used à and ò for the sounds of <bar> and <bore>, and ä and ë for the sounds of <wary> and <weary>.  I exchanged the grave and dieresis here, primarily because it is easier to distinguish words like <slow> and <slaw> when spelled «sló/slö» rather than «sló/slò». This also improved the coherence of the grave-accented vowels, as now they all occur only before the letter r.

On the Imprecise Schwa

Arbdash specifies that unstressed short vowels, whether spoken as schwa or not, are represented by unaccented letters.  A move such as this, away from precision, seems undesirable to many, and it does mean that knowing how to pronounce a word is not necessarily enough to allow you to spell it. The advantage of this is that it allows the spellings of related words to be similar.  This is often disparaged as unnecessary and pointless by reformers, but in fact it helps you spell unfamiliar words related to ones you know - «egzîst» gives you a clue to «egzistênsħal», which the more accurate «igzîst» would not do.

Consider the words <photograph> and <photography>.  In Arbdash, they are «fótogrâf» (or British «fótogräf») and «fotôgrafy» - that the words are probably related is obvious.  Compare the phonemic system Bobdot, in which the words are instead «fótugräf» and «futögrufè» - the relationship between the words is lost to phonemic accuracy.  Digraph-based systems are even worse here, as in the Lojikl Inglish «foeteugraaf» and «f'togreufy».  (Note that I do not contrast with these systems because they are bad systems - indeed, I consider them "best of breed" phonemic respelling systems.)  Other examples are easy to find: Arbdash «kompóz/kompozîsħon/kompôzit» versus Bobdot «kumpóz/kompuzïshun/kumpözut» or Lojikl Inglish «kmpoez/kompeuzishn/kmpozit» is a good three-word example.

In most cases, Arbdash words are spelled using the same representation for their unstressed short vowels as traditional spelling, which tends to produce very good results.  But there are some situations where deviating from TS produces better results.

  1. Sometimes, traditional spelling just uses the wrong vowel.  The word <maintenance> is spelled «mántanens» in Arbdash, to agree with «mántán».  (For the -ens ending, see item 3 below.) Similarly, <chlorinate> is spelled «klörenát» in Arbdash, to agree with «klörén».

  2. Sampa /@r/ and /3r/ are often paired in related words.  Because /3r/ is ùr in Arbdash, the spelling ur for /@r/ occurs more often in Arbdash than in TS.  Examples: «pùrfekt/purfêksħon», «Jùrman/Jurmânik», «ölturnát/öltùrnativ».

  3. Certain suffixes are both unstressed and woefully inconsistent in their usual spelling.  The most obvious examples are -ar/er/or (<liar>, <farmer>, <actor>), -able/ible (<readable>, <deductible>) and -ance/ence (<annoyance>, <existence>).  One is tempted to declare a single common form, but one runs into conflicts with related words, such as <regulatory> (American English) or <differential>, in which the unstressed vowel becomes stressed.  Also, there are words for which it is not completely clear whether a suffix is involved or not: <investor> and <visible> are clear, but what of <doctor>, <motor>, <plausible> and <eligible>?  Arbdash presently takes a cautious approach here.  It uses the forms -er, -abel and -ens when there is no conflict with related words, and when the suffix is attached to an English word.  Thus, Arbdash spells «viziter», «revùrsabel» and «remêmbrens», but «impôstor», «fézibel» and «elegans».  More could surely be done in this area, but making the effort to go beyond the obvious seems unwarranted on a project with so little chance of success.


The Arbdash Lexicon

Using the CAAPR lists, I have constructed a preliminary Arbdash dictionary, mostly in support of adding Arbdash to the repertoire of the Wyrdplay converter.  The non-Latin-1 character set of Arbdash presents special problems in formatting of the dictionary, and for this reason I am not making it available for download at this time.  If for some reason you are interested in a copy, let me know, and I'm sure something can be worked out.


Arbdash Variants

As noted above, Arbdash has a number of variant forms, which change its appearance in ways that some may prefer.  Each variant is identified by a letter, and variants may be combined if they are compatible.  For instance, variant hqu allows one to write Arbdash with only the Latin-1 character set.

Variant c: Drops the 'ħ' from 'cħ', since after all the 'c' has no other use in Arbdash.  Good for those who think that a reform is better the more letters it eliminates.

Variant d: Gets rid of the 'ħ' character by adding consonant diacritics.  Replaces 'cħ', 'jħ', 'kħ', 'sħ', 'tħ' and 'wħ' with 'č', 'ž', 'x', 'š', 'ð' and 'hw'.  Also reduces 'tþ' to 'þ'.  For diacritic lovers only!

Variant h: Simply replaces 'ħ' with 'h'.  The only reason to have the separate character is to avoid the ambiguity of spellings like «mîshâp» and «körthows».  Most reformers feel that worrying about this distinction is nitpicking anyway.

Variant i: Uses 'í' instead of 'iy'.  This is more logical and systematic, but makes words like «fínal» harder to make out.

Variant q: Replaces 'ğ' with 'q'.  With this variant, 'nq' is never simplified to 'ng'.  Variant q changes the spellings of «singing» and «Sħânğhiy» to «sinqinq» and «Sħânqhiy».  This variant takes advantage of the visual similarity of 'g' and 'q' - an advantage lost in the more "economical" alternative of writing «siqiq» in place of «sinqinq».

Variant u: Gets rid of 'ŭ' and 'ű'.  The symbol 'ü' is used for the 'ŭ' sound, and 'ü' and 'ű' are replaced by 'eu' and 'eú'.  Thus, we have the variant u spellings «püsh», «regeular» and «keút». The disadvantage here is that the connection between 'ú' and 'eú' is somewhat broken - but it is really no worse than the situation with 'î' and 'iy'.

Variant x: Replaces 'tþ' with 'tx', as in «txéater» and «âtxlét».  Recommended only for diacritic haters - the combination variant hqx eliminates all special symbols for consonants, at significant cost in familiarity.

Variant y: Uses 'ý' instead of 'iy', giving you spellings like «fýnal» and «mistifý».  This breaks the connection between short i and long i even more than 'iy' does.
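
Since each variant is just a set of letter substitutions, the variants can be sketched as replacement tables in Python.  This is an illustration only: the names `VARIANTS` and `apply_variants` are made up for this example, the input is assumed to be unsimplified Arbdash (so that 'nğ' is still written with its breve), and no check is made that the requested variants are compatible with each other.

```python
import re
import unicodedata

# One replacement table per variant letter, taken from the list above.
VARIANTS = {
    "c": {"cħ": "c"},
    "d": {"cħ": "č", "jħ": "ž", "kħ": "x", "sħ": "š",
          "tħ": "ð", "wħ": "hw", "tþ": "þ"},
    "h": {"ħ": "h"},
    "i": {"iy": "í"},
    "q": {"ğ": "q"},
    "u": {"ŭ": "ü", "ü": "eu", "ű": "eú"},
    "x": {"tþ": "tx"},
    "y": {"iy": "ý"},
}


def apply_variants(text: str, letters: str) -> str:
    """Apply the named variants one after another, e.g. letters='hqu'."""
    text = unicodedata.normalize("NFC", text)
    for letter in letters:
        table = VARIANTS[letter]
        # Each variant is applied in a single pass, so that in variant u
        # the 'ü' produced from 'ŭ' is not turned into 'eu' afterwards.
        pattern = re.compile("|".join(re.escape(old) for old in table))
        text = pattern.sub(lambda match: table[match.group()], text)
    return text
```

For instance, `apply_variants("pŭsħ", "hqu")` gives «püsh», and the result of variant hqu can be encoded with `.encode("latin-1")` without error, which is the point of that combination.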



To comment on this page, e-mail Alan at wyrdplay.org
