Towards a Unicode Transliteration Table for VMScript

If we look at Cappelli, Bischoff, etc. long enough, it does not seem an unusual notion at all that VMS symbols are derived from Latin letter shapes (I am trying to avoid the term “glyph” as it often gets misinterpreted [1]). If we read: “The script uses many ligatures and has many unique scribal abbreviations, along with many borrowings from Tironian notes” [2], it is about the shapes of Insular script, not VMScript. It seems virtually impossible to invent letter shapes out of the blue, without resemblance to anything known [3].

If we did the statistics and identified most of them, we could also take a look at the Unicode charts, especially Latin Extended-A through Latin Extended-D, the Latin supplements, the MUFI recommendations etc., and try to locate the glyphs and their corresponding code points. Almost everything is there. Medievalists need a lot of glyphs to encode manuscripts, like “LATIN SMALL LETTER A INSULAR FORM, LATIN SMALL LETTER OPEN A CAROLINGIAN FORM, LATIN SMALL LETTER N WITH FLOURISH (an old friend if constructed with minims), LATIN SMALL LETTER T ROTUNDA ꞇ” for the basics, or, more advanced, “LATIN ABBREVIATION SIGN SMALL CON DESCENDING ꝯ, LATIN ABBREVIATION SIGN SMALL IS ꝭ, BREVE BELOW”, all sorts of combining diacritics, contextual spacing modifiers, and, last but not least, six different spaces.
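As a small illustration of how such a lookup could be checked by machine, here is a minimal Python sketch using the standard `unicodedata` module. It prints the code point and formal Unicode name of the three characters shown above. The helper name `describe` is made up for this example. Note that the formal Unicode names can differ from the descriptive ones quoted above; ꞇ, for instance, is formally LATIN SMALL LETTER INSULAR T.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character.

    Private-use and unassigned characters have no formal name,
    so a placeholder is returned for them instead.
    """
    name = unicodedata.name(ch, "<no formal name>")
    return f"U+{ord(ch):04X} {name}"

# The three characters quoted in the text, written as escapes
# so the example does not depend on the editor's font.
for ch in "\ua787\ua76f\ua76d":
    print(describe(ch))
```

A character taken from a Private Use Area, such as `"\ue000"`, would come back with the placeholder, which is one quick way to see which glyphs have no standard code point.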

What I’m hinting at:
A graphemic transliteration table could be constructed. We still do not care about meaning; we are simply looking for allographs, “similar-looking glyphs”. There is no need to settle on a single verdict for a glyph; on the contrary, we note down variants. This will be very helpful later on, as will describing the glyphs verbosely.
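To make the idea a bit more concrete, one row of such a table might be sketched as follows. This is only an assumption about how the data could be organised, not a finished format: the class `TableEntry`, the label `S-001`, its description and its candidate list are all invented for illustration and do not represent an actual identification.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TableEntry:
    sign_id: str       # hypothetical working label for a VMS sign
    description: str   # verbose description of the shape
    # All look-alike characters are kept; no single verdict is forced.
    candidates: List[str] = field(default_factory=list)

    def code_points(self) -> List[str]:
        """List each candidate as its code point(s) in U+XXXX notation."""
        return [" ".join(f"U+{ord(c):04X}" for c in cand)
                for cand in self.candidates]

# Illustrative entry only: the candidates are placeholders, not a claim.
entry = TableEntry(
    sign_id="S-001",
    description="rounded bowl with a descending tail, resembling a 9",
    candidates=["\ua76f", "9"],
)
print(entry.sign_id, entry.code_points())
```

Keeping the candidates as a list rather than a single value is the point: the variants stay on record and can be weighed later.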

While the MUFI recommendation contains a lot of Latin ligatures as code points, Unicode discourages the addition of new ligatures.
Contemporary font technologies, mainly OpenType and SIL Graphite, allow ligatures to be composed as part of smart font features. So we would try to express as many of the more complex VMS signs as possible as contextual ligatures, eventually making use of complex text layout. There are a lot of possibilities, and in some cases it may come down to judgement. But of course this means we also need a font that supports this.

We have constructed a sieve, and what remains in it are the unique, unknown glyphs. If we wished to encode VMScript, these would go to a PUA, a Private Use Area of Unicode, preferably taking unpopulated code points.
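The last step of the sieve, handing out private-use code points to whatever remains unmatched, could be sketched roughly like this. The function `assign_pua` and the sign labels are hypothetical, and the set of code points that are already taken (for example by another PUA convention) is an assumption the caller would have to supply.

```python
import unicodedata

# Private Use Area of the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF

def assign_pua(unmatched_ids, taken=frozenset(), start=PUA_START):
    """Map each unmatched sign label to the next free PUA code point.

    `taken` holds code points that are already populated and must be skipped.
    """
    mapping = {}
    cp = start
    for sign_id in unmatched_ids:
        while cp in taken:
            cp += 1
        if cp > PUA_END:
            raise ValueError("ran out of private-use code points")
        mapping[sign_id] = cp
        cp += 1
    return mapping

# Hypothetical labels for two signs that matched nothing in the charts.
mapping = assign_pua(["X-01", "X-02"], taken={0xE000})
for sign_id, cp in mapping.items():
    # General category "Co" marks a private-use character.
    print(sign_id, f"U+{cp:04X}", unicodedata.category(chr(cp)))
```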

Why is this of significance?
A Unicode transliteration table would in turn allow us to create a scholarly acceptable, palæographic (also: allographic) transcription of the VMS. There is still no diplomacy, no expansion of abbreviations, no judgment on meaning; we simply record what we see.
The difference is that the recorded glyphs do not map to ASCII “a”, “Z”, “#”, “/”, etc., as they do in EVA, but to their respective code points. This should help to avoid the misunderstandings EVA encourages, and allow scholars and information scientists alike to work with a reliable transcription. Wonderful things could be done, like judging allographic variation by its distribution, e.g. “-is” vs. “-ris” (good that we noted variants before).
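Comparing two variants such as “-is” and “-ris” could then start from something as simple as the sketch below. It assumes the transcription is available as a flat list of tokens. The function `variant_shares` is a hypothetical helper, and the sample data is made up purely to show the mechanics.

```python
from collections import Counter

def variant_shares(tokens, variants):
    """Return the relative share of each variant among the given variants."""
    counts = Counter(t for t in tokens if t in variants)
    total = sum(counts.values())
    return {v: (counts[v] / total if total else 0.0) for v in variants}

# Made-up token stream standing in for one page of transcription.
page = ["is", "ris", "is", "other", "is"]
print(variant_shares(page, ("is", "ris")))
```

Run per page or per section, such shares would show whether one variant clusters in certain parts of the manuscript.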

Of course a lot depends on the transcription itself, and there are numerous options for how to tackle this. Accepted scholarly standards exist and should largely be followed or extended. Open-source software for collaborative work exists and needs to be evaluated. This will be elaborated on in a follow-up post.
