TEI

Text Encoding Initiative

Variant letter forms



Variant letter forms (high and round s, for example) are often distinguished in transcriptions of manuscripts and early printed materials. Texts which are semi-diplomatic or semi-normalised will generally distinguish only between variant letter forms which are felt to have a basis in phonological distinctions, which the two forms of /s/, for example, do not. For statistical purposes it may nevertheless be desirable to transcribe other palaeographical or typographical distinctions.

Variant letter forms, and indeed any characters beyond the basic (English) alphabet, can be represented using references. These may be given either as a numeric character reference to a code point in the Universal Character Set developed by the Unicode Consortium, or as an entity reference with a standardised name, which is defined with reference to the Unicode standard in the internal subset of the DOCTYPE declaration, as in the following example:

<!DOCTYPE TEI SYSTEM "tei.dtd" [
<!ENTITY aelig "&#x00E6;">
<!ENTITY AElig "&#x00C6;">
<!ENTITY aeligacute "&#x01FD;">
<!ENTITY avlig "&#xEF97;">
<!ENTITY avligacute "&#xEFE7;">
]>
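
As a minimal sketch of how the entities declared above might then be used in a transcription, the fragment below writes the same word twice, once with the named entity and once with the corresponding numeric character reference. The `<p>` wrapper and the sample word are assumptions chosen for illustration only; they are not taken from the guidelines.

```xml
<!-- Hypothetical fragment: the element and the sample word are illustrative. -->
<!-- Named entity reference, resolved through the DOCTYPE declaration above. -->
<p>s&aelig;ll</p>
<!-- The same character given directly as a numeric character reference. -->
<p>s&#x00E6;ll</p>
```

Once the declaration for `aelig` has been read, a parser should treat the two paragraphs as identical text. The named form is mainly a convenience for whoever reads or edits the source.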

If one wished to distinguish between different allographs of a single letter, or other palaeographical features, for purposes of statistical analysis, one could define entity references for this purpose: &b1;, &b2; and so on.
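
Such allograph entities could be declared in the same way as the character entities. The sketch below is only one possible arrangement: the replacement text chosen here (a plain "b" for both forms) is an assumption, and a project might equally map each allograph to a distinct code point or to markup.

```xml
<!DOCTYPE TEI SYSTEM "tei.dtd" [
<!-- Hypothetical allograph entities; the replacement text is an assumption. -->
<!-- Both expand to a plain "b", so the distinction is carried only by the -->
<!-- entity name as written in the source file. -->
<!ENTITY b1 "b">
<!ENTITY b2 "b">
]>
```

With declarations like these, any statistical count of `&b1;` against `&b2;` would have to be made on the unexpanded source, since a parser replaces both references with the same letter.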