.sr docfile = &sysfnam. ;.sr docversion = quiet;.im teigmlp1
.* Document proper begins.
Introductory Notes on the TEI Guidelines
<title>3: Core Structural Tags
</titlep>
</frontm>
<!>
<body>
With a little discussion of the overall goals and a basic introduction
to SGML out of the way, it's possible now to begin discussing the content
of the current draft of the TEI guidelines. This contribution will cover
what the draft calls the <q>core structural features</q> of text.
By <term>core features</term> we mean features which appear in many kinds
of text and are used in many kinds of analysis. Typically these are
rendered by some typographic or manuscript signals.
By <term>structural features</term> we mean features relating to the
overall formal structure of a document. In prose these will be things
like sections, chapters, and paragraphs; in verse they will be cantos and
stanzas; in drama they will often be acts and scenes.
<p>
A TEI document is divided first into two pieces: the
<term>TEI Header</term>, which says what text this is and who encoded it,
and the <term>text</term> itself, which is what we take over from our
source or copy text. The TEI header contains a bibliographic description
of the electronic text and its source; it does not concern us further
here. The text itself is divided, for convenience, into three parts:
front matter (optional), body (required), and back matter (optional).
<p>
The overall organization of a TEI text is thus:
<xmp>
TEI.1
   TEI.header
   text
      front
      body
      back
</xmp>
<h1>Front Matter
<p>
Front matter can be broken down into the following pieces:
<ul>
<li>title.page
<li>foreword
<li>acknowledgements
<li>dedication
<li>abstract
<li>contents
<li>frontispiece
<li>front.part
</ul>
(A <term>front.part</term> is any piece of the front matter not already
given another name; a <q>List of Abbreviations Used,</q> for example,
would be tagged with a <tag>front.part</tag> tag.)
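<p>
As a rough sketch only, the front matter of a small book might be tagged
as follows. The element names are those in the list above; the sample
content is invented, end-tags are omitted, and the assumption that a
<tag>dedication</tag> holds paragraphs and that a <tag>front.part</tag>
takes a <tag>head</tag> is mine, not a statement of the draft's content
models:
<xmp>
<![ CDATA [
<text>
<front>
  <dedication>
    <p>To my teachers.
  <front.part>
    <head>List of Abbreviations Used
    <p>ms. = manuscript
<body>
  ...
]]>
</xmp>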
<p>
The title page can contain a number of other elements:
<ul>
<li>doc.title (the title of the document)
<li>doc.author
<li>doc.imprint (publisher, printer, etc. as given on title page)
<li>title.part (any other structural part of a title page)
</ul>
<h1>Body of the Text
<p>
For the body of a text, the rules are simple: text is divided into units,
which are divided into smaller units, which are divided into still
smaller units, until eventually you reach units which contain not other
units, but prose. Prose is grouped into paragraphs, so paragraphs are the
lowest-level structural units. These units are tagged with a series of
tags, <tag>div0</tag>, <tag>div1</tag>, ... <tag>div5</tag>, with the body
divided directly into <tag>div0</tag> or <tag>div1</tag> elements.
<tag>div0</tag> elements, if present, are divided into <tag>div1</tag>
elements, <tag>div1</tag> elements into <tag>div2</tag> elements, etc.
<p>
Any of these elements may have a heading, tagged <tag>head</tag>,
followed by a series of paragraphs (optional), followed optionally by a
series of the next-level tags, followed optionally by a
<tag>trailer</tag> element (provided for ms. explicits or tags like
<q>End of the fourth section</q>). So the overall structure of the body
of a text will be something like this:
<xmp>
<![ CDATA [
<body>
<div1>
  <head>Title of Chapter 1
  <p>Introductory prose for chapter 1.
  <p>More introductory prose for chapter 1.
  <div2>
    <head>Title of Chap. 1 Sec. 1
    <p>Prose for 1.1 ...
    <p>More prose for 1.1 ...
  <div2>
    <head>Title of Chap. 1 Sec. 2
    <p>Prose for 1.2 ...
    <p>More prose for 1.2 ...
<div1>
  <head>Title of Chapter 2
  <p>Prose for chap. 2 ...
  <div2>
    <head>Title of Chap. 2 Sec. 1
    <p>Prose ...
etc.
]]>
</xmp>
<p>
If we wish to specify that a <tag>div1</tag> is a <term>chapter</term> in
this book, rather than a <tag>section</tag>, we can use the
<att>name</att> attribute to say so:
<xmp>
<![ CDATA [
<div1 name=chapter>
]]>
</xmp>
One of the examples in appendix A of the Guidelines uses <tag>div1</tag>
for books of the Bible and <tag>div2</tag> for their subdivisions. In the
case of most books, <tag>div2</tag> is a chapter; in the Psalms, of
course, it is an individual psalm. (The example uses <att>type</att> as
the attribute name, while the text says it should be <att>name</att>.
Sorry!)
<p>
The chapter, section, part or whatever can also be assigned a number
using the <att>n</att> attribute. This attribute is defined for all TEI
tags, to allow the specification of names or numbers for any element. To
identify Psalm 23, for instance, the example on pp. 221-222 uses the tag
<tag>div2 type=psalm n=23</tag>.
<h1>Back Matter
<p>
Back matter is a somewhat miscellaneous collection -- it may contain most
of the same things as the front matter (some publishers, notably on the
Continent, place tables of contents at the end of the book, for example)
as well as the following items, which are unlikely to appear anywhere
else:
<ul>
<li>appendix
<li>glossary
<li>notes
<li>bibliography
<li>index
<li>colophon
<li>back.part
</ul>
The <tag>back.part</tag> element is a generic name for any structural
unit of the back matter not otherwise taken care of. Like
<tag>appendix</tag> and like <tag>front.part</tag>, it can be subdivided
in the same way as a <tag>div1</tag> element.
<!>
<h1>Comparisons
<p>
Users of existing markup languages will recognize many of these
structural components: <tag>div0</tag> through <tag>div5</tag> are very
similar to the <tag>h0</tag>, <tag>h1</tag> etc.
of the GML <q>Starter Set</q> (versions of which are supported by the IBM
Document Composition Facility and by Waterloo Script, and which also
forms the basis for the tag sets in a British Library report and in
Appendix E of ISO 8879). Similar block structuring is provided by Scribe
and LaTeX, although LaTeX uses specific names (\part, \chapter, \section,
\subsection, \subsubsection, etc.) rather than generic names, and does
not make the structure of the front and back matter quite as explicit as
the TEI does.
<p>
The TEI tags have the advantage, it seems to me, that they are more
general than the LaTeX tags -- it's easy enough to specify
<tag>div1 name=act n=1</tag> but it would seem downright odd to me to use
<q>\chapter</q> to mark the beginning of Act I.
<p>
There are some tag sets (e.g. Formex, developed by the EEC) which provide
an even more general tagging system for structural units of text. Formex,
for example, provides just one tag: <tag>BLK</tag>, which corresponds to
any of our <tag>DIVn</tag> tags.
</body>
<!>
</gdoc