Metadata Matters
Using TEI Headers to Document a Text
Metadata requirements: the scope
- Identification of the object
- Documentation of its structure and organization
- Statement of rights (reproduction, ownership, etc.)
- Statement of intended usage
- Documentation of the interpretive scheme(s) applied
- Brief characterization for search engines
The TEI header
- Based on AACR2 practice, the header contains:
  - mandatory file description
  - optional encoding, profile and revision descriptions
- The TEI envisages only two “levels” of header:
  - The corpus header
  - The text header
TEI Header structure
For example
The file description
- Mandatory
- Supplies a full description of the electronic file itself, and of its source(s)
- Must specify at least a title, a publication statement, and a source
- Use of authority control is advisable but not required
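The minimum stated above can be sketched as a small check. This is an illustration only, written in Python and assuming P4-style element names without a namespace (<teiHeader>, <fileDesc>, <titleStmt>, <publicationStmt>, <sourceDesc>). The sample header text and the helper name missing_components are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical header: the title and statements are invented for illustration.
HEADER = """
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>An example text: an electronic edition</title>
    </titleStmt>
    <publicationStmt>
      <p>Unpublished sample prepared for illustration.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Original</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
"""

# The three components named as the minimum for a file description.
REQUIRED = ("titleStmt/title", "publicationStmt", "sourceDesc")


def missing_components(header_xml):
    """Return the required fileDesc paths that are absent from the header."""
    root = ET.fromstring(header_xml.strip())
    file_desc = root.find("fileDesc")
    if file_desc is None:
        return ["fileDesc"]
    return [path for path in REQUIRED if file_desc.find(path) is None]


print(missing_components(HEADER))
```

An empty list means the minimum is present. The check says nothing about whether the content of each statement is adequate for cataloguing.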
The File Description
The title statement
- mandatory <title> [MARC 245]
  - identifies the electronic file, not its source
- optionally followed by statements of responsibility, as appropriate, using <author>, <editor>, <respStmt> [MARC 720], <sponsor>, <funder>, <principal> [MARC 536]
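As a hedged illustration of the ordering described above (title first, then statements of responsibility), the following Python sketch assembles a title statement. The function name build_title_stmt, the title and the personal names are all hypothetical. It assumes that <respStmt> pairs a <resp> with a <name>, and it does not cover <sponsor>, <funder> or <principal>.

```python
import xml.etree.ElementTree as ET


def build_title_stmt(title, author=None, responsibilities=()):
    """Build a <titleStmt>: the title first, then responsibility statements."""
    stmt = ET.Element("titleStmt")
    # The title names the electronic file, not the source it was made from.
    ET.SubElement(stmt, "title").text = title
    if author:
        ET.SubElement(stmt, "author").text = author
    for resp, name in responsibilities:
        resp_stmt = ET.SubElement(stmt, "respStmt")
        ET.SubElement(resp_stmt, "resp").text = resp
        ET.SubElement(resp_stmt, "name").text = name
    return stmt


# Invented values, for illustration only.
stmt = build_title_stmt(
    "A sample poem: an electronic transcription",
    author="A. N. Author",
    responsibilities=[("Transcribed by", "A. Transcriber")],
)
print(ET.tostring(stmt, encoding="unicode"))
```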
A real title statement
The publication statement
Edition and extent statements
- As elsewhere, distinguishing a new “edition” from just another version may be difficult…
- Extent should be expressed in some platform-independent way, e.g. words or KB.
Sample publication statement
The source description
- may contain common TEI bibliographic elements: <bibl>, <biblStruct>
- or a nested file description: <biblFull>
- or a list: <listBibl>
- or a prose description
- specialised elements for transcribed speech
- or (for the “born-digital” document) simply the text “Original”
Sample source description
Typical BNC source statements
The Encoding Description
- documents what was done in creating the electronic form of a text
Editorial Declarations
- document policies for some key decisions:
  - <correction>, <normalization>, <quotation>, <hyphenation>, <segmentation>, <interpretation>
- declarable (can vary)
- attributes are suggested for some key aspects; can just contain a prose description
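To make the policy elements listed above concrete, here is a minimal Python sketch that reports which editorial policies a header documents. It assumes the policy elements sit inside an <editorialDecl> within <encodingDesc> and carry their description as prose in <p>. The sample wording and the helper name documented_policies are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding description: the policy wording is invented.
ENCODING_DESC = """
<encodingDesc>
  <editorialDecl>
    <correction>
      <p>Obvious printing errors have been silently corrected.</p>
    </correction>
    <hyphenation>
      <p>End-of-line hyphens have been removed.</p>
    </hyphenation>
  </editorialDecl>
</encodingDesc>
"""

# The six policy elements named on the slide.
POLICY_ELEMENTS = (
    "correction", "normalization", "quotation",
    "hyphenation", "segmentation", "interpretation",
)


def documented_policies(encoding_desc_xml):
    """Map each policy element that is present to its prose description."""
    root = ET.fromstring(encoding_desc_xml.strip())
    decl = root.find("editorialDecl")
    policies = {}
    if decl is None:
        return policies
    for name in POLICY_ELEMENTS:
        element = decl.find(name)
        if element is not None:
            # Collapse whitespace so indentation in the XML does not leak out.
            policies[name] = " ".join("".join(element.itertext()).split())
    return policies


for name, prose in documented_policies(ENCODING_DESC).items():
    print(f"{name}: {prose}")
```

Policies that are absent are simply not reported. Whether an absent policy means "not applicable" or "not documented" is a matter for local practice.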
Example encoding description
The Profile Description
Sample profile description
Language
- global lang attribute
- references a <language> element
- defines both natural language and writing system
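The reference from a lang attribute to a <language> element can be sketched as a lookup. This Python example assumes a P4-style document in which each <language> carries an id attribute and sits in <langUsage> inside the profile description. The sample document is a hypothetical fragment that omits the mandatory file description for brevity, and language_names is an invented helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: not a complete or valid TEI document.
SAMPLE = """
<TEI.2>
  <teiHeader>
    <profileDesc>
      <langUsage>
        <language id="en">English</language>
        <language id="fr">French</language>
      </langUsage>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <p lang="en">She said <foreign lang="fr">bonjour</foreign>.</p>
    </body>
  </text>
</TEI.2>
"""


def language_names(doc_xml):
    """Map every lang value used in the document to its declared language."""
    root = ET.fromstring(doc_xml.strip())
    declared = {lang.get("id"): lang.text for lang in root.iter("language")}
    used = {}
    for element in root.iter():
        code = element.get("lang")
        if code is not None:
            # None marks a code that no <language> element declares.
            used[code] = declared.get(code)
    return used


print(language_names(SAMPLE))
```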
BNC Participant Description
Setting Descriptions
Text classification
- Texts may be classified by one or more
  - topic-descriptive keywords: <keyWords>
  - (public) classification codes: <classCode> (inside <textClass>)
  - or by (private) classification or typology: <catRef>
- Classification schemes may be
  - explicitly defined in the <classDecl> element
  - implied by reference in source attribute
- Again, authority control is not mandatory (but recommended!)
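The link between a <catRef> and an explicitly defined scheme can be illustrated with a short Python sketch. It assumes P4-style markup in which <classDecl> holds a <taxonomy> of <category> elements, each with an id and a <catDesc>, and in which the target attribute of <catRef> lists one or more of those ids separated by spaces. The taxonomy, its ids and the helper name categories_for are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical header fragment with a private two-category typology.
HEADER = """
<teiHeader>
  <encodingDesc>
    <classDecl>
      <taxonomy id="genre">
        <category id="g1"><catDesc>Prose fiction</catDesc></category>
        <category id="g2"><catDesc>Verse</catDesc></category>
      </taxonomy>
    </classDecl>
  </encodingDesc>
  <profileDesc>
    <textClass>
      <catRef target="g2"/>
    </textClass>
  </profileDesc>
</teiHeader>
"""


def categories_for(header_xml):
    """Return (id, description) pairs for every category the text points to."""
    root = ET.fromstring(header_xml.strip())
    declared = {
        category.get("id"): category.findtext("catDesc")
        for category in root.iter("category")
    }
    resolved = []
    for ref in root.iter("catRef"):
        for target in ref.get("target", "").split():
            # The description is None when the id is not declared anywhere.
            resolved.append((target, declared.get(target)))
    return resolved


print(categories_for(HEADER))
```

A description of None points to a reference that the declared scheme does not cover, which is the kind of inconsistency that authority control is meant to prevent.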
Using a classification-1
Using a classification-2
The revision description
Sample revision description
Issues in using the Header
- originally intended for non-specialist use
- application profiles for particular uses
- independent headers
- up/down translation
Crosswalks
Harmonization of practice
- The TEI Header is widely deployed, but in varying dialects
- Several attempts have been made to define standard “codes of practice” for digital libraries using it (e.g. by LC, AHDS)
- It is likely to remain the best “source of information” for cataloguing use, whether or not it is also the best means of expression.
Creating and managing headers
- As an independent XML document
- Within a database
- special case of common problem
- OTA experience
  - all data in XML headers
  - harvestable by Z39.50 client
  - OLAC compatibility is a SMOP (simple matter of programming)