TCW01: Proposed charter for Feature Structure Workgroup
Contents
- Background
- Proposal
- Constraints resulting from being a TEI working group
- Constraints resulting from becoming an ISO TC37/SC4 work item
- Budget issues
- Timetable
- Proposed participants
Background
The TEI Guidelines contain two complementary chapters on Feature Structure Representation (chapter 16) and Feature Structure Declaration Representation (chapter 26). Those two chapters, originally drafted by Terry Langendoen and Gary Simons, are now generally recognized as covering many needs in the field of linguistics.
Moreover, some Natural Language Processing applications have provided implementations based on the proposed representations, which has further increased interest in this specific aspect of the TEI within the NLP community. It should also be noted that these chapters are closely integrated with the rest of the TEI scheme, thus opening up the application of NLP techniques to a very wide community of users, while at the same time offering the NLP community access to a real-world range of different text types and applications.
Proposal
It is suggested that a revision of the two chapters be conducted under the auspices of TC37/SC4, and that the corresponding editorial group 2 be seen as a working group from the point of view of the TEI.
The most practical way to do so is for the TEI to become an official liaison to TC37/SC4. As such, it would have access to all documents and debates of the group.
The intention is to work in a more coordinated way than was the case with the terminology working group, where there was no continuous reporting to the TEI.
Constraints resulting from being a TEI working group
- Clarification of alternative strategies. For example, there exist at present two ways of alternating between features and values (the ad hoc vAlt and fAlt vs. the general alt mechanism): the working group should provide guidance on the circumstances in which either method might be preferable;
- Compatibility with the final proposals on the use of pointers and links from the stand-off markup TEI working group. It should be noted that, more generally, a close collaboration between TC37/SC4 and the TEI must be established on this topic.
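To make the notion of value alternation in the first item concrete, here is a minimal sketch in Python. It is not the TEI encoding and does not reproduce the semantics of vAlt, fAlt or alt; the names `ValueAlt` and `matches` are hypothetical and introduced only for illustration. It simply shows a feature whose value may be any one of several alternatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueAlt:
    """Hypothetical helper: a feature value that may be any one of several options."""
    options: tuple


def matches(fs, feature, value):
    """Return True if `feature` in `fs` is, or may be, `value`."""
    actual = fs.get(feature)
    if isinstance(actual, ValueAlt):
        # An alternation is satisfied by any one of its options.
        return value in actual.options
    return actual == value


# A noun whose case is underspecified: nominative or accusative.
noun = {"case": ValueAlt(("nominative", "accusative")), "number": "singular"}

case_ok = matches(noun, "case", "accusative")
case_bad = matches(noun, "case", "dative")
```

A dedicated alternation type, as assumed here, keeps the disjunction attached to a single feature; a general-purpose mechanism would instead link arbitrary parts of a structure as alternatives, which is the kind of trade-off on which guidance is being asked for.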
Constraints resulting from becoming an ISO TC37/SC4 work item
- It would be necessary to produce a distinct document, independent of the general TEI document architecture. Feature structures would be the actual entry points and some global attributes may have to be dropped.
- It would oblige the document to be conformant to the general structure of an ISO standard (with scope, terms and definitions, etc.);
- It would give additional international visibility to the document, since it would be widely distributed to TC37/SC4 members, thus providing additional sources of comments and suggestions for modification;
- Preliminary formal description of feature structures, in addition to what is already provided;
- Provision of a simplified representation (FS lite) describing the basic subset of FS representation without libraries;
- Provision of a re-entrance mechanism;
- Description of typed feature structures;
- Simplification of feature value content by replacing some elements (<symb>, <num> etc.) with references to types (cf. XML schemas);
- Provision of more NLP-related examples (e.g. a syntactic analysis using HPSG, a semantic analysis, etc.).
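As an illustration of what a re-entrance mechanism has to express, the following Python sketch models a feature structure as nested dictionaries in which two feature paths share one and the same value. This is an assumed in-memory model for illustration only, not the TEI or ISO representation; the feature names and the helper `is_reentrant` are hypothetical.

```python
# One agreement value, shared by subject and verb (structure sharing).
agreement = {"number": "singular", "person": "third"}

sentence = {
    "subject": {"cat": "NP", "agr": agreement},
    "verb": {"cat": "V", "agr": agreement},  # same object, not a copy
}


def follow(fs, path):
    """Walk a list of feature names down from the root and return the value reached."""
    node = fs
    for feature in path:
        node = node[feature]
    return node


def is_reentrant(fs, path_a, path_b):
    """Return True if both paths lead to the very same value, not merely an equal one."""
    return follow(fs, path_a) is follow(fs, path_b)


shared = is_reentrant(sentence, ["subject", "agr"], ["verb", "agr"])

# Because the value is shared, an update made through one path is seen through the other.
sentence["subject"]["agr"]["number"] = "plural"
verb_number = follow(sentence, ["verb", "agr", "number"])
```

In a serialized document, object identity is not available, so the same effect would have to be obtained by some form of labelling or pointing. That is why this item interacts with the work on pointers and links mentioned earlier.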
Budget issues
No specific budget will be requested from the TEI to support those activities, since they will be incorporated into normal ISO TC37/SC4 activities. Experts participate in ISO work groups on a national basis.
As a liaison, the TEI will send specified representatives to all editorial group meetings (presumably, one of the two editors).
Should a proposal be made to an EU call to fund those activities, the TEI council would be informed.
Timetable
- February 2003: a New Work Item Proposal (NP) together with a first draft is sent to TC37/SC4 members for approval and comments. This is a good opportunity to check that the liaison mechanism works.
- May 2003: result of voting and compilation of comments
- July 2003: SC4 meeting in conjunction with the ACL conference in Sapporo. Study of comments and decision to proceed to a Committee Draft (CD) or, if things are going well, directly to a Draft International Standard (DIS)
- November 2003: DIS document, incorporating Sapporo comments, is sent for vote to TC37 members
- Dec. 2004: Publication.
A key factor in this timetable will be the speed of reaction to comments and provision of updated drafts after plenary SC4 meetings.
Proposed participants
Editor: Kiyong Lee klee@korea.ac.kr
- Syd Bauman Syd_Bauman@brown.edu
- Lou Burnard lou.burnard@oucs.ox.ac.uk
- Eric de la Clergerie Eric.De-La-Clergerie@inria.fr
- Tomaz Erjavec Tomaz.Erjavec@ijs.si
- Ulrich Heid Ulrich.Heid@IMS.Uni-Stuttgart.DE
- Terry Langendoen langendt@U.Arizona.EDU
- Laurent Romary Laurent.Romary@loria.fr
- Gary Simons Gary_Simons@sil.org