Overview of XML-related
standards
|
|
|
|
|
Steven J. DeRose, Ph.D. |
|
Brown University
Scholarly Technology Group
Steven_DeRose@brown.edu
http://www.stg.brown.edu/~sjd |
|
|
XML and related specs
|
|
|
|
XML: The basic syntax |
|
Plus Namespaces, Schemas, InfoSet |
|
DOM: API to the Information Set |
|
XML Linking |
|
XPath: Expressions to find XML nodes |
|
XPointer: XPath++ for addressing |
|
XLink: hypermedia connections |
|
Stylesheet Attachment |
|
XSL: stylesheets and transforms |
|
|
|
|
XML specification
|
|
|
|
A “Recommendation” since 2/1998 |
|
The highest level for a W3C
specification |
|
Defines the syntax/grammar |
|
Not any particular processing/semantics |
|
Schemas or DTDs define applications
(poem, manual, eCommerce,...) |
|
All these can be parsed by generic XML parsers,
just as new words can be readily fitted into existing sentence structures |
|
Schemas are political as well as
technical |
XML Namespaces
|
|
|
|
Disambiguate element type names |
|
<head><html:title>On cataloging</html:title>…
<biblio><entry id='DeRo98'>
<loc:title>Navigation,
Access, Control… |
|
Declaring prefixes |
|
<sec xmlns:loc="http://foo.com/mynamesp"
xmlns:html='http://www.w3.org/1999/xhtml'
xmlns="http://…">
<loc:title>… |
|
Declaration without prefix sets default |
|
Attributes can have namespaces |
|
No renaming (x:foo to y:bar) |
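A minimal sketch of how a namespace-aware parser sees prefixed names, using Python's standard xml.etree.ElementTree. The sample document and its example.com namespace URIs are placeholders invented for illustration; only the XHTML URI comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the example.com URIs are placeholders.
DOC = """\
<sec xmlns:loc="http://example.com/mynamesp"
     xmlns:html="http://www.w3.org/1999/xhtml"
     xmlns="http://example.com/default">
  <html:title>On cataloging</html:title>
  <loc:title>Navigation, Access, Control</loc:title>
  <para>Unprefixed, so it falls in the default namespace</para>
</sec>
"""

root = ET.fromstring(DOC)

# ElementTree reports each element name as {namespace-URI}local-name,
# so the two <title> elements are no longer ambiguous.
names = [child.tag for child in root]
for name in names:
    print(name)
```

The prefixes themselves do not survive parsing; only the URI and the local name do, which is why the choice of prefix is up to each document.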
XML Schemas
|
|
|
|
Let you define a document type |
|
What elements/attributes are defined? |
|
Where can they occur? |
|
What content is allowed? |
|
What datatypes are represented? |
|
Required for validation |
|
Similar to DTDs, but |
|
More powerful (esp. for datatyping) |
|
Use XML syntax |
XML Information Set
|
|
|
|
What data in XML document “counts”? |
|
Elements, attributes, content |
|
Order and hierarchy of nodes |
|
Required for interoperability |
|
Applications must count nodes
consistently |
|
Not whitespace inside tags |
|
Not which kind of quotes around
attributes |
|
Candidate recommendation 2001-05-14 |
|
http://www.w3.org/TR/xml-infoset |
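As a rough illustration of what "counts", the sketch below parses two serializations that differ only in quote style and in whitespace inside tags, then compares the resulting trees. It uses Python's xml.etree.ElementTree as a stand-in for an Infoset-level view; the summarize helper is a hypothetical name introduced here.

```python
import xml.etree.ElementTree as ET

# Same information, different surface syntax:
# quote characters and whitespace inside tags differ.
A = "<div n='1' type=\"intro\"><p>Hello</p></div>"
B = '<div  n="1"\n   type=\'intro\' ><p >Hello</p ></div >'

def summarize(elem):
    """Reduce an element to the parts that matter for comparison:
    name, attributes, text content, and child order (hypothetical helper)."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        elem.text or "",
        elem.tail or "",
        [summarize(child) for child in elem],
    )

same = summarize(ET.fromstring(A)) == summarize(ET.fromstring(B))
print(same)
```

Two applications that agree on this view of the document can count and address nodes consistently, whichever serialization they were given.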
|
|
7 types of Infoset Nodes
|
|
|
|
Root: Above the document |
|
<?foo ?>
<doc>…</doc>
<!-- hi --> |
|
Element: Main structure |
|
<div n='1'>…</div> |
|
Text: Spans of unbroken text |
|
Attribute: Properties of
elements |
|
Namespace: Prefixes/URIs |
|
Processing Instr: <?…?> |
|
Comment: <!-- … --> |
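The node types above can be observed with Python's xml.dom.minidom, as in the sketch below. The sample document is a hypothetical variant of the slide's fragments, and the KIND table and walk function are names made up for this example. Attributes and namespaces do not appear in the walk because DOM does not expose them as child nodes.

```python
from xml.dom import minidom, Node

# Hypothetical document: a PI, the document element, and a trailing comment.
DOC = "<?foo bar?><doc><div n='1'>text</div></doc><!-- hi -->"

# Map DOM node-type constants to the names used on the slide.
KIND = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.COMMENT_NODE: "comment",
}

def walk(node, depth=0, out=None):
    """Collect (depth, kind) pairs in document order."""
    out = [] if out is None else out
    out.append((depth, KIND.get(node.nodeType, "other")))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out

listing = walk(minidom.parseString(DOC))
for depth, kind in listing:
    print("  " * depth + kind)
```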
Example
More Infoset details
DOM
|
|
|
|
"Document Object Model" |
|
An API for accessing the Infoset |
|
Many tools use this |
|
Level 1 complete |
|
http://www.w3.org/TR/REC-DOM-Level-1 |
|
Level 2 core complete |
|
http://www.w3.org/TR/DOM-Level-2-Core |
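For a feel of the API, here is a small sketch using xml.dom.minidom, the DOM implementation in Python's standard library. The sample document is invented for illustration; the calls shown (getElementsByTagName, getAttribute, firstChild, documentElement) are standard DOM members.

```python
from xml.dom import minidom

# Hypothetical sample document.
dom = minidom.parseString(
    "<doc><div n='1'>Hello</div><div n='2'>World</div></doc>"
)

root_name = dom.documentElement.tagName           # name of the document element
divs = dom.getElementsByTagName("div")            # all <div> elements, in order
numbers = [d.getAttribute("n") for d in divs]     # attribute values as strings
texts = [d.firstChild.data for d in divs]         # first child is a text node here

print(root_name, numbers, texts)
```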
XML Base
|
|
|
|
Similar to the HTML <base>
element |
|
Useful for keeping URIs simpler and
uniform. |
|
Applies to relative URLs |
|
<html>
<head>
<base
href="http://www.example.com/">
…</head> |
|
<body>… <a
href="fig/mosquito.png"> |
|
The hrefs combine to make whole URI: |
|
http://www.example.com/fig/mosquito.png |
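The combination step can be reproduced with urljoin from Python's urllib.parse, which implements the standard rules for resolving a relative reference against a base. The values are the ones from the slide.

```python
from urllib.parse import urljoin

base = "http://www.example.com/"      # from <base href="...">
relative = "fig/mosquito.png"         # from <a href="...">

full = urljoin(base, relative)
print(full)  # http://www.example.com/fig/mosquito.png
```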
XML Base
|
|
|
|
XML Base provides similar feature |
|
By a reserved attribute |
|
<?xml version="1.0"?> |
|
<doc xml:base="http://eg.org/today/"> |
|
See <link xlink:type="simple" xlink:href="new.xml">the
news</link> |
|
Applies to attributes & descendants |
|
Can be overridden on descendants |
|
Final REC as of 2001-06-27 |
|
http://www.w3.org/TR/xmlbase/ |
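A minimal sketch of xml:base scoping, assuming Python's xml.etree.ElementTree and urllib.parse. The <archive> element, its override value, and the resolve_links helper are hypothetical additions made to show a descendant overriding the inherited base.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical document: the outer base comes from the slide,
# the <archive> override is invented for illustration.
DOC = """\
<doc xml:base="http://eg.org/today/"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <link xlink:type="simple" xlink:href="new.xml">the news</link>
  <archive xml:base="/2001/">
    <link xlink:type="simple" xlink:href="old.xml">older news</link>
  </archive>
</doc>
"""

def resolve_links(elem, base=""):
    """Return absolute URIs for every xlink:href, honoring xml:base scope."""
    # An element's own xml:base is itself resolved against the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    found = []
    href = elem.get(XLINK_HREF)
    if href is not None:
        found.append(urljoin(base, href))
    for child in elem:
        found.extend(resolve_links(child, base))
    return found

resolved = resolve_links(ET.fromstring(DOC))
print(resolved)
```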
Stylesheet attachment
|
|
|
|
Lets documents point to stylesheets |
|
Based on HTML <link
rel='stylesheet'> |
|
Multiple, anywhere in XML prolog |
|
May point to CSS, XSL, etc. |
|
Example: |
|
<?xml-stylesheet
alternate="yes" href="mystyle.css"
title="Medium" type="text/css"?> |
|
Equivalent of HTML:
<LINK
href="mystyle.css" title="Medium" rel="alternate
stylesheet" type="text/css"> |
|
REC: http://www.w3.org/TR/xml-stylesheet |
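The sketch below shows one way an application might read the stylesheet processing instruction, using xml.dom.minidom to find it in the prolog and a regular expression to split its pseudo-attributes. The stylesheet_links function is a hypothetical helper, and the pattern is a simplification that does not handle character references inside values.

```python
import re
from xml.dom import minidom, Node

DOC = """<?xml version="1.0"?>
<?xml-stylesheet alternate="yes" href="mystyle.css" title="Medium" type="text/css"?>
<doc/>"""

# name="value" or name='value' pairs inside the PI data (simplified).
PSEUDO_ATTR = re.compile(r"""([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

def stylesheet_links(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet PI in the prolog."""
    dom = minidom.parseString(xml_text)
    links = []
    for node in dom.childNodes:
        is_pi = node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == "xml-stylesheet":
            pairs = PSEUDO_ATTR.findall(node.data)
            links.append({name: dq or sq for name, dq, sq in pairs})
    return links

links = stylesheet_links(DOC)
print(links)
```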
|
|
|
|
XSL specification
|
|
|
|
|
Stylesheet language |
|
Based on ISO DSSSL and W3C CSS |
|
2 major pieces: |
|
XSLT: document transformation |
|
Builds on XPath (more later) |
|
Match elements, then construct output |
|
XSL-FO: Formatting objects |
|
To actually render blocks, fonts,
tables, etc. |
|
Hypermedia support unfinished (same as CSS) |
|
http://www.w3.org/TR/xsl/ |
Current XML organization
|
|
|
XML Plenary coordinates
several WGs |
XML-Linking specifications
|
|
|
|
XPath: expressions on infoset nodes |
|
REC: http://www.w3.org/TR/xpath |
|
XPointer: XPath + ranges, in URIs |
|
CR: http://www.w3.org/TR/WD-xptr |
|
XLink: gather locations to make links |
|
REC: http://www.w3.org/TR/xlink/ |
|
(XML Base) |
XML-Linking goals: end user
|
|
|
|
|
Links from un-writable documents |
|
Which is most of the Web, for any
person |
|
Perhaps the most important single
feature |
|
->Bidirectional and multi-ended
links |
|
->Annotations and annotation sharing |
|
Dynamic updates, patches, highlighting |
|
Precise link attachment in any media |
|
Large sets/databases of managed links |
|
An entirely new market for links per se |
|
Anyone can publish/sell their
commentary |
Pointing vs. linking
|
|
|
|
|
In HTML, many things are combined: |
|
<a
href="eg.org/foo">wow</a> |
|
Technically: |
|
"eg.org/foo" is a pointer
(namely a URI) |
|
The abstract connection itself is the link |
|
The <a> element is a link
representation |
|
"wow" is the local anchor |
|
Anchors are also called link-ends |
|
Data at eg.org is the remote anchor |
|
HTML specifies the link behavior |
|
|
XPointer: locators
XLink: connections
|
|
|
|
|
Describes a relationship
of referenced location(s), |
|
To each other |
|
To descriptions |
|
XLink provides
some key ones |
XPointer…
|
|
|
|
Locates parts of XML resources |
|
Even things without IDs |
|
Even things that aren't whole nodes |
|
XPointer adds (beyond XPath): |
|
Way to refer to point and range
selections |
|
Way to use inside URI fragment
identifiers |
|
TEI “extended pointer” notation plus
XPath logical expressions |
|
Typically, a browser might load a
document and scroll to/highlight the part |
|
|
Anatomy of a URI reference
Fragment identifiers
|
|
|
|
Part of URIs after "#" |
|
Says where in document is actual target |
|
Separate form for each media type |
|
Identifiers for graphics ≠ those for text |
|
IETF MIME definition specifies form |
|
HTML |
|
To scroll to <a
name="coyote"> |
|
|
|
http://example.com/hello.html#coyote |
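Splitting a URI reference at the "#" is available directly in Python's urllib.parse, as this short sketch shows with the slide's example.

```python
from urllib.parse import urldefrag

# urldefrag separates the resource URI from the fragment identifier.
url, fragment = urldefrag("http://example.com/hello.html#coyote")

print(url)       # http://example.com/hello.html
print(fragment)  # coyote
```

Only the part before the "#" is sent to the server; interpreting the fragment is left to the client, according to the media type it receives.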
|
|
The 3 XPointer/XPath forms
|
|
|
|
|
Bare names |
|
An XML "name"* finds element
with that ID |
|
For (X)HTML compatibility |
|
HTML uses "NAME", not ID |
|
Child sequences |
|
Stepwise down through elements:
/1/4/27/2 |
|
May start with an ID: intro/4/3/2 |
|
Full XPointers |
|
scheme1(args) scheme2(args)… |
|
For now, the only "scheme" is
"xpointer" |
XPointer's 2 parts
|
|
|
|
Provide 'scheme' mechanism |
|
Identify media-specific pointer types |
|
Allow multiple ones to co-exist |
|
Pointing methods for XML |
|
Point to ranges, sets, id's, coords… |
|
Point descriptively |
|
|
XPointer schemes
|
|
|
|
|
Each media type needs pointer type |
|
pngRect(0,10 100,200) |
|
vrml(camera=1,2,3 light=4,50,500) |
|
map(W0°10’/ N51°30’) |
|
xml(…) |
|
Schemes label fragment identifier types |
|
#scheme1(args) scheme2(args)… |
|
Escape any extra ( ) -- tlg('^(apax') |
|
xpointer() is the first scheme |
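A sketch of how a client might split a fragment identifier into its scheme parts, assuming nested parentheses must balance and that "^" escapes the character that follows it. The split_schemes function is a hypothetical helper, not an API from any XPointer library.

```python
def split_schemes(fragment):
    """Split 'scheme1(args) scheme2(args)' into [(scheme, args), ...]."""
    parts = []
    i, n = 0, len(fragment)
    while i < n:
        if fragment[i].isspace():          # skip whitespace between parts
            i += 1
            continue
        open_at = fragment.find("(", i)
        if open_at == -1:
            raise ValueError("expected scheme(args)")
        scheme = fragment[i:open_at]
        depth, args, j = 1, [], open_at + 1
        while j < n and depth > 0:
            ch = fragment[j]
            if ch == "^" and j + 1 < n:    # escaped character: take it literally
                args.append(fragment[j + 1])
                j += 2
                continue
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            if depth > 0:                  # the closing paren is not part of args
                args.append(ch)
            j += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses")
        parts.append((scheme, "".join(args)))
        i = j
    return parts

print(split_schemes("gif(0,0 1,1) xpointer(id(chap1))"))
print(split_schemes("tlg('^(apax')"))
```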
|
|
Multiple schemes in a URL?
|
|
|
|
When a server responds to a URI, it |
|
Checks what media the client can handle |
|
Picks one of those to send |
|
“content negotiation” |
|
If a visually-impaired user clicks |
|
<a
href="http://www.example.com/foo.gif#gif(0,0 1,1) xpointer(id(chap1))"> |
|
The server may fall back to an XML file |
|
The client tries fragment identifiers
left-to-right, and uses the first one that works |
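The left-to-right rule can be sketched as a simple loop over already-split scheme parts. Everything here is hypothetical: the first_that_works function, the handler table, and the handler's return value only stand in for whatever a real client would do with each scheme.

```python
def first_that_works(parts, handlers):
    """Try each (scheme, args) in order; return the first successful result.

    `handlers` maps a scheme name to a function that returns a target,
    or None when the pointer cannot be applied to the received resource.
    """
    for scheme, args in parts:
        handler = handlers.get(scheme)
        if handler is None:          # scheme not understood for this media type
            continue
        target = handler(args)
        if target is not None:
            return scheme, target
    return None

# Fragment from the slide, already split into scheme parts.
parts = [("gif", "0,0 1,1"), ("xpointer", "id(chap1)")]

# Suppose the server sent the XML fallback, so only xpointer() applies.
xml_handlers = {"xpointer": lambda args: "element located by " + args}

print(first_that_works(parts, xml_handlers))
```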
Anatomy of a location step
Summary: axes and functions
|
|
|
|
|
|
root( ), id( ) |
|
parent, self, child |
|
ancestor, ancestor-or-self |
|
descendant, descendant-or-self |
|
preceding-, following-sibling |
|
preceding, following |
|
attribute, namespace |
|
here( ), origin( ) |
|
string-range(), range-to() |
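A few of these axes can be tried with the path support in Python's xml.etree.ElementTree. Note the assumption: ElementTree implements only a small subset of XPath (child, descendant, parent, attribute tests, and position), and none of the XPointer extensions such as here(), origin(), string-range(), or range-to(). The sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = (
    "<doc>"
    "<sec id='a'><p>one</p><p>two</p></sec>"
    "<sec id='b'><p>three</p></sec>"
    "</doc>"
)
root = ET.fromstring(DOC)

# child axis: <sec> children of the context node
children = [e.get("id") for e in root.findall("sec")]

# descendant axis: every <p> at any depth
descendants = [e.text for e in root.findall(".//p")]

# attribute test: the <sec> whose id attribute is 'b'
by_attribute = root.find(".//sec[@id='b']")

# parent axis: the parent of the first <p> found
parent_of_first_p = root.find(".//p/..").get("id")

# position: the second <p> child of a <sec>
second_p = root.find("sec/p[2]").text

print(children, descendants, parent_of_first_p, second_p)
```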
Counting locations
Points and Ranges
|
|
|
|
|
Point |
|
What you get by click-selection |
|
Gap before/after node or char |
|
Range |
|
What you get by drag-selection |
|
From a start point to an end point |
|
Not generally a well-formed (WF) XML subtree |
|
May partially contain some elements: |
|
<p>Hello,
world.</p><p>Hi, back</p> |
|
Crucial for creating hypertext links |
|
How often do you click/drag exactly one
entire element? |
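To see why a range is not generally a well-formed subtree, the sketch below cuts a drag-style selection out of the slide's two paragraphs and hands it to a parser. Using string offsets as the two points is a simplifying assumption for illustration; XPointer defines points relative to nodes, not to the serialized text. The is_well_formed helper is hypothetical.

```python
import xml.etree.ElementTree as ET

TEXT = "<doc><p>Hello, world.</p><p>Hi, back</p></doc>"

# A selection dragged from "world" in the first paragraph
# to the end of "Hi" in the second one.
start = TEXT.index("world")
end = TEXT.index("Hi") + len("Hi")
selection = TEXT[start:end]

def is_well_formed(fragment):
    """True if the fragment parses as a single XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(selection)
print(is_well_formed(selection))
print(is_well_formed("<r>" + selection + "</r>"))   # wrapping does not help
print(is_well_formed("<p>Hi, back</p>"))            # a whole element does parse
```

The selection partially contains both <p> elements, so no wrapper turns it into a tree; a range has to be described by its two end points instead.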
|
|
XLink is a language that...
|
|
|
|
Lets you invent your own linking
elements and their meanings |
|
In keeping with XML approach overall |
|
Lets you create link databases |
|
Links become first-class objects in the
model |
|
Provides some basic traversal behavior |
|
E.g., “Open the target in a new window” |
|
The rest is left to a style mechanism
such as XSL |
XLink terminology
|
|
|
|
Linking element |
|
Identifies, connects, and describes
anchors |
|
Locator |
|
Locates the data of some link end (anchor) |
|
Link end or anchor |
|
A data portion reachable as part of a
link |
|
Arc |
|
Explicit connection between two link
ends |
|
Resource |
|
Anything you can point at on the Web |
|
Using an arc is called Traversal |
What links do with link-ends
|
|
|
|
A link identifies where its ends are |
|
Using some kind of locators |
|
URI#XPointer will be the locator for
XML |
|
URI#scheme()scheme() in general |
|
A link attaches metadata to each end |
|
Its formal role in relation to the
other ends |
|
A title by which to refer to it (say,
in menus) |
|
Some traversal behaviors |
|
Arcs to say which traversals happen |
|
Link itself can also have type, other
info |
Inline links
|
|
|
Linking element itself (better, the
origin() end) is one of the link’s ends |
Out-of-line links
|
|
|
Linking element itself isn't
automatically made into one of its own resources |
Anatomy of an XML link
Arcs
|
|
|
|
Arcs specify traversal rules |
|
Multi-ended links may restrict travel
among their endpoints |
|
Restrictions generic or app-specific |
|
Arcs enable the description of both |
|
An arc is a pair of roles, plus
metadata |
|
Enables traversal between ends with the
given roles |
|
May be multiple locators per role
(useful for document assembly, multiple-choice travel) |
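One way to picture an arc as a pair of roles is the sketch below, which expands each arc into the concrete traversals it allows. The role names, the locator URIs, and the traversals function are all hypothetical, and arc metadata is left out for brevity.

```python
from itertools import product

# (role, locator) pairs; two locators share the "answer" role.
locators = [
    ("question", "quiz.xml#q1"),
    ("answer", "key.xml#a1"),
    ("answer", "key.xml#a1-alt"),
]

# Each arc is a (from-role, to-role) pair.
arcs = [("question", "answer")]

def traversals(locators, arcs):
    """Expand role-level arcs into (start, end) locator pairs."""
    pairs = []
    for from_role, to_role in arcs:
        starts = [loc for role, loc in locators if role == from_role]
        ends = [loc for role, loc in locators if role == to_role]
        pairs.extend(product(starts, ends))
    return pairs

allowed = traversals(locators, arcs)
for start, end in allowed:
    print(start, "->", end)
```

Because no arc runs from "answer" to "question", travel in that direction is not enabled, which is how a multi-ended link restricts traversal among its endpoints.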
Arc example: fuel-type
annotations
How to detect links
|
|
|
|
Could have any name and content at all |
|
<footnote>, <criticism>, … |
|
xlink:type attribute marks linking
elements for applications to find: |
|
<!ELEMENT footnote EMPTY>
<!ATTLIST footnote
xlink:type CDATA
#FIXED "simple"
xlink:href CDATA #REQUIRED> |
|
For example: ...has studied the
issue.<footnote xlink:href="http://www.doctools.com" /> |
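A sketch of how an application might find linking elements regardless of their names, by looking for the xlink:type attribute with Python's xml.etree.ElementTree. The sample document and the find_links helper are invented, and xlink:type is written out explicitly on each element because this sketch does not rely on the parser supplying #FIXED defaults from a DTD.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
XLINK_TYPE = f"{{{XLINK}}}type"
XLINK_HREF = f"{{{XLINK}}}href"

# Hypothetical sample document with two differently named linking elements.
DOC = """\
<doc xmlns:xlink="http://www.w3.org/1999/xlink">
  <p>...has studied the issue.<footnote xlink:type="simple"
      xlink:href="http://www.doctools.com"/></p>
  <criticism xlink:type="simple" xlink:href="review.xml">See the review</criticism>
  <note>Not a link</note>
</doc>
"""

def find_links(root):
    """Return (element name, link type, href) for every linking element."""
    return [
        (elem.tag, elem.get(XLINK_TYPE), elem.get(XLINK_HREF))
        for elem in root.iter()
        if elem.get(XLINK_TYPE) is not None
    ]

links = find_links(ET.fromstring(DOC))
for link in links:
    print(link)
```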
Arcs and Traversals
|
|
|
|
|
Traversal is split into: |
|
Behavior |
|
Author's intention for behavior of a
link. |
|
Input to style mechanism |
|
Not a presentation command |
|
Actuation |
|
Defines the event that triggers a link |
|
Events are very generic, intentionally |
Two kinds of behavior
policies
|
|
|
|
|
show attribute |
|
new to traverse and provide new
“context” |
|
replace to display in existing
“context” |
|
embed to display in the body of the
initiating resource |
|
Some semantic details are left
unspecified: combining multiple ends, style inheritance, etc. |
|
actuate attribute |
|
onRequest to require external request |
|
onLoad to traverse when link processed |
Link databases let you…
|
|
|
|
Attach descriptive information from
afar |
|
Annotate other people's stuff |
|
Maintain links more easily |
|
When a destination changes, you don’t
have to touch documents with links to it |
|
Engage in online commerce in links |
|
Express, package, and sell
point-of-view |
|
Collect out of line links as databases |
External Linksets
|
|
|
|
Users will have persistent linkdbs |
|
Subscriptions, interest groups,
private,... |
|
Document can specify relevant link dbs |
|
Linked by special type of extended link |
|
Included within regular documents too |
|
LinkDBs enable link management |
|
Needed to author using external links |
|
|
|
Example: Public annotations on…. |
An external Linkset Instance
|
|
|
<xls>
<linkbase xlink:href="linkset1.xml" />
<linkbase xlink:href="linkset2.xml" />
<linkbase xlink:href="linkset3.xml" /> |
|
</xls> |
|
|
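A client could collect the referenced link databases with a few lines of Python, as sketched below. The xlink namespace declaration is added here so that the instance parses; the element names xls and linkbase are taken from the slide as-is.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# The slide's instance, with the xlink prefix declared (an addition).
DOC = """\
<xls xmlns:xlink="http://www.w3.org/1999/xlink">
  <linkbase xlink:href="linkset1.xml"/>
  <linkbase xlink:href="linkset2.xml"/>
  <linkbase xlink:href="linkset3.xml"/>
</xls>
"""

# Gather the location of every link database the document points to.
linkbases = [e.get(XLINK_HREF) for e in ET.fromstring(DOC).iter("linkbase")]
print(linkbases)
```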