CLIX
CLIX stands for Canonical LMNL In XML. It is one way to encode a LMNL data model into an XML 1.0 document.
- The idea behind CLIX was first proposed by Steve DeRose at Extreme Markup Languages 2004 (Montréal)[1], based on work at OSIS (the Open Scripture Information Standard). Further refinements were proposed in 2005 by Syd Bauman under the name "HORSE".[2] Because the premise of a canonical representation for LMNL in XML was developed later, these earlier proposals are now better reflected in ECLIX than in CLIX per se.
CLIX provides a fully general biunique mapping between a LMNL data model and an XML document, up to namespace prefixes. ECLIX is an extension of CLIX to provide more natural XML representations of LMNL when possible.
This document describes how to serialize a LMNL data model as a CLIX document; the reverse process should be easy to deduce.
Some CLIX examples are available: see the LMNL Examples page.
More about CLIX can be found on the XML Pipelining page. Also see ECLIX.
Also relevant to this specification is XML-safe LMNL.
General Concept
The general concept of CLIX mapping is that each range tag and atom (other than a character atom) is represented by an XML element. The expanded name of the range or atom becomes the element name, using the namespace mapping described under Root Element and Namespaces.
If the tag or atom has no annotations, the element is empty; if any annotations exist, they are placed in order within the content of the element. Annotations within annotations likewise go into the element that represents the outer annotation.
Root Element and Namespaces
The CLIX namespace is used for the root element of every CLIX document, clix:clix. The namespace name is http://lmnl.net/clix. All namespaces used in the LMNL model that is being serialized as CLIX must also be declared on the clix:clix element. They may use the same prefix as in the original LMNL document, or any other prefixes chosen by the implementation.
LMNL names in the default-default namespace are represented using XML names in no namespace.
If the CLIX namespace is used in the LMNL document itself, the document cannot be serialized as CLIX (see Open Issues).
The xml:base attribute can appear on the clix:clix element to specify the base URI of the LMNL data model.
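As a rough illustration of the rules in this section, the following Python sketch builds a clix:clix root with the standard library's xml.etree.ElementTree. The function name new_clix_root, the example namespace http://example.org/poem, the prefix p, and the base URI are all made up for the illustration; only the CLIX namespace name comes from this page. The two child elements use clix:role="atom" (see CLIX Attributes) purely so that there is something in each namespace to serialize.

```python
import xml.etree.ElementTree as ET

CLIX_NS = "http://lmnl.net/clix"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def new_clix_root(lmnl_namespaces, base_uri=None):
    """Return an empty clix:clix element.

    lmnl_namespaces maps prefix -> namespace name for the LMNL model
    (a hypothetical input shape chosen for this sketch).
    """
    if CLIX_NS in lmnl_namespaces.values():
        # The model uses the CLIX namespace itself, so it cannot be serialized.
        raise ValueError("LMNL model uses the CLIX namespace")
    ET.register_namespace("clix", CLIX_NS)
    for prefix, uri in lmnl_namespaces.items():
        # Any prefix is allowed; here we simply reuse the caller's choice.
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{CLIX_NS}}}clix")
    if base_uri is not None:
        root.set(f"{{{XML_NS}}}base", base_uri)
    return root


root = new_clix_root({"p": "http://example.org/poem"},
                     base_uri="http://example.org/frost.lmnl")
role = f"{{{CLIX_NS}}}role"
# A LMNL name in the default-default namespace becomes an XML name in no namespace.
ET.SubElement(root, "note", {role: "atom"})
# A namespaced LMNL name keeps its namespace name; only the prefix is free.
ET.SubElement(root, "{http://example.org/poem}stanza", {role: "atom"})

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

ElementTree writes every namespace declaration it needs on the outermost element, which happens to match the requirement that all namespaces be declared on clix:clix. It only declares namespaces that actually occur in the tree.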
CLIX Attributes
The clix:role attribute indicates the role of an XML element in representing a LMNL data model. Possible values are start-range, end-range, start-annotation, end-annotation, and atom. Annotations are expressed in the same way as ranges.
The clix:sID attribute is used in an element that indicates the start of a range or annotation. It is an XML ID. If this attribute is present, a role of start-range is implied.
The clix:eID attribute is used in an element that indicates the end of a range or annotation. It is an XML IDREF, and therefore must match the clix:sID attribute in another element in the CLIX document. If this attribute is present, a role of end-range is implied.
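To make the pairing of clix:sID and clix:eID concrete, here is a minimal Python sketch that turns an ordered list of events into a flat CLIX document containing two overlapping ranges. The event tuples, the function name events_to_clix, the range names line and quote, and the IDs r1 and r2 are assumptions made for the example; they are not defined by this page. The clix:role attribute is left out because the presence of clix:sID or clix:eID already implies it.

```python
import xml.etree.ElementTree as ET

CLIX_NS = "http://lmnl.net/clix"
ET.register_namespace("clix", CLIX_NS)


def events_to_clix(events):
    """Build a clix:clix element from (kind, value, range_id) events.

    kind is "start", "end", or "text". For "start" and "end", value is the
    range name; for "text", value is the character data and range_id is None.
    """
    root = ET.Element(f"{{{CLIX_NS}}}clix")
    last = None  # most recent marker element; following text goes in its tail
    for kind, value, range_id in events:
        if kind == "text":
            if last is None:
                root.text = (root.text or "") + value
            else:
                last.tail = (last.tail or "") + value
        else:
            attr = "sID" if kind == "start" else "eID"
            last = ET.SubElement(root, value, {f"{{{CLIX_NS}}}{attr}": range_id})
    return root


# "line" ends inside "quote", so the two ranges overlap.
events = [
    ("start", "line", "r1"),
    ("text", "He said ", None),
    ("start", "quote", "r2"),
    ("text", "go", None),
    ("end", "line", "r1"),
    ("text", " away", None),
    ("end", "quote", "r2"),
]
doc = events_to_clix(events)
print(ET.tostring(doc, encoding="unicode"))
```

Each marker here is an empty element because none of the ranges carries annotations. The overlap causes no well-formedness problem, since the start and end of a range are separate elements linked only by their ID values.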
Identifiers
Every CLIX range and annotation must have a document-unique ID, expressed as the value of a clix:sID and a matching clix:eID, corresponding to its LMNL range ID.
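The identifier rule lends itself to a simple consistency check. The sketch below is one possible way to do it in Python, not a normative validator: it reports duplicate clix:sID values, clix:eID values with no matching clix:sID, and clix:sID values that are never closed. The function name check_ids and the two sample documents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

CLIX_NS = "http://lmnl.net/clix"
SID = f"{{{CLIX_NS}}}sID"
EID = f"{{{CLIX_NS}}}eID"


def check_ids(root):
    """Return a list of problems; an empty list means the IDs pair up."""
    problems = []
    starts = set()
    for elem in root.iter():
        sid = elem.get(SID)
        if sid is None:
            continue
        if sid in starts:
            problems.append(f"duplicate sID {sid!r}")
        starts.add(sid)
    ends = set()
    for elem in root.iter():
        eid = elem.get(EID)
        if eid is None:
            continue
        if eid not in starts:
            problems.append(f"eID {eid!r} has no matching sID")
        ends.add(eid)
    for sid in sorted(starts - ends):
        problems.append(f"sID {sid!r} has no matching eID")
    return problems


good = ET.fromstring(
    '<clix:clix xmlns:clix="http://lmnl.net/clix">'
    '<a clix:sID="r1"/>x<b clix:sID="r2"/>y<a clix:eID="r1"/>z<b clix:eID="r2"/>'
    '</clix:clix>'
)
bad = ET.fromstring(
    '<clix:clix xmlns:clix="http://lmnl.net/clix">'
    '<a clix:sID="r1"/>x<b clix:sID="r1"/>y<a clix:eID="r3"/>'
    '</clix:clix>'
)
print(check_ids(good))
print(check_ids(bad))
```

The check walks the whole tree with iter() rather than only the children of the root, because annotations sit inside the elements that represent their owners and carry IDs of their own.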
Anonymous Objects
Elements representing anonymous ranges, annotations, or atoms have the name clix:anon.
Character Atoms
A LMNL atom representing a character, and having no other annotations, is represented as an XML character provided it is allowed in XML 1.0 character content. If not, it is represented like other atoms.
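The test for "allowed in XML 1.0 character content" can be read as the Char production of the XML 1.0 Recommendation. The Python sketch below applies that production and splits a string into runs that can be written as ordinary text and single characters that would need to be written as atom elements. It covers only the character test, not the "no other annotations" condition, and the names is_xml10_char and split_character_atoms are hypothetical.

```python
def is_xml10_char(ch):
    """True if the single character ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def split_character_atoms(text):
    """Split text into ("text", run) and ("atom", char) pieces, in order."""
    pieces = []
    run = []
    for ch in text:
        if is_xml10_char(ch):
            run.append(ch)
        else:
            if run:
                pieces.append(("text", "".join(run)))
                run = []
            # Not legal in XML 1.0 content: must be represented like other atoms.
            pieces.append(("atom", ch))
    if run:
        pieces.append(("text", "".join(run)))
    return pieces


# U+000C (form feed) is not an XML 1.0 Char, so it is split out as an atom.
print(split_character_atoms("form\x0cfeed"))
# [('text', 'form'), ('atom', '\x0c'), ('text', 'feed')]
```

Characters such as < and & pass this test; they are legal in XML 1.0 content once escaped, which an XML serializer normally handles.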
Examples
todo
CLIX vs. XML Events
Differences between CLIX and Events Markup syntaxes:
- In Events Markup, the LMNL element name is carried in XML attributes, and the XML element name indicates the type of event. In CLIX, the LMNL element name is (almost always) the XML element name, and the type of event is carried in an attribute.
- In CLIX, start-tag-open and end-tag-open are mapped onto XML start tags and start-tag-close and end-tag-close are mapped onto XML end tags, rather than all four being mapped to XML empty elements.
- There is nothing in CLIX directly corresponding to the Events Markup text element; mixed content is used instead.
- (more?)
CLIX vs. ECLIX
Because start- and end-range tags are represented by elements in CLIX, it is noteworthy that a CLIX rendition of a LMNL document is "flat" with respect to its elements and text. To the extent the CLIX (XML) document is nested, it represents the attachment of annotations (with their annotations), not any nesting ("containment" or enclosure) of LMNL ranges within one another.
Users of XML may prefer a more conventional XML in which nested elements directly represent the nesting of structures within structures (paragraphs within chapters, items within lists, etc.). Rendering such documents, with ad hoc markers (XML milestones or "CLIX") for ranges that do not fit within the hierarchy, is what ECLIX is for.
Open Issues
If the CLIX namespace is used in a LMNL document, it can't be serialized.
What happens if a name contains characters other than XML 1.0 name characters? Fall back to XML 1.1? Use a clix:name attribute to hold the true name?
Are the differences between CLIX and Events Markup sufficiently important that we need both?
