XML-safe LMNL
For the time being, there may be interest in a subset of LMNL suitable for straightforward casting into XML. For this purpose, we assume that the most interesting and useful feature of LMNL is its representation of overlap, and that other features (most significantly, structured annotations) can be done without or worked around in an XML context. Given a representation of overlapping ranges in XML by any of a number of methods (such as ECLIX, segmented elements, or milestones), a LMNL document can be considered "XML-safe" if it observes the following restrictions:
- There is an explicit range enclosing the entire document, which can be mapped to an XML document element. (Note: a LMNL document that fails only this restriction may still map to an XML external parsed entity.)
- Names of ranges, annotations and atoms all conform to XML naming rules; in particular, there are no anonymous ranges, annotations or atoms.
- Annotation order is not significant on any range.
- Annotations are uniquely named within their owner range.
- Annotations have only simple plain-text values, without ranges.
- Annotations have no annotations.
A LMNL document that conforms to these restrictions can be mapped to XML with ranges represented as elements (albeit segmented or milestoned) and their annotations represented as attributes on those elements. This specification, however, does not determine how the mapping is to occur. In particular, it does not determine which of the many potential dominance hierarchies among ranges in a LMNL document should be explicitly represented in the XML element structure.
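As an illustration only, here is a minimal Python sketch of one possible mapping, in which the enclosing range becomes the document element and every other range becomes a pair of empty milestone elements. Since the mapping is deliberately left open here, everything in the sketch is an assumption: the function name `to_milestoned_xml`, the tuple layout for ranges, and the `sID`/`eID` attribute names are chosen for the example and are not claimed to be ECLIX or CLIX syntax. Range and annotation names are assumed to be valid XML names already, and ranges are assumed to be non-empty.

```python
from xml.sax.saxutils import escape, quoteattr


def to_milestoned_xml(text, root_name, ranges):
    """Serialize text plus overlapping ranges as XML with milestone pairs.

    ranges is a list of (name, start, end, attrs) tuples, where attrs maps
    annotation names to plain-text values (hypothetical layout).
    """
    events = []  # (offset, kind, tag); kind 0 = end marker, 1 = start marker
    for i, (name, start, end, attrs) in enumerate(ranges):
        if not 0 <= start < end <= len(text):
            raise ValueError(f"range {name!r} must be non-empty and inside the text")
        rid = f"r{i}"
        # Annotations become attributes; sorting is harmless because
        # annotation order is not significant in XML-safe LMNL.
        attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in sorted(attrs.items()))
        events.append((start, 1, f'<{name} sID="{rid}"{attr_text}/>'))
        events.append((end, 0, f'<{name} eID="{rid}"/>'))

    # At a shared offset, close ranges before opening new ones.
    events.sort(key=lambda e: (e[0], e[1]))

    out = [f"<{root_name}>"]
    pos = 0
    for offset, _kind, tag in events:
        out.append(escape(text[pos:offset]))
        out.append(tag)
        pos = offset
    out.append(escape(text[pos:]))
    out.append(f"</{root_name}>")
    return "".join(out)


example = to_milestoned_xml(
    "Hello wide world",
    "doc",
    [("a", 0, 10, {"who": "x"}), ("b", 6, 16, {})],
)
```

Here ranges `a` and `b` overlap on the word "wide", yet the result is well-formed XML because neither is represented as a containing element. A different mapping could instead promote one of the ranges to a real element and segment or milestone the other, which is exactly the choice of dominance hierarchy left undetermined above.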
With the exception of the rule that annotation order is not significant (a constraint on the semantics, not the model, of a LMNL document), these restrictions may be validated statically.
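The statically checkable restrictions could be tested along the following lines. This is a rough Python sketch over a hypothetical in-memory model: the `Range` and `Annotation` classes and the function `xml_safety_problems` are invented for the example and do not correspond to any existing LMNL library. The name pattern is a simplified ASCII approximation of XML naming rules, not the full XML `Name` production, and an empty string stands in for an anonymous range, annotation or atom.

```python
import re
from dataclasses import dataclass, field

# Simplified, ASCII-only stand-in for an XML name (an assumption; real
# XML names admit many more characters, including colons).
XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


@dataclass
class Annotation:
    name: str                                         # "" means anonymous
    value: str = ""                                   # plain-text value
    ranges: list = field(default_factory=list)        # ranges over the value
    annotations: list = field(default_factory=list)   # nested annotations


@dataclass
class Range:
    name: str                                         # "" means anonymous
    start: int
    end: int
    annotations: list = field(default_factory=list)


def xml_safety_problems(text_length, ranges, atom_names=()):
    """Return a list of violations; an empty list means no violation found."""
    problems = []
    if not any(r.start == 0 and r.end == text_length for r in ranges):
        problems.append("no range encloses the entire document")
    for name in atom_names:
        if not XML_NAME.fullmatch(name):
            problems.append(f"atom name {name!r} is not an XML name")
    for r in ranges:
        if not XML_NAME.fullmatch(r.name):
            problems.append(f"range name {r.name!r} is not an XML name")
        seen = set()
        for a in r.annotations:
            if not XML_NAME.fullmatch(a.name):
                problems.append(f"annotation name {a.name!r} on {r.name!r} is not an XML name")
            if a.name in seen:
                problems.append(f"duplicate annotation {a.name!r} on {r.name!r}")
            seen.add(a.name)
            if a.ranges:
                problems.append(f"annotation {a.name!r} on {r.name!r} has ranges over its value")
            if a.annotations:
                problems.append(f"annotation {a.name!r} on {r.name!r} has annotations")
    return problems
```

The sketch makes no attempt to check that annotation order is insignificant, since that depends on how an application interprets the document rather than on anything present in the model.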
Note that ECLIX and CLIX are nevertheless intended to provide conventions for representing LMNL that is not XML-safe according to these rules.
Incidentally, these restrictions may also be useful in defining mappings from LMNL (or rather, a defined subset of LMNL) into other representations of overlapping structures over text, such as TexMECS, XConcur, Sekimo or many XML-based standoff annotation mechanisms.
— Wendell 15:50, 16 December 2008 (UTC)
