XML Pipelining


Pipelining XML into LMNL

The standardized ECLIX and CLIX representations of LMNL documents make it possible, generally with a single simple transformation, to convert any XML document in which overlap is represented into a form that a LMNL processing framework can consume. With XML pipelining methods, LMNL processing can therefore be performed over document sets maintained in XML.

ECLIX is very close in design to the so-called "milestone" strategy for representing overlapping structures in XML. In that strategy, XML elements (or other constructs such as processing instructions) serve as start and end delimiters of ranges whose content spans multiple elements, or segments of elements, within the XML hierarchy. ECLIX is also a straightforward target for conversion from the other common representation of overlap, segmented and aligned elements. An XSLT transformation that converts a given convention for representing overlap in XML into ECLIX will usually need only a few simple templates.
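To make the milestone strategy concrete, here is a minimal sketch of how milestone-delimited ranges can be resolved into offsets over a document's text. It is written in Python purely for illustration; the stylesheets described on this page are XSLT 2.0, and this is not their code. The element names `start` and `end` and the attributes `name` and `id` are hypothetical and are not taken from the ECLIX vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical milestone convention (NOT the ECLIX vocabulary):
# <start name="..." id="..."/> opens a range, <end id="..."/> closes it.
SAMPLE = (
    '<doc>'
    '<p>Hello <start name="quote" id="q1"/>brave </p>'
    '<p>new<end id="q1"/> world</p>'
    '</doc>'
)


def milestones_to_ranges(xml_text):
    """Return (text, ranges); each range is (name, start_offset, end_offset)."""
    root = ET.fromstring(xml_text)
    chunks = []       # pieces of character content, in document order
    pos = 0           # current offset into the flattened text
    open_ranges = {}  # milestone id -> (name, start_offset)
    ranges = []

    def emit(s):
        nonlocal pos
        if s:
            chunks.append(s)
            pos += len(s)

    def walk(elem):
        if elem.tag == "start":
            # A start milestone opens a range at the current offset.
            open_ranges[elem.get("id")] = (elem.get("name"), pos)
        elif elem.tag == "end":
            # The matching end milestone closes it, wherever it occurs.
            name, start = open_ranges.pop(elem.get("id"))
            ranges.append((name, start, pos))
        else:
            # An ordinary element becomes a range over its own content.
            start = pos
            emit(elem.text)
            for child in elem:
                walk(child)
                emit(child.tail)
            ranges.append((elem.tag, start, pos))

    walk(root)
    # Order by start offset, longer ranges first when starts coincide.
    ranges.sort(key=lambda r: (r[1], -r[2]))
    return "".join(chunks), ranges


text, ranges = milestones_to_ranges(SAMPLE)
for name, start, end in ranges:
    print(name, start, end, repr(text[start:end]))
```

In this sample the `quote` range starts inside the first `p` and ends inside the second, so it overlaps both paragraphs without being contained in either. That is exactly the situation milestones are used to encode, and once every structure is expressed as a range over the text, hierarchy is no longer required.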

The CLIX format is somewhat closer than ECLIX to a straightforward rendition of the LMNL model in XML. Any ECLIX document can be converted into a CLIX document that is isomorphic to it with respect to the LMNL model, using an off-the-shelf XSLT transformation.

Download a package of stylesheets implementing this framework at http://www.lmnl.org/files/LMNL-XMLPipelining-xsl.zip. These transformations are coded in XSLT 2.0 and have been tested with Saxon 8.9.

Proposed architecture for XML Pipelining into LMNL

[Image: XMLPipelineArchitecture20070816.jpg]

Dashed lines indicate transformations or conversions that must be implemented by or on behalf of the user. Solid lines indicate generic transforms that work off the shelf.

Note: as of the date of writing, this architecture has been implemented via XSLT transforms that produce an XML representation of LMNL. The design of that representation has been successfully demonstrated but is still subject to refinement. The question mark "?" in the diagram indicates where the LMNL data model was still unsettled when this graphic was created.