Talk:ECLIX

From LMNLWiki

I like the idea of specifying this separately from CLIX as such.

Some thoughts:

  • Annotation order:
    • When parsing ECLIX into LMNL, no problem. The attributes come out as annotations in some (undefined) order.
    • When writing LMNL as ECLIX, you get a warning about attribute order unless you have a (Creole-like?) assertion of some kind that the order of annotations doesn't matter.
  • Removing/reducing redundant pairs of sID/eID: making this part of CLIX would be good, I think. But currently as I understand it, the very presence of an sID/eID in CLIX warrants that the element is actually a CLIX event, so isn't this already implicit in CLIX? (Maybe not in our version.)
  • Representing ranges as elements: sounds good. Creole should be a help for this too. I'm envisioning a way of asserting that some ranges (types of range) are to be rendered as ECLIX elements, and validating the LMNL against such assertions.
  • Formally supersetting XML: yeah!

--Wendell 13:16, 16 September 2006 (EDT)

Right. Say we have

[foo}...[bar}...{foo]...{bar]

There are two ECLIX representations of this:

<foo>...<bar clx:sID="b" />...</foo>...<bar clx:eID="b" />
<foo clx:sID="f" />...<bar>...<foo clx:eID="f" />...</bar>

Which one is appropriate depends on what view you want of the document: are foos or bars the main things in the document? We need some way of specifying "views" to determine how the LMNL is translated into ECLIX. As Wendell says, one way would be through annotations on a Creole structure. Something like:

<interleave>
  <range name="foo" clx:elements="foo-view">
    <text />
  </range>
  <range name="bar" clx:elements="bar-view">
    <text />
  </range>
</interleave>

The clx:elements attribute holds a list of views in which the specified range is an element; if a particular view isn't specified in clx:elements then the relevant range is represented as a milestone in that view. When transforming the LMNL to ECLIX, you would specify which view you wanted (foo-view or bar-view) and the ranges would be transformed into elements or milestones as appropriate. You could have similar attributes on <annotation> elements to indicate whether the annotation should be turned into an attribute or an element.

Would that be a usable approach? --Jeni 08:25, 2 October 2006 (EDT)

I think so, but we'll have to try. (Sorry for the belated comments, even if they're just cheerleading.) --Wendell 10:15, 19 November 2006 (EST)
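
A rough sketch of how the view selection described above might work, in Python. Everything named here is made up for illustration: the ELEMENT_VIEWS table stands in for the clx:elements annotations on the Creole structure, the event tuples are an assumed in-memory form of the LMNL ranges, and to_eclix is a hypothetical function, not part of any existing tool. It also assumes that the ranges rendered as elements nest properly in the chosen view, and does not check this.

```python
from xml.sax.saxutils import escape

# Hypothetical stand-in for the clx:elements annotations:
# range name -> set of views in which that range is rendered as an element.
ELEMENT_VIEWS = {"foo": {"foo-view"}, "bar": {"bar-view"}}


def to_eclix(events, view, element_views=ELEMENT_VIEWS):
    """Serialize LMNL range events as ECLIX for the requested view.

    events is a list of (kind, value, range_id) tuples, where kind is
    "start", "end" or "text". For "text" events, value is the text;
    otherwise it is the range name. Ranges rendered as elements are
    assumed to nest properly within the chosen view (not checked).
    """
    out = []
    for kind, value, range_id in events:
        if kind == "text":
            out.append(escape(value))
            continue
        if view in element_views.get(value, set()):
            # This range is an element in this view: ordinary tags.
            out.append(f"<{value}>" if kind == "start" else f"</{value}>")
        else:
            # Otherwise it becomes a pair of empty milestone elements.
            attr = "sID" if kind == "start" else "eID"
            out.append(f'<{value} clx:{attr}="{range_id}" />')
    return "".join(out)


# [foo}...[bar}...{foo]...{bar]
EVENTS = [
    ("start", "foo", "f"), ("text", "...", None),
    ("start", "bar", "b"), ("text", "...", None),
    ("end", "foo", "f"), ("text", "...", None),
    ("end", "bar", "b"),
]

print(to_eclix(EVENTS, "foo-view"))
print(to_eclix(EVENTS, "bar-view"))
```

With the foo/bar example, asking for foo-view gives the first of the two ECLIX representations shown earlier and bar-view gives the second.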

LIX

I wonder whether it would be more accurate to call ECLIX "LIX" (LMNL in XML), with CLIX being the canonical variant.

LIX could be designed as follows:

  1. Every XML document is a LIX document
  2. Every CLIX document is a LIX document
  3. The role of each element is identified by a lix:role attribute. LIX is designed such that any XML document can be turned into an equivalent LMNL document by adding lix:role attributes to the elements. Valid roles are:
    • atom
    • range
    • start-range
    • end-range
    • annotation
    • start-annotation
    • end-annotation
    • wrapper: used for elements that are useful in XML, primarily for separating meta-data from content, but unnecessary in LMNL, such as the <head> element in HTML
  4. If it's omitted, the default value for lix:role depends on the context of the element:
    • if the parent element has a role of atom, start-range, end-range, start-annotation or end-annotation then the element has a default role of annotation
    • otherwise, the default role is range
  5. By default, an element with a role of start-range is matched with the next element with a role of end-range with the same name. The lix:sID and lix:eID attributes are only necessary if the match needs to be with a different end range.
  6. Attributes are converted into annotations
  7. Elements representing annotations or start and end annotations can appear as the first children or last children of an element representing a range (and wrappers are ignored when making this assessment)
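
Rules 4 and 5 could be sketched roughly as follows, in Python. This is only one reading of the rules, under several assumptions: the namespace URI is a placeholder (the proposal doesn't give one), resolve_roles and pair_ranges are hypothetical names, explicit lix:sID/lix:eID matching is not handled, and where several start-ranges of the same name are open, the oldest unmatched one is paired first.

```python
import xml.etree.ElementTree as ET

LIX_NS = "http://example.org/lix"  # placeholder URI, not from the proposal
ROLE = f"{{{LIX_NS}}}role"

# Parent roles under which an element defaults to "annotation" (rule 4).
ANNOTATION_PARENTS = {"atom", "start-range", "end-range",
                      "start-annotation", "end-annotation"}


def resolve_roles(root):
    """Return {element: role}, applying the rule 4 defaults where lix:role is omitted."""
    roles = {}

    def visit(elem, parent_role):
        role = elem.get(ROLE)
        if role is None:
            role = "annotation" if parent_role in ANNOTATION_PARENTS else "range"
        roles[elem] = role
        for child in elem:
            visit(child, role)

    visit(root, None)
    return roles


def pair_ranges(root, roles):
    """Pair start-range elements with later end-range elements of the same name (rule 5).

    Assumption: the oldest unmatched start-range of a given name is
    paired first. Explicit lix:sID / lix:eID overrides are ignored.
    """
    waiting = {}  # element name -> unmatched start-range elements, oldest first
    pairs = []
    for elem in root.iter():  # document order
        role = roles[elem]
        if role == "start-range":
            waiting.setdefault(elem.tag, []).append(elem)
        elif role == "end-range" and waiting.get(elem.tag):
            pairs.append((waiting[elem.tag].pop(0), elem))
    return pairs


DOC = f"""<html xmlns:lix="{LIX_NS}">
  <head lix:role="wrapper"><title lix:role="annotation">Example</title></head>
  <body lix:role="wrapper">
    <p id="para"><span class="phrase">Here is a <span class="phrase"
    lix:role="start-range" />simple paragraph</span> with overlapping
    ranges<span class="phrase" lix:role="end-range" />.</p>
  </body>
</html>"""

root = ET.fromstring(DOC)
roles = resolve_roles(root)
pairs = pair_ranges(root, roles)
for elem, role in roles.items():
    print(elem.tag, role)
```

Elements under a wrapper fall through to the "otherwise" case of rule 4, so the p inside the wrapper body still defaults to range.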

Translation from LIX to LMNL is not reversible (wrappers are not preserved, for example). However, translation from LMNL to CLIX is.

Example:

<html>
  <head lix:role="wrapper">
    <title lix:role="annotation">Example LIX to LMNL translation</title>
  </head>
  <body lix:role="wrapper">
    <h1>Example LIX to LMNL translation</h1>
    <p id="para">
      <span class="phrase">Here is a <span class="phrase" lix:role="start-range" />simple 
      paragraph</span> with overlapping ranges<span class="phrase" lix:role="end-range" />.
     </p>
  </body>
</html>

becomes:

[html [title}Example LIX to LMNL translation{title]}
  [h1}Example LIX to LMNL translation{h1]
  [p [id}para{]}
    [span=p1 [class}phrase{]}Here is a [span=p2 [class}phrase{]}simple 
    paragraph{span=p1] with overlapping ranges{span=p2].
  {p]
{html]

My thinking about this whole idea has been refined considerably since I first read (and approved of) this proposal. In general I still really like it, as a clean way of mapping XML to LMNL. But I think ECLIX as presently specified is more clearly aligned with current XML practice (milestone-delimited overlap), inasmuch as it is a specific and validatable variant of a milestone-delimiting format that can be converted into CLIX "lights-out", with an off-the-shelf transformation.

Accordingly, I've adopted the term LIX as a generic term encompassing *any* XML representation of ranges, including overlapping ranges, as mapped to LMNL. This proposal can be accorded special status as a way of designing a transformation (or attribution regimen) to do this; but ECLIX still has a role (and indeed, converting any arbitrary XML/LIX to ECLIX is somewhat easier to engineer than converting it to CLIX).

-- Wendell 14:56, 21 September 2007 (BST)