Talk:Creole

From LMNLWiki

Praise

This is great stuff!

  • Very glad you like it.

A couple of suggestions

  • You don't say anything about the namespace(s) of Creole documents. I think the Creole extensions should be in their own namespace and everything else should use the RELAX NG namespace. That way, Creole schemas can be cleverly written so they actually work, sort of, in RELAX NG processors. --John Cowan
    • Personally, I'd rather adopt the RELAX NG elements into the Creole namespace, because I think it will be painful, when writing Creole schemas, to have to remember which elements need which namespace. Also, one of the features of RELAX NG that I like is that you can use the default namespace for RELAX NG itself, and just declare namespaces (and prefixes) for the namespaces of your elements and attributes, and that would be lost to Creole if it used RELAX NG's namespace for the RELAX NG elements. I'm not convinced that interpreting a Creole schema as a RELAX NG schema would be worthwhile. I can see the point the other way, but you'd lose so much of a Creole schema if you interpreted it with a RELAX NG validator that it wouldn't be much use. Perhaps you can show me an example that will persuade me? --Jeni 16:13, 6 September 2006 (EDT)
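
For reference, a minimal sketch of what the mixed-namespace arrangement John suggests might look like. The RELAX NG namespace URI is the standard one; the Creole namespace URI is a placeholder, and the element and range names are invented for illustration. As far as I know, a RELAX NG processor treats elements from other namespaces as annotations and skips them, which is what would let such a schema "work, sort of".

```xml
<!-- Sketch only: the Creole namespace URI is a placeholder, not a confirmed value. -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:creole="http://example.org/ns/creole">
  <start>
    <element name="poem">
      <!-- Creole extension in its own namespace: a plain RELAX NG
           processor would skip this as a foreign element. -->
      <creole:range name="sentence">
        <text/>
      </creole:range>
      <!-- Ordinary RELAX NG patterns stay in the RELAX NG namespace. -->
      <oneOrMore>
        <element name="line"><text/></element>
      </oneOrMore>
    </element>
  </start>
</grammar>
```

Read by a RELAX NG validator, this would check only the line elements and lose the range constraint entirely, which is the loss Jeni describes in her reply.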

Unordering annotations and tags, and interleaving ranges

Hey Jeni,

I can't say enough about how good this looks. Obviously we won't know for sure till we've tried it out, but I can't see anything here that troubles me at first glance.

Among other virtues, one thing I really like here is that you're finding a way to make LMNL "play nice" with XML and element structures. Not only is that very useful but also I think it'll prove to be strategic, especially inasmuch as so much is already being done with milestone markup in XML -- without a way to parse or process it. This looks forward to validating stuff like that, both after and even (maybe even more amazingly) before conversion of such materials into LMNL.

  • Right! I like the way Creole fits with validating normal XML and RELAX NG: it provides an easy way in for users. The only problem is that different markup languages make different assumptions about attribute/annotation and tag ordering (in XML, the order of attributes is not significant, but the order of tags is; in LMNL it's the other way around). I've added a new page to discuss these and other issues. --Jeni 18:26, 7 September 2006 (EDT)

Interleaving and overlap

Another change would be much larger, so I propose it for you to consider. If we use the term "overlap" informally, shouldn't we perhaps have another term to designate the formal construct in Creole now being called "overlap"?

  • I'd be very happy to consider a name change for <overlap> because I'm not 100% happy with it either. The point with this pattern is that all its branches must cover all the content, and any range must appear in one of the branches, so in many ways it's quite close to CONCUR. Would we be inheriting too much baggage if we called it <concur>? Other possibilities are 'commix', 'commingle' and 'integrate' (too close to 'interleave'?). --Jeni 11:02, 8 September 2006 (EDT)
    • I think "concur" is good, based on the Trex precedent, which used it to mean exactly that. --John Cowan 18:02, 8 September 2006 (EDT)
    • I was hoping you'd like "merge" (which seems to me to describe what it does quite nicely) but "concur" has its virtues. While it does come with some baggage, I agree it is arguably no abuse of the term even in the technical (SGML) sense. (Indeed I dare say JC chose it for TReX in an implicit reference to SGML.) So if you guys like it let's go with it. --Wendell
      • "concur" in Trex is a little different, because it's a proper "and" (every element encountered has to appear in both branches) whereas the current "overlap" allows (indeed, enforces) that a range in one branch doesn't appear in the other branch. But I still like it as a term. — Jeni 03:48, 10 September 2006 (EDT)
      • I looked up "concur" in Trex and I agree, it's not quite the same. That tips me the other way: I still prefer "merge" as a term. It's unencumbered and it describes what the pattern does fairly well, I think. In Trex "concur" means "apply more than one pattern"; in SGML it distinguishes different hierarchies lexically (which we don't want). This is neither of those. --Wendell 18:43, 10 September 2006 (EDT)
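
To make the pattern being named here concrete, a rough sketch of the current <overlap> as described in the replies above: every branch covers all the content, and each range belongs to one branch only. The range names are invented for illustration, and the exact Creole syntax may differ.

```xml
<!-- Illustrative only: "page" and "chapter" are made-up range names. -->
<overlap>
  <!-- Branch 1: the whole content seen as a run of pages -->
  <oneOrMore>
    <range name="page"><text/></range>
  </oneOrMore>
  <!-- Branch 2: the same content seen as a run of chapters -->
  <oneOrMore>
    <range name="chapter"><text/></range>
  </oneOrMore>
</overlap>
```

Under this reading a chapter may begin partway through a page. A proper "and" of the kind attributed to Trex above would instead require each range to match both branches.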

I thought of "commingle" and looked at a thesaurus for alternatives to that rather rare and precious term. Possibilities: "merge" (my favorite so far), "mix" (too close to "mixed"), "intermix", "intermingle".

Using an alternative for the name of the pattern, the term "overlap" would be unencumbered and free to mean what happens to ranges using either this pattern or interleave.

While we're at it -- my RelaxNG is not subtle enough to say, but should we consider distinguishing interleave over elements and interleave over ranges as separate patterns? (The latter perhaps being "mingle"?)

  • I think we should try to avoid using separate patterns, because it makes the language less composable.
  • Hmm... your suggestion makes me wonder whether it would be better to have only <range> patterns plus an <adjacent> (or something) group that could be used anywhere to guarantee that certain patterns appear in sequence with no interruptions.
<element name="QName"> p </element>

would then be a shorthand for

<adjacent>
  <range name="QName"> p </range>
</adjacent>

I'll have to think about it.

  • Cool (wap). I'll now have to check out TReX and bone up on composability ... any suggestions of resources I can read?
    • I'm basing my understanding on The Design of RELAX NG, which says something about it. Basically, it's the idea that constructs like <group>, <choice> and <interleave> should work in the same way wherever they're used. (And, as a corollary, that you should use that construct wherever you need that behaviour.) — Jeni 03:48, 10 September 2006 (EDT)
    • Right. (Now I've looked up "composability" too.) And this is just why I wonder: is 'interleave' the same when it's ranges as when it's elements? I think composability might be improved if we have two names for things distinguished by their semantics, which these are. In other words, it's not merely that one form covers elements and the other, ranges: they're really two different ways of constraining ranges, one of which allows overlapping, the other not.
It's as if what Creole does is introduce two new patterns to RNG's: one that allows "merging" of two patterns (overlap, concur or what have you); the other that allows arbitrary mixing of overlapping ranges (one or many of each given type). Neither is quite like RNG's interleave. So composability is maintained if we distinguish them. (As so often I'm kind of thinking aloud. Yet I wonder whether distinguishing the two kinds of interleave proposed wouldn't also make other things easier -- a goal of composability -- such as a short syntax....) --Wendell 18:43, 10 September 2006 (EDT)
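
A rough sketch of the contrast in question, putting ordinary RELAX NG interleave next to interleaving over ranges. The name <mingle> is only the provisional one floated earlier in this thread, the <sketch> wrapper exists just to keep the fragment well-formed, and the element and range names are invented.

```xml
<!-- Provisional names from this discussion; not settled Creole syntax. -->
<sketch>
  <!-- RELAX NG interleave: title and author may come in either order,
       but each element stays whole and cannot overlap the other. -->
  <interleave>
    <element name="title"><text/></element>
    <element name="author"><text/></element>
  </interleave>

  <!-- Interleave over ranges ("mingle"): any number of each range type,
       and a sentence may start in one line and end in the next. -->
  <mingle>
    <zeroOrMore>
      <range name="sentence"><text/></range>
    </zeroOrMore>
    <zeroOrMore>
      <range name="line"><text/></range>
    </zeroOrMore>
  </mingle>
</sketch>
```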