Re: SGML newline processing

Dan Connolly (connolly@pixel.convex.com)
Fri, 08 Jan 93 14:52:09 CST


>SHORTTAG is an optional feature, but SHORTREF is not, since
>it is required in the SGML declaration. I think, according
>to the standard, a system which does not support SHORTREF
>is not compliant and therefore not even minimum SGML.

Hmm... I've got the standard in my lap, and while it usually
takes me at least 1/2 hour to be sure I've read all the
relevant sections, it appears to agree with your statement.

However, we're only interested in parsing instances of
a particular DTD.

If we make no SHORTREF declarations in this DTD, we can
dispense with shortref processing in our parser.

>My solution only requires SHORTREF. I code:
>
><!ELEMENT newline - o EMPTY>
><!ENTITY nltag STARTTAG "newline">
><!SHORTREF nlmap "&#RS;" nltag>
><!USEMAP nlmap (verbatim)>
>
>The use of OMITTAG in the newline element is not
>necessary. This code causes the parser to recognize
>record starts as newline tags within verbatim tags.
>My processor converts the newline tags back to record
>starts.

Hmmm... this is interesting. First a question: why
have the newline element at all? Why not just make the
shortref expand to a newline character in the first
place?
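
To make the question concrete, here is a rough Python sketch of the
two steps as I understand them. It is only a toy, not an SGML parser:
the names expand_shortrefs and restore_record_starts are made up for
illustration, and I'm assuming a record start can be modelled as a
single "\n" character.

```python
# Toy model only: a record start (RS) is assumed to be a single "\n".
RS = "\n"
NEWLINE_TAG = "<newline>"  # what the nltag STARTTAG entity expands to


def expand_shortrefs(verbatim_content: str) -> str:
    """Parser side: replace each record start with a <newline> start-tag."""
    return verbatim_content.replace(RS, NEWLINE_TAG)


def restore_record_starts(parsed_content: str) -> str:
    """Processor side: turn the <newline> tags back into record starts."""
    return parsed_content.replace(NEWLINE_TAG, RS)


sample = "line one\nline two"
tagged = expand_shortrefs(sample)
restored = restore_record_starts(tagged)
print(tagged)
print(restored == sample)
```

In this toy model the round trip gives back the original text, which
is why I wonder whether the intermediate element is needed.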

If we put declarations like this in the HTML DTD, it
would then be legal to treat

<XMP>
foo
</XMP>

differently from <XMP>foo</XMP>
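
A rough Python sketch of the difference, under heavy assumptions:
xmp_content and with_newline_tags are hypothetical helpers, a line
break is modelled as a single "\n", and none of the real SGML rules
about record ends next to tags are modelled.

```python
import re


def xmp_content(document: str) -> str:
    """Return the raw text between <XMP> and </XMP> (toy extraction)."""
    match = re.search(r"<XMP>(.*?)</XMP>", document, re.DOTALL)
    if match is None:
        raise ValueError("no XMP element found")
    return match.group(1)


def with_newline_tags(content: str) -> str:
    """Mark every line break in the content with a <newline> tag."""
    return content.replace("\n", "<newline>")


multi_line = "<XMP>\nfoo\n</XMP>"
single_line = "<XMP>foo</XMP>"

marked_multi = with_newline_tags(xmp_content(multi_line))
marked_single = with_newline_tags(xmp_content(single_line))
print(marked_multi)
print(marked_single)
```

With the line breaks made visible as tags, the two forms no longer
look the same to whatever consumes the parser's output.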

Thanks for the idea... it just might work!

Dan