Re: Toward Closure on HTML

William M. Perry (
Tue, 5 Apr 94 07:46 EDT

Stan Letovsky writes:
>Dan Connolly writes:
>>Third, the mechanism for expressing this in SGML, SHORTREF, introduces
>>significant complexity to parsing HTML. It opens up a can of worms including
>><em/foo/ and other tricky parsing idioms.
>>But I would like to introduce one change to the way P elements work: I'd
>>like to make the P element a paragraph container rather than a paragraph
>>separator. The only required change is to put a <p> tag at the beginning of
>>every paragraph -- we can use the OMITTAG feature to make </p> tags
>>implicit. It makes for a much cleaner DTD in many ways, and it just makes
>>more sense.
>In other words, you would rather have a language that is convenient to
>parse than one that is convenient to use. Big mistake. The <p> ... </p>
>construct is a big step in the wrong direction: it makes a simple
>construct like a paragraph, which was already well handled by a
>text-editor, into something onerous. It is not clear that your OMITTAG
>fix is backward compatible, since it suggests <p>'s at the front of
>paragraphs instead of at the back; if an incompatible change like this is
>possible, then why not double-\n?

The great thing about the HTML+ DTD is ... get ready... it is actually
parseable by commercial (and free) SGML editors/validators, unlike the old
HTML spec. So, hopefully more and more people will start using these
tools, and they won't have to worry about the 'oh-so-inconvenient' tags.

>But the bigger issue is that you are ignoring user preferences in favor of
>anal-retentive coder preferences: no one but a parser-writer would view
><p> ... </p> as an elegant way to say "this is a paragraph." Similarly for
><li>, etc.

How's this for a proposal:

1. Get rid of <p> tag and replace it with double-\n
2. Get rid of <li> tag and replace it with \t*
3. Get rid of <dd> tag and replace it with \n
4. Get rid of <dt> tag and replace it with \t*
5. etc
6. etc

That would increase the readability of the text immensely.
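
To make the separator-versus-container distinction in this thread concrete,
here is a minimal Python sketch. It is an illustration only, not code from
any browser or from the HTML+ DTD; the function names (split_paragraphs,
as_containers, as_separators) are hypothetical. It treats double-\n as the
paragraph break in plain text and emits either the proposed container form,
with </p> left implicit as OMITTAG would allow, or the older separator form.

```python
import html
import re


def split_paragraphs(text):
    """Split plain text into paragraphs on one or more blank lines (double-\\n)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Collapse internal line breaks and runs of whitespace within each paragraph.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def as_containers(paragraphs, explicit_close=False):
    """Container style: every paragraph begins with <p>.

    With explicit_close=False the </p> end tag is omitted, which is what
    the OMITTAG proposal would permit.
    """
    close = "</p>" if explicit_close else ""
    return "\n".join(
        "<p>" + html.escape(p, quote=False) + close for p in paragraphs
    )


def as_separators(paragraphs):
    """Separator style: a bare <p> sits between consecutive paragraphs."""
    return "\n<p>\n".join(html.escape(p, quote=False) for p in paragraphs)


if __name__ == "__main__":
    sample = "First paragraph,\nsecond line.\n\nSecond paragraph."
    paragraphs = split_paragraphs(sample)
    print(as_containers(paragraphs))
    print(as_separators(paragraphs))
```

The point of the sketch is that an authoring tool can accept the
blank-line convention on input and still write out container-style markup,
so the two preferences argued over above need not exclude each other.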