Re: Toward Closure on HTML

Daniel W. Connolly (connolly@hal.com)
Thu, 07 Apr 1994 13:29:08 -0500


In message <Pine.3.85.9404071040.A8054-0100000@hmmm>, "Rob Raisch, The Internet Company" writes:
>
>Dave, that may be correct in principle, but in real life it opens a
>rather nasty can of worms. (Re: <p>Text -- infering or assuming the
>missing </p> endtag)

Inferring </p> tags would be easy. Well... at least the SGML standard says
how to do it in a way that's consistent with current practice in HTML.

It's the start tags (<p>) that cause trouble.

>When we get to a point where we support stylesheets (PLEASE!) it is of
>extreme importance to consider <p></p> a container. Without this, it is
>not possible to assign stylistic attributes to a contained element.

Counter-argument: The MidasWWW browser had a really nifty stylesheet-based
hypertext widget set, and it grokked empty P elements just fine. Something
like:
*HTML*BODY.font: ...
*HTML*BODY*P.breakBefore: True
*HTML*BODY*P.breakAfter: True

>Current practice suggests that <p> is not a container at all, it is a
>logical break -- or it is considered as a container with no contents. This
>is the behavior of available browsers, as I understand them.

Agreed.

>---------------------Example---------------------
><body>
>This is text with no container. (1)
><p>
>Perhaps this is text in a <p> container. (2)
><p>
>Hmmm... no </p> associated with the previous <p>! Do we assume that there
>was to be one, or do we treat <p> as a break? (3)
></body>
>-------------------------------------------------
>
>The principles behind SGML -- and by its lineage, HTML -- are to markup
>the structure of the document.

>In the previous example, what is the text associated with (1)? It is
><body> text or <p> text?

It is straightforward to construct DTDs where (1) is content of
the BODY element. The draft-iiir-html-01 version of the html DTD
did this. My recent html version 1.7.2.4 also does this.

I think it is impossible to construct a DTD where (1) is the
content of a P element without doing stuff like "The first
element of a BODY element must be a P."

> And if we build stylesheets which allow logical
>elements within the document to have their own stylistic "hints", which do
>we apply to (1)?

Body.

The declarations
<!ELEMENT BODY O O (#PCDATA|P|OL|UL|DL|H1...)>
<!ELEMENT P - O EMPTY>
are consistent with current practice.
I have considerable evidence to back that claim.

Parsing extant documents relative to declarations like
<!ELEMENT BODY O O (P|UL|OL|...) -- no #PCDATA -->
<!ELEMENT P - O (%htext)+>
results in errors.
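
To make the difference between the two sets of declarations concrete,
here is a rough Python sketch. It is not an SGML parser, and it only
approximates end-tag inference for this one example. The function names
and the abbreviated copy of Rob's example are made up for illustration.
It reports which element each run of text would belong to under each
reading of P:

```python
import re

# Abbreviated form of the quoted example document (assumption: the
# wording is shortened so the text itself contains no literal tags).
DOC = """<body>
This is text with no container. (1)
<p>
Perhaps this is text in a P container. (2)
<p>
No end tag for the previous P. (3)
</body>"""

TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)")

def tokens(doc):
    """Yield ('start', name), ('end', name) or ('text', data) tuples."""
    for m in TOKEN.finditer(doc):
        if m.group(3) is not None:
            text = m.group(3).strip()
            if text:  # ignore whitespace-only runs between tags
                yield ("text", text)
        elif m.group(1):
            yield ("end", m.group(2).lower())
        else:
            yield ("start", m.group(2).lower())

def owners_empty_p(doc):
    """P declared EMPTY: every text run is content of BODY."""
    return [("body", data) for kind, data in tokens(doc) if kind == "text"]

def owners_container_p(doc):
    """P declared as a container; its end tag is assumed at the
    next <p> or at </body>."""
    current = "body"
    out = []
    for kind, data in tokens(doc):
        if kind == "start" and data == "p":
            current = "p"      # closes any open P and opens a new one
        elif kind == "end" and data in ("p", "body"):
            current = "body"
        elif kind == "text":
            out.append((current, data))
    return out

def pcdata_errors(doc):
    """Text runs left directly in BODY when BODY has no #PCDATA."""
    return [data for owner, data in owners_container_p(doc) if owner == "body"]

if __name__ == "__main__":
    print(owners_empty_p(DOC))
    print(owners_container_p(DOC))
    print(pcdata_errors(DOC))
```

With P as EMPTY, all three runs of text land in BODY. With P as a
container and no #PCDATA in BODY, run (1) has nowhere to go, which is
the kind of error the second set of declarations produces.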

If there is sufficient motivation to change all the documents out
there to move #PCDATA out of BODY and into a subordinate paragraph
element (which I agree is a good idea), why not give that
element a new name like PP while we're at it?

Dan