Re: SGML weirdness #743 [Was: Toward Closure on HTML]

Matt Timmermans/MSL (Matt_Timmermans@msl.isis.org)
Fri, 8 Apr 1994 05:41:21 -0400


[Matt Timmermans]
| Given these conditions, if the DTD were to be rewritten to use
| 'p' elements as containers instead of divisions, it would be
| possible to specify BOTH the start and end tags as omissible.
| When parsing current HTML documents with the new DTD, an SGML
| parser would usually infer the initial <p>, and infer a </p>
| before each explicit <p>.

[Daniel W. Connolly]
| Try it. It just doesn't work that way. An SGML parser can only
| infer required start tags.

It doesn't work with the DTD you provided, but my reasoning goes like this:
if <p> tags divide a body into paragraphs, then a body must contain at least
one paragraph (a body with no <p> tags is a single paragraph) and can consist
only of paragraphs. With this content model, it works just fine:

<!DOCTYPE TEST [
<!ELEMENT TEST O O (HEAD, BODY)>
<!ELEMENT HEAD O O (TITLE)>
<!ELEMENT TITLE - - (#PCDATA)>
<!ELEMENT BODY O O (P+)>
<!ELEMENT H1 - - (#PCDATA)>
<!ELEMENT P O O (H1|#PCDATA)*>
]>
<TITLE>testing</TITLE>
<H1>Here we go...</H1>
This is no problem now.
<p>This would be fine
<p>And so would this.
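
For illustration, here is a sketch of the same instance with every omitted
tag written out, assuming a parser that applies the OMITTAG rules as
described above (element names are from the test DTD; the exact placement
of line breaks is not significant here):

<TEST>
<HEAD>
<TITLE>testing</TITLE>
</HEAD>
<BODY>
<P><H1>Here we go...</H1>
This is no problem now.
</P>
<P>This would be fine
</P>
<P>And so would this.
</P>
</BODY>
</TEST>

The first <P> is inferred because BODY requires at least one P, and each
</P> is inferred when the parser meets a <p> that cannot occur inside P.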

The constraints on the content model I gave in my last article were too loose
(oops).

The DTD Dan gave implied that some things in HTML besides <p> tags can start
and end paragraphs. If this is a requirement, then it does cause problems.
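
To illustrate with a hypothetical fragment (not Dan's actual DTD): suppose
headings were siblings of paragraphs rather than content of them, so that
BODY and P in the test DTD above became:

<!ELEMENT BODY O O (H1|P)+>
<!ELEMENT P O O (#PCDATA)*>

As I understand the OMITTAG rules, a start tag can be inferred only when the
element is contextually required. Here P is just one alternative in an OR
group, so the parser would not supply <P> for the bare text after </H1>, and
the line "This is no problem now." should be reported as an error.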

</Matt>

Matt Timmermans | Phone: +1 613 727-5696
Microstar Software Ltd. | Fax: +1 613 727-9491
34 Colonnade Rd. North | BBS: +1 613 727-5272
Nepean Ontario CANADA K2E-7J6 | E-mail: mtimmerm@msl.isis.org