HTML Feature Test Entities, P as a container vs. separator

Daniel W. Connolly (connolly@hal.com)
Mon, 11 Apr 1994 14:06:05 -0500


On the Fate of P:

I gather that the general opinion is that HTML document
structure should look like:

t ...

head

p with emphasis in it

Unfortunately, it is commonly coded as:

<TITLE>t</TITLE>
<H1>head</H1>
p with <em>emphasis</em> in it
<ul>
<li>item 1
<li>item 2
</ul>

The unfortunate part is that there's no DTD (well, none that I can find)
that will enable a conforming SGML parser to infer that structure from
that document. However, if folks are willing to put <P> tags at the
_beginning_ of every paragraph, it can be done.

My current solution is:
(1) Docs lacking <P> start tags are supported in a backwards
compatible mode of the DTD, a la:

<!DOCTYPE HTML [
<!ENTITY % HTML.pSeparator "INCLUDE">
<!ENTITY % html PUBLIC "-//connolly hal.com//DTD WWW HTML 1.8//EN">
%html;
]>
<title>backwards compatible mode</title>
<H1>header</H1>
para 1
<p>
para 2

In this mode, the text of the paragraphs is content of the BODY element,
and the P elements are empty, a la:

back.. ...

head

para1

para2

(2) In the standard usage of the DTD, paragraphs are containers
and require explicit start tags, a la:

<!DOCTYPE HTML "-//connolly hal.com//DTD WWW HTML 1.8//EN">
<title>backwards compatible mode</title>
<H1>header</H1>
<p>para 1
<p>para 2

The parser infers:

back.. ...

head

para1

para2
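
The difference between the two readings can be sketched outside SGML.
The Python below is only a toy illustration, not how an SGML parser
works: the function names and the regex-based splitting are assumptions
made for this sketch, and it looks at nothing but <p> start tags.

```python
import re

# Matches a bare <p> start tag in either case (attributes are not handled).
P_TAG = re.compile(r"<p>", re.IGNORECASE)


def paragraphs_as_separator(body):
    """Separator reading: <p> is an empty element between runs of text."""
    chunks = P_TAG.split(body)
    # Every non-blank run of text counts as a paragraph, tagged or not.
    return [c.strip() for c in chunks if c.strip()]


def paragraphs_as_container(body):
    """Container reading: every paragraph must begin with a <p> start tag."""
    chunks = P_TAG.split(body)
    leading = chunks[0].strip()
    if leading:
        # Text ahead of the first <p> belongs to no paragraph in this reading.
        raise ValueError("text before the first <p> start tag: %r" % leading)
    # Blank chunks are dropped to keep the sketch simple.
    return [c.strip() for c in chunks[1:] if c.strip()]
```

Both functions return ["para 1", "para 2"] for their own example above,
but the container reading rejects the separator-style body because
"para 1" appears before any <p> start tag.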

Here are the current feature test macros:

<![ %HTML.Minimal [
<!ENTITY % HTML.linkRelationships "IGNORE">
<!ENTITY % HTML.linkMethods "IGNORE">
<!ENTITY % HTML.linkRedundantInfo "IGNORE">
<!ENTITY % HTML.forms "IGNORE">
<!-- @@ nested lists -->
<!-- @@ phrases -->
]]>

<![ %HTML.Obsolete [
<!ENTITY % HTML.PLAINTEXT "INCLUDE">
<!ENTITY % HTML.titleCDATA "INCLUDE">
<!ENTITY % HTML.litCDATA "INCLUDE">
<!ENTITY % HTML.NEXTID "INCLUDE">
<!ENTITY % HTML.font-phrase "INCLUDE">
<!ENTITY % HTML.anchorNameCDATA "INCLUDE">
<!ENTITY % HTML.pSeparator "INCLUDE">
]]>

<!ENTITY % HTML.pSeparator "IGNORE"
-- use P element as paragraph separator, rather than container.
This means not all paragraphs need to start with a <P> tag.
-->

<!ENTITY % HTML.linkRelationships "INCLUDE"
-- Adding markup to links to show the relationship between
ends of a link
see http://info.cern.ch/hypertext/WWW/MarkUp/Relationships.html
-->

<!ENTITY % HTML.linkMethods "INCLUDE"
-- Adding markup to links to show the methods supported
by the referent object
see http://info.cern.ch/hypertext/WWW/MarkUp/Elements/A.html
-->

<!ENTITY % HTML.linkRedundantInfo "INCLUDE"
-- Adding markup to links to give redundant information
like URN, content type, title...
-->

<!ENTITY % HTML.anchorNameCDATA "IGNORE"
-- Anchor names should be distinct. SGML parser can validate
this if the NAME attribute of the A element is declared as ID.
But that restricts the syntax of an anchor name to an SGML name,
i.e. a letter followed by letters, numbers, periods and dashes,
up to NAMELEN (34) characters long.
-->

<!ENTITY % HTML.PLAINTEXT "IGNORE"
-- Support for the <PLAINTEXT> tag as a sign of the
end of the HTML data stream and the beginning of a stream
of text/plain data
-->

<!ENTITY % HTML.titleCDATA "IGNORE"
-- Is the TITLE element #PCDATA, RCDATA, or CDATA content?
On Mosaic, it's #PCDATA, but in the linemode browser,
it's more like CDATA, but not quite.
-->

<!ENTITY % HTML.NEXTID "IGNORE"
-- Used by the NeXT implementation to keep track of the
next anchor id to use
-->

<!ENTITY % HTML.font-phrase "IGNORE"
-- allow B, I, TT, U outside PRE,
CITE, VAR, etc. inside PRE
-->

<!ENTITY % HTML.litCDATA "IGNORE"
-- treat XMP, LISTING as CDATA, as per linemodeWWW
-->

<!ENTITY % HTML.forms "INCLUDE"
-- Support for forms as per
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html
-->
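
These switches rely on the SGML rule that when an entity is declared
more than once, the first declaration is the one that counts. That is
why a declaration in the document's internal subset, or one inside an
enabled marked section such as %HTML.Obsolete, takes precedence over
the default that follows it in the DTD. A toy model in Python, assuming
a hypothetical resolve() helper and a flat list of (name, value)
declarations in the order a parser would see them:

```python
def resolve(declarations):
    """Return the effective value of each entity: the first declaration wins."""
    effective = {}
    for name, value in declarations:
        # setdefault keeps an existing value, so later redeclarations are ignored.
        effective.setdefault(name, value)
    return effective


# Internal subset first (as in the backwards compatible example),
# then the defaults from the DTD.
declarations = [
    ("HTML.pSeparator", "INCLUDE"),  # from the document's internal subset
    ("HTML.pSeparator", "IGNORE"),   # default in the DTD
    ("HTML.forms", "INCLUDE"),       # default in the DTD
]
effective = resolve(declarations)
```

Here effective["HTML.pSeparator"] is "INCLUDE", so the P-as-separator
declarations would be selected, while HTML.forms keeps its default.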

If you're interested, see
http://www.hal.com/%7Econnolly/drafts/html-design.html
for background etc., and
http://www.hal.com/%7Econnolly/html-test/html.dtd
http://www.hal.com/%7Econnolly/html-test/html.decl
http://www.hal.com/%7Econnolly/html-test/ISOlat1.sgml
for the DTD itself.

Daniel W. Connolly "We believe in the interconnectedness of all things"
Software Engineer, Hal Software Systems, OLIAS project (512) 834-9962 x5010
<connolly@hal.com> http://www.hal.com/%7Econnolly/index.html