On the Fate of P:

I gather that the general opinion is that HTML document
structure should look like:

t ...


p with emphasis in it

Unfortunately, the common way that this is coded is:

p with <em>emphasis</em> in it
<li>item 1
<li>item 2

The unfortunate part is that there's no DTD (well, none that I can find)
that will enable a conforming SGML parser to infer that structure from
that document. However, if folks are willing to put <P> tags at the
_beginning_ of every paragraph, it can be done.
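
As a rough sketch of why start tags are enough (these declarations are simplified for illustration and are not quoted from any DTD):

<!ELEMENT BODY O O (P | UL)* >
<!ELEMENT P    - O (#PCDATA | EM)* >
<!ELEMENT EM   - - (#PCDATA) >
<!ELEMENT UL   - - (LI)+ >
<!ELEMENT LI   - O (#PCDATA | EM)* >

The "- O" on P says the start tag is required and the end tag may be omitted, so the parser can close a paragraph when it meets the next <P> or a <UL>, neither of which is allowed inside P. Going the other way does not work: SGML only implies an omitted start tag for an element that is contextually required, and in a choice group like (P | UL)* nothing is required, so bare text in BODY cannot be assigned to a P.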

My current solution is:
(1) Docs lacking <P> start tags are supported in a backwards-compatible
mode of the DTD, a la:

<!ENTITY % HTML.pSeparator "INCLUDE">
<!ENTITY % html PUBLIC "-//connolly WWW HTML 1.8//EN">
<title>backwards compatible mode</title>
para 1
para 2

In this mode, the text of the paragraphs is content of the BODY element,
and the P elements are empty, a la:

back.. ...




(2) In the standard usage of the DTD, paragraphs are containers
and require explicit start tags, a la:

<!DOCTYPE HTML PUBLIC "-//connolly WWW HTML 1.8//EN">
<title>backwards compatible mode</title>
<p>para 1
<p>para 2

The parser infers:

back.. ...
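
Both modes can come out of one DTD. A sketch of the mechanism, assuming hypothetical entity names P.content and body.content and simplified content models (the real DTD may well be organized differently):

<!-- default; has no effect if the document already declared this entity -->
<!ENTITY % HTML.pSeparator "IGNORE">

<![ %HTML.pSeparator [
<!-- separator mode: P is empty, text sits directly in BODY -->
<!ENTITY % P.content    "EMPTY">
<!ENTITY % body.content "(#PCDATA | P | EM | UL)*">
]]>

<!-- container mode; has no effect if the section above was included -->
<!ENTITY % P.content    "(#PCDATA | EM)*">
<!ENTITY % body.content "(P | UL)*">

<!ELEMENT BODY O O %body.content; >
<!ELEMENT P    - O %P.content; >

This leans on two SGML rules: a marked section whose keyword resolves to IGNORE is skipped, and when an entity is declared more than once, the first declaration is the one that counts. Because a document's own declaration subset is read before the external DTD, declaring HTML.pSeparator as "INCLUDE" there, as in (1), overrides the default.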




Here are the current feature test macros:

<![ %HTML.Minimal [
<!ENTITY % HTML.linkRelationships "IGNORE">
<!ENTITY % HTML.linkMethods "IGNORE">
<!ENTITY % HTML.linkRedundantInfo "IGNORE">
<!-- @@ nested lists -->
<!-- @@ phrases -->
]]>

<![ %HTML.Obsolete [
<!ENTITY % HTML.font-phrase "INCLUDE">
<!ENTITY % HTML.pSeparator "INCLUDE">
]]>

<!ENTITY % HTML.pSeparator "IGNORE"
-- use P element as paragraph separator, rather than container.
This means not all paragraphs need to start with a <P> tag.
-->

<!ENTITY % HTML.linkRelationships "INCLUDE"
-- Adding markup to links to show the relationship between
ends of a link
-->

<!ENTITY % HTML.linkMethods "INCLUDE"
-- Adding markup to links to show the methods supported
by the referent object
-->

<!ENTITY % HTML.linkRedundantInfo "INCLUDE"
-- Adding markup to links to give redundant information
like URN, content type, title...
-->

-- Anchor names should be distinct. An SGML parser can validate
this if the NAME attribute of the A element is declared as ID.
But that restricts the syntax of an anchor name to an SGML name,
i.e. a letter followed by letters, numbers, periods and dashes,
up to NAMELEN (34) characters long.
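
For illustration, the two choices would look something like this (a sketch; the HREF line is only there to make the declaration look realistic, and only one of the two could appear in a given DTD):

<!-- validated: values must be unique SGML names -->
<!ATTLIST A
        NAME    ID      #IMPLIED
        HREF    CDATA   #IMPLIED >

<!-- unvalidated: any quoted string is accepted -->
<!ATTLIST A
        NAME    CDATA   #IMPLIED
        HREF    CDATA   #IMPLIED >

With the ID declaration, <A NAME="sec2.1"> is fine, but <A NAME="2nd section"> is an error, since it starts with a digit and contains a space.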

-- Support for the <PLAINTEXT> tag as a sign of the
end of the HTML data stream and the beginning of a stream
of text/plain data
-- Is the TITLE element #PCDATA, RCDATA, or CDATA content?
On Mosaic, it's #PCDATA, but in the linemode browser,
it's more like CDATA, but not quite.
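
The three candidates, sketched as alternative declarations (only one could appear in a given DTD):

<!-- tags and entity references are both recognized -->
<!ELEMENT TITLE - - (#PCDATA) >

<!-- entity references are recognized, tags are not -->
<!ELEMENT TITLE - - RCDATA >

<!-- neither is recognized -->
<!ELEMENT TITLE - - CDATA >

So in a title such as "Using <P> and &amp;", (#PCDATA) treats <P> as a tag, RCDATA keeps <P> as text but still expands &amp;, and CDATA leaves both alone. In the RCDATA and CDATA cases the content still ends at the first "</" followed by a letter.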

-- Used by the NeXT implementation to keep track of the
next anchor id to use

<!ENTITY % HTML.font-phrase "IGNORE"
-- allow B, I, TT, U outside PRE,
CITE, VAR, etc. inside PRE
-->

-- treat XMP, LISTING as CDATA, as per linemodeWWW

-- Support for forms as per

If you're interested, see
for background etc., and
for the DTD itself.

