Re: Adding new tags (was: Redefining...)

Joe English (jenglish@crl.com)
Mon, 13 Jun 1994 17:29:32 -0700


Daniel W. Connolly <connolly@hal.com> wrote:
> In message <199406131903.AA15666@crl.crl.com>, Joe English writes:
> >If HTML+ allows new elements to be declared in the
> >internal DTD subset, then browsers will pretty much
> >have to incorporate a full SGML parser.
>
> Not so. If you let documents say:
> <!ENTITY % cextra "| quark | lepton">
> but you keep the element declarations in the pre-compiled
> part of the DTD, e.g.
[ ... content model for %cextra; fixed in external DTD subset ... ]
> then you can introduce this "hook" processing without complicating
> the browsers' parsers terribly.

This is true: as long as the browser only needs to
recognize a predetermined set of content models, it
doesn't need to do a full DTD analysis.
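
To make the "hook" idea concrete, here is a rough sketch
(in Python, purely for illustration) of how a browser might
pick the extra element names out of the internal subset
without doing any real DTD analysis. The function name and
the regular expression are my own assumptions; this only
handles the single %cextra; declaration shown above, not
general SGML entity declarations.

```python
import re

# Matches only the one parameter entity the hook relies on,
# e.g. <!ENTITY % cextra "| quark | lepton">
CEXTRA_DECL = re.compile(r'<!ENTITY\s+%\s+cextra\s+"([^"]*)"\s*>')

def extra_elements(internal_subset):
    """Return the element names listed in the cextra entity, if any."""
    match = CEXTRA_DECL.search(internal_subset)
    if match is None:
        return []
    # The replacement text looks like "| quark | lepton"; split on the
    # "or" connector and drop the empty piece before the first bar.
    return [name.strip() for name in match.group(1).split("|") if name.strip()]

names = extra_elements('<!ENTITY % cextra "| quark | lepton">')
print(names)
```

The browser would then treat each returned name like any
other element allowed by the pre-compiled content model.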

[...]

> The way you've marked up your example is interesting though... it
> switches the tags and the attributes.
>
> ><p role=imho>
> >I like the name <em role=attname>ROLE</em> [...]
> </p>
> Using architectural forms the way HyTime uses them, this
> would be:
> <imho role=p>
> I like the name <attname role=em>ROLE</attname> [...]
> </imho>
>
> And actually, none of the role=... attributes would show up in
> the markup of the instance -- they'd be FIXED attributes in
> the DTD.

Ah! I'm looking at it another way.

In the scheme I'm thinking of, HTML elements
*are* the architectural forms; an SGML document
conforming to the HTML architecture would be
presented to browsers *as an HTML document*.

Users could prepare documents in whatever DTD they want,
like in your example:

<imho>I like <attname/role/ blah blah blah</imho>

Elements in the source DTD would be linked to
HTML architectural forms (i.e., elements in the HTML
DTD) via an LPD or by #FIXED attributes, like you said:

<!ATTLIST imho
HTML NAME #FIXED p
>
<!ATTLIST attname
HTML NAME #FIXED em
>

Then (the way I see it), the source document gets converted
to HTML *before being sent to the browser*:

<P>I like <EM>role</EM> blah blah blah</P>

This loses information, though: the semantic roles
"imho" and "attname" have been thrown out.
That's what the "role" attribute is for:

<P role=imho>I like <EM role=attname>role</EM> blah blah blah</P>

(The attribute value wouldn't have to come
from the source element name, of course; it could
be specified in an LPD as well.)
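
The conversion step could be sketched roughly as follows.
This is only an illustration, not a real SGML processor: it
assumes the source document has already been normalized to
fully tagged form (no short-tag forms), and the HTML_FORMS
table is a hypothetical stand-in for the #FIXED attributes
or LPD.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the #FIXED "HTML" attributes in the source DTD.
HTML_FORMS = {"imho": "P", "attname": "EM"}

def to_html(source):
    """Rename an element to its HTML form, keeping the old name as "role"."""
    html = ET.Element(HTML_FORMS[source.tag], {"role": source.tag})
    html.text = source.text
    html.tail = source.tail
    for child in source:
        html.append(to_html(child))
    return html

source = ET.fromstring(
    "<imho>I like <attname>role</attname> blah blah blah</imho>"
)
result = ET.tostring(to_html(source), encoding="unicode")
print(result)
```

An element with no entry in the table raises a KeyError
here; a real converter would need a policy for those.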

Browsers can then use the semantic role to make formatting
decisions, taking hints from a style sheet or from <RENDER>
tags in the head.
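
As a sketch of what such a lookup might look like inside a
browser: the style sheet format, the property names, and
the style_for function below are all invented for
illustration, since no style sheet mechanism has been
settled on.

```python
# Hypothetical style sheet keyed by (element, role).
# A role of None holds the element's default formatting.
STYLE_SHEET = {
    ("EM", None): {"slant": "italic"},
    ("EM", "attname"): {"typeface": "fixed"},
    ("P", "imho"): {"indent": "2em"},
}

def style_for(element, role=None):
    """Merge the element's default style with any role-specific hints."""
    style = dict(STYLE_SHEET.get((element, None), {}))
    if role is not None:
        # Role-specific entries refine or override the defaults.
        style.update(STYLE_SHEET.get((element, role), {}))
    return style

print(style_for("EM", "attname"))
```

A browser that doesn't know a given role just falls back to
the plain element style, so unknown roles are harmless.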

Many people won't need a special-purpose DTD at all,
and can just use HTML. The "role" attribute
would still be available to encode their own semantic
data and style specifications.

[ earlier ]

> The way you've done it, you
> could never define your own content models.

No, not in HTML. You can't define them with
the %cextra; hook, either. That's exactly the point: so
browsers don't have to understand new content models.

However, you can define whatever models you
like in the source DTD, as long as the mapping to
HTML yields a valid document. (Which should generally
be the case; most DTDs will be more restrictive than HTML.)

[...]

> So the "role" attribute
> acts like a "style" attribute mostly.

Not entirely; it could also be used in queries
and for other purposes where semantic roles are useful
(e.g., "locate all the EM elements with an ATTNAME role").
But yes, the most common use would probably be to specify
style attributes.
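
A query like that could be as simple as the sketch below.
It assumes the document has been parsed into a tree and
that attribute values are quoted; find_by_role is a
made-up helper name, and the comparison is case-sensitive.

```python
import xml.etree.ElementTree as ET

def find_by_role(root, tag, role):
    """Return every `tag` element under `root` whose role attribute matches."""
    return [elem for elem in root.iter(tag) if elem.get("role") == role]

doc = ET.fromstring(
    '<P role="imho">I like <EM role="attname">role</EM>'
    " and <EM>other</EM> things</P>"
)
matches = find_by_role(doc, "EM", "attname")
print([elem.text for elem in matches])
```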

--Joe English

jenglish@crl.com