Re: HTML+ Comments

Steve Heaney (Steve.Heaney@delft.sgp.slb.com)
Tue, 20 Jul 1993 18:48:27 +0200


All,

Firstly, many thanks to Dave Raggett (and others) for taking on what must be
an unenviable task. Inevitably HTML+ is expected to be everything to all
people and resolving conflicting requirements to everyone's satisfaction is
never easy.

I have a few comments to add to those that have gone before about the HTML+
DTD, some of them drawn from my own experience of writing DTDs. I also had a
browse of the OSF DTD while writing this. The HTML+ version I looked at was
dated 24 June 1993.

(The examples may be a bit wobbly - I haven't actually parsed them)

1 Given the simplicity of the current HTML DTD, is it necessary to "ensure
that most existing documents conform to HTML+"? The task of mapping from
HTML to HTML+ should be pretty much the same whether or not HTML is a
subset of HTML+.

2 The paragraph tag as a container. Great. The trouble with having it as
a separator is that it becomes a formatting tag (stick an empty line in
here) rather than semantic markup (the text between the start and end
tags constitutes a paragraph). It initially caused me some confusion,
given that different formatters treated it in different ways.

The same comment (as noted by Klaus Harbo) applies to other EMPTY elements
which logically should have content.

3 Semantic markup using attributes - hmmmm!

I appreciate the logic behind this decision, but I can't help thinking
it's a bit of a kluge. It _does_ mean that anybody can invent their own
"tag" and have a conformant document, but it does nothing to ensure that
the "tag" will be rendered by a given viewer. There will still be the need
to agree on a common set of elements (now transformed to attributes) and we
are pretty much back to where we were before.

Please, let's have the current list of <emph> types as elements in their
own right.

In the same vein, could I suggest that <p> and <quote> are elements
"without style" and a separate element <note> carries a style attribute
taking one of margin, caution, error (and maybe reviewer).

4 Given 1 above, how about introducing sections as containers rather than
H[1-6] being paragraph type elements - i.e.

<!ELEMENT section - - ( title?, (%main;)* ) >

Nested sections then imply the level without the need for explicit tags.
Would this be more difficult for clients to parse?
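
As a rough illustration of the parsing question, here is a minimal Python
sketch in which a client derives the heading level purely from nesting depth.
XML stands in for SGML so that the standard library can parse it; the sample
document, the <doc> wrapper and the function name are all hypothetical, and
only <section> and <title> come from the declaration above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: XML stands in for SGML, <doc> is an invented wrapper.
DOC = """
<doc>
  <section><title>Intro</title>
    <section><title>Background</title>
      <section><title>History</title></section>
    </section>
    <section><title>Scope</title></section>
  </section>
  <section><title>Details</title></section>
</doc>
"""

def heading_levels(elem, depth=0):
    """Yield (level, title) pairs; level is the nesting depth of the section."""
    for child in elem:
        if child.tag == "section":
            title = child.find("title")          # title is optional (title?)
            text = title.text if title is not None else ""
            yield depth + 1, text
            # Sections nested inside this one sit one level deeper.
            yield from heading_levels(child, depth + 1)

levels = list(heading_levels(ET.fromstring(DOC)))
for level, title in levels:
    print("  " * (level - 1) + f"H{level}: {title}")
```

The recursion needs no explicit level attribute at all, which suggests the
extra work for a client is small once it already has a parse tree.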

5 Formatting hints. Given the choice between a client supporting a wider
range of semantic markup and being able to tweak the format of individual
elements I know where I would put my money :-).

If they are to be included however, it would make some sense to bung
them all into one entity to be included as an attribute for any element
that may require special formatting.

<!ENTITY % inline-format
"font CDATA #IMPLIED
size CDATA #IMPLIED
weight (bold | italic) #IMPLIED" >

<!ENTITY % para-format
"justify (left | centre | right) #IMPLIED" >

Superscript and subscript should be elements in their own right.

6 I cannot recall the reasons why line break and hard space were requested.
Given that they are needed (I have a hard time with line break):

- there is a character entity in the "ISO 8879-1986//ENTITIES Numeric and
Special Graphic//.." representing "no break (required) space". (This
entity set is the one defining lt, gt and amp). This would be more
appropriate than the <sp> element.
- line break should be added as a processing instruction <?line-brk>,
which is exactly what it is.
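
To show how little a client would need to do for either suggestion, here is
a small Python sketch. It leans on the standard html.parser module, whose
entity table includes nbsp (which I believe is the name the ISO set gives to
the no-break space), and which hands processing instructions to a handle_pi
hook. The class name, the render helper and the sample text are hypothetical.

```python
from html.parser import HTMLParser

class BreakAwareRenderer(HTMLParser):
    """Hypothetical renderer: collects text, treats <?line-brk> as a newline."""

    def __init__(self):
        # convert_charrefs=True decodes &nbsp; and friends into characters
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_pi(self, data):
        # data is the text between "<?" and ">"
        if data.strip() == "line-brk":
            self.parts.append("\n")

def render(markup):
    renderer = BreakAwareRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.parts)

text = render("Postbus&nbsp;148<?line-brk>2600&nbsp;AC Delft")
print(repr(text))
```

Neither construct needs a new element in the DTD: the entity becomes an
ordinary character and the processing instruction is handled (or ignored) by
whichever formatter understands it.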

7 QUOTE is permitted in the DD element without it being declared anywhere.
Why doesn't sgmls complain?

8 Tables. Dave asks if complex data should be allowed in table fields.
In principle I see no reason why not, but there are other things I'd
rather see implemented before supporting this.

9 The mailto URL. (I know that it is not part of the DTD). Maybe it
could be included as an element <mail> rather than a URL. I think it
would make more sense - it sits awkwardly as a URL.

<!ELEMENT mail - - ( #PCDATA ) >
<!ATTLIST mail
address CDATA #REQUIRED >

where address contains the fully-qualified Internet address. Other
header fields could be added as attributes or elements.

10 Embedded data. Would it be possible to use the SGML NOTATION construct?
In this way, any SGML-conforming renderer would be able to process the
data, given the capability. E.g.

<!NOTATION PS SYSTEM>
<!NOTATION PDF SYSTEM>
<!ELEMENT EMBED - - CDATA>
<!ATTLIST EMBED
id ID #IMPLIED
notation NOTATION (PS|PDF) #IMPLIED >

(I don't know if the mime types would be valid as notation types - I
can't find any info on whether NOTATION takes a defined set of values).

11 Tables, figures, examples etc. should have a display container. Or am I
missing the point and this is the purpose of <panel> or <fig>?

I was thinking of something like:

<!ELEMENT DISPLAY - - ( title?, (fig | eqn | example | tbl)*, caption? ) >

12 Comments, marked sections etc.
Given the (relative) complexity of this DTD it is likely that many people
(myself included) will resort to using an SGML editor if given half a
chance. It is important therefore to support or tolerate as much of the
standard as is reasonably achievable. This should include processing
instructions, comments, marked sections etc.

In case you skip read - most of the comments above revolve around ensuring
that HTML+ conforms to the spirit of SGML:

- markup should wherever possible describe content not format,
- attributes qualify an element, not define its type or content,
- wherever possible physical form should be derived from the markup,
- use what SGML provides,

and some are just niggly points which go to show that I'm a pedantic bugger.

Right, now I'm off to hide for a few days :-)

Steve.

------------------------------------------------------------------------
Steven Heaney

Schlumberger Geco-Prakla
Postbus 148
2600 AC Delft
The Netherlands

Internet: heaney@delft.sgp.slb.com
------------------------------------------------------------------------