Re: Toward Closure on HTML

Chris Lilley, Computer Graphics Unit (lilley@v5.cgu.mcc.ac.uk)
Wed, 6 Apr 1994 18:51:18 GMT


letovsky-stan@CS.YALE.EDU wrote:

> "Daniel W. Connolly" <connolly@hal.com> wrote

>>NEWLINES, PARAGRAPH BREAKS, AND <P>

>>Folks have asked why the <p> tag is necessary at all -- why can't we just
>>use a blank line like troff and TeX?

If this were a small project just starting out, that would be a valid suggestion
(but a poor one; a paragraph should have a tag just like anything else).

Given however that the Web is in daily use by millions, such a suggestion is
well off the mark.

Yes, there is a problem in that <p> has no </p> and is placed at the end of a
paragraph. OK, it's broken, but the browsers handle it.

They also handle a form which fits in with the way all the other tags work:

<p>this is a paragraph.</p>

and this form is likely to be in HTML+.

>>First, it's too late to do that: there are too many documents with blank
>>lines that don't indicate a paragraph break.

Absolutely.
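
To make the point concrete, here is a minimal sketch of what a "blank line means paragraph break" rule would do to a document that already exists. The function name and the sample list are hypothetical, made up for illustration; this is not how any particular browser works.

```python
import re


def split_on_blank_lines(text):
    """Naive rule: every run of blank lines is treated as a paragraph break."""
    pieces = re.split(r"\n\s*\n", text)
    return [piece.strip() for piece in pieces if piece.strip()]


# A hypothetical existing document: the blank line is only there to make
# the source easier to read, not to end a paragraph.
existing = (
    "<ul>\n"
    "<li>first item\n"
    "\n"
    "<li>second item\n"
    "</ul>\n"
)

chunks = split_on_blank_lines(existing)
for chunk in chunks:
    print(repr(chunk))
```

Under that rule the single list comes out as two separate "paragraphs", which is not what the author of the document wrote.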

>HTML is not yet at the point where it should be regarded
>as cast in stone. An incompatible would-be successor HTML+
>is already on the horizon. Now is the time to consider
>such changes.

This is confusing several issues. Yes, HTML should be frozen as a standard
rather than continually being twiddled with. Hence HTML+; what was learned from
HTML has been fed into it.

But altering HTML at this late stage to be even less SGML compliant would be
such a bad move I am amazed that anyone has suggested it.

Incompatibility is a non-issue; I suspect a browser can tell the difference
between text/html and text/htmlplus.
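
As a rough sketch of that dispatch, assuming the two media types named above: the parser functions here are hypothetical placeholders, not the internals of any real browser.

```python
def parse_html(body):
    # Hypothetical placeholder for an HTML parser.
    return ("html", body)


def parse_htmlplus(body):
    # Hypothetical placeholder for an HTML+ parser.
    return ("htmlplus", body)


# Media types as named in this post; the table itself is an assumption.
PARSERS = {
    "text/html": parse_html,
    "text/htmlplus": parse_htmlplus,
}


def media_type(content_type):
    """Drop any parameters and normalise case: 'Text/HTML; x=y' -> 'text/html'."""
    return content_type.split(";", 1)[0].strip().lower()


def select_parser(content_type):
    """Return the parser for a Content-Type value, or None if it is unknown."""
    return PARSERS.get(media_type(content_type))
```

A document served as text/htmlplus goes to one parser and a document served as text/html goes to the other, so the two formats never have to be told apart by inspecting the markup.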

Now is indeed the time to consider such changes; the defect has been noted and
corrected; the HTML+ DTD specifies <p> ... </p>. Simple, consistent, SGML
compliant. End of problem.

>>Second, not everybody wants it that way: I'd like to be free to stick blank
>>lines in lists and such without introducing paragraph breaks.

Indeed.

> This is a non-issue. LaTeX has a perfectly reasonable approach to
> ignoring extra paragraph breaks in list contexts; use that.

Why?

And why just lists?

And why LaTeX, of all the awful things to pick as an example? If you are happy
writing in LaTeX, do that - and use the Leeds converter to make it into HTML.
But do not suggest that HTML be altered in weird and un-SGML-like ways to fit in
with what you happen to already find familiar. Not everyone uses LaTeX; and as
the Web grows outside the research/academic community, the percentage of LaTeX
users will fall drastically.

>>Third, the mechanism for expressing this in SGML, SHORTREF, introduces
>>significant complexity to parsing HTML. It opens up a can of worms including
>><em/foo/ and other tricky parsing idioms.

I think this is just saying that the suggested "empty line really means a
closing paragraph tag" method is just not SGML and would need unsightly hacks to
express in a DTD. I agree.

>In other words, you would rather have a language that is convenient
>to parse than one that is convenient to use. Big mistake.

You think that having a blank line mean something in one place and not mean it
in other places (lists, where else?) is 'convenient to use'? Come to that, you
think that anything connected with LaTeX is convenient to use? An even bigger
mistake.

Count the number of people using LaTeX in 'the real world' compared to the
number using GUI wordprocessors.

>The
><p> ... </p> construct is a big step in the wrong direction: it makes
>a simple construct like a paragraph, which was already well handled
>by a text-editor, into something onerous

No, it makes it something which is consistent with the way all the other tags
work and is therefore easier to use and understand if you are typing in html by
hand - which is not the only way to do it. How 'onerous' is typing </P> ??

You will be suggesting next that titles are well handled by a text editor:

This is a title
---------------

so that should be in HTML too? ;-)

>no one but a parser-writer
>would view <p> ... </p> as an elegant way to say "this is a
>paragraph." Similarly for <li>, etc.

So you want to do away with <li> too? How does LaTeX do that then?
If people understand that <h1>Title</h1> is a title, understanding that
<p>paragraph</p> is a paragraph does not seem too great a leap.

I think this thread has conflated several distinct points:

1) Current usage

Current use of the <p> tag as a separator is anomalous. This has been sorted in
HTML+.

2) Ease of parsing

Having a consistent, SGML compliant syntax makes documents easy to parse and
generate automatically; altering the spec to allow non-SGML-like forms would be
a backward step.
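
A minimal sketch of what that buys you, using the container form of <p>. It relies on Python's standard html.parser module, which is a lenient tag scanner rather than a validating SGML parser; the class and function names are made up for illustration.

```python
import html
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of every <p> ... </p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None means "not currently inside a <p>"

    def handle_starttag(self, tag, attrs):
        if tag == "p":  # tag names arrive lower-cased
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            text = "".join(self._buffer)
            # Collapse whitespace, including blank lines inside the paragraph.
            self.paragraphs.append(" ".join(text.split()))
            self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def extract_paragraphs(markup):
    collector = ParagraphCollector()
    collector.feed(markup)
    collector.close()
    return collector.paragraphs


def make_paragraph(text):
    """Generate a paragraph element, escaping markup characters in the text."""
    return "<p>" + html.escape(text) + "</p>"
```

Because each paragraph is delimited by its own tags, a blank line inside one is just whitespace: extract_paragraphs("<p>First\n\nparagraph.</p>") gives one paragraph, and generating markup is a matter of wrapping text in the same pair of tags.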

3) Ease of use

People find it easiest to use what they already know. People used to LaTeX naturally
find the forms of that mark-up more natural. People using other things will find
those things more natural. The solution to this is not, however, to turn HTML into
LaTeX - which would only suit one subgroup of users - but to author using
a system you are familiar with and convert or export as HTML. Solutions already
exist for LaTeX, FrameMaker, Microsoft Word and WordPerfect. There is a project
to develop a Motif GUI HTML editor. Ease of use is addressed by better
HTML-producing tools, not by convenience hacks to HTML.

Chris Lilley
+-----------------------------------------------------------------------------+
| Technical Author, ITTI Computer Graphics and Visualisation Training Project |
+-----------------------------------------------------------------------------+
| Computer Graphics Unit, | Internet: C.C.Lilley@mcc.ac.uk |
| Manchester Computing Centre, | Janet: C.C.Lilley@uk.ac.mcc |
| Oxford Road, | Voice: +44 61 275 6045 |
| Manchester, UK. M13 9PL | Fax: +44 61 275 6040 |
| <A HREF="http://info.mcc.ac.uk/CGU/staff/lilley/lilley.html">click here</A> |
+-----------------------------------------------------------------------------+