HTML should *NOT* HTTP

Nick Williams (njw@cs.city.ac.uk)
Tue, 31 May 1994 15:39:03 +0100 (BST)


From the recent WWW conference it became evident that many people
confuse the roles of HTML+ and HTTP. Specifically, there was a large
body of opinion in favour of placing type information into HTML: for
example, placing within the HTML+ document information such as "here's a
picture, and the alternative text for the image is blah blah blah", e.g.
<FIG src=/some-image>
If you don't have image viewing capability, you'll see this text
</FIG>

The belief is that clients which cannot display images (or audio, or
whatever) should be able to display something. IMHO, this model
seems to be completely incorrect, as it merely duplicates (badly) the
format negotiation protocol already present in HTTP.

It was even suggested that many different types of alternatives be
inlined within a document (e.g. <alt type=audio/basic>DATA</alt><alt
type=text/plain>Hello world</alt>), which could lead to huge data files
being downloaded, when the user only EVER wants to see one of the types
of data object. The best comparison here is between audio and images.
A visually impaired client will typically only want the audio
descriptions of images and if it has to download the complete image
every time it asks for a soundbite, it will double (at least) the amount
of data the client needs to download.

When you request a URL from a server, you transmit information
specifying exactly what data formats you understand. If you have a line
mode browser, it will not request objects of type image/gif or whatever.
This allows the server to find the best "fit" to the requirements and
return that object and only that object. This also means that the type
of object returned may very well NOT be the same type as determined by
heuristics on the URL extension.
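
As a rough illustration of that "best fit" step, here is a minimal
sketch in Python. It assumes the client's accepted types arrive as a
comma-separated Accept header value and that the server keeps its
variants in its own order of preference. The names parse_accept,
choose_variant and VARIANTS are made up for this example, and quality
values are ignored for simplicity; a real server would do more.

```python
def parse_accept(header):
    """Return the set of media types named in an Accept header value."""
    types = set()
    for item in header.split(","):
        # Drop any parameters (e.g. ";q=0.5"); this sketch ignores them.
        media_type = item.split(";")[0].strip().lower()
        if media_type:
            types.add(media_type)
    return types


def choose_variant(accept_header, variants):
    """Return the first (type, body) variant the client says it accepts.

    `variants` is a list in server preference order. Returns None when
    nothing the server holds is acceptable to the client.
    """
    accepted = parse_accept(accept_header)
    for media_type, body in variants:
        major = media_type.split("/")[0]
        if (media_type in accepted
                or major + "/*" in accepted
                or "*/*" in accepted):
            return media_type, body
    return None


# Hypothetical variants of ONE object, best representation first.
VARIANTS = [
    ("image/gif", "<GIF data>"),
    ("audio/basic", "<audio data>"),
    ("text/plain", "A map of the campus."),
]
```

A line mode browser that sends only text types gets the text variant
and never downloads the image; a graphical client gets the GIF. Either
way exactly one object crosses the wire.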

A case in point: we have several maps on our Web site which are
"clickable"/"ismap'd", allowing users to select objects within the
image. When viewed from a graphical client, this works as expected.
When viewed from a text only client, a menu is given showing textual
descriptions of the objects within the image. It works because
different HTTP methods are used, and because the client will not ask
for an inlined image. If the inline image tag were corrected to be a
generic inline include which could cope with different media types, then
this system would work even more easily. This is a simple system to
maintain, as the ismap data and the menu data come from the same source,
on the fly.
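
To make the "same source" point concrete, here is a small Python sketch.
It is not our actual implementation: the REGIONS table, its rectangle
layout, the URLs and the function names are all invented for
illustration. The idea is only that one table of regions can answer a
click from a graphical client and also produce the menu for a text only
client.

```python
# Hypothetical single source of truth for one clickable map:
# (label, url, (left, top, right, bottom)) in image pixel coordinates.
REGIONS = [
    ("Library", "/library.html", (0, 0, 100, 50)),
    ("Computer Science", "/cs.html", (100, 0, 200, 50)),
]


def resolve_click(x, y, regions=REGIONS):
    """Return the URL of the region containing the click, or None."""
    for _label, url, (left, top, right, bottom) in regions:
        if left <= x < right and top <= y < bottom:
            return url
    return None


def render_menu(regions=REGIONS):
    """Return an HTML list offering the same choices as the image map."""
    lines = ["<UL>"]
    for label, url, _box in regions:
        lines.append('<LI><A HREF="%s">%s</A>' % (url, label))
    lines.append("</UL>")
    return "\n".join(lines)
```

Adding a region to the table updates both the clickable map and the
textual menu, so there is only one thing to maintain.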

This leads to one of the valid points in the argument for placing
alternative data within an object: maintenance. If the objects are
always brought in from the server then it means to some extent that each
alternative must be in a separate object, typed individually. This
could lead to an unhappy webmaster looking after the web, maintaining
many files per document. This may be true, but it is a problem to be
addressed by making a better scheme for editing and serving objects. It
should not be addressed by placing all of the data into the same object
with conditional display clauses: the only way that works is if the user
edits the objects using raw HTML, with "vi" or similar. This is not the
way forward; editors are appearing and will appear with greater
facility for looking after a true hypertext document (rather than merely
understanding HTML) in the future. This is the area in which
alternative object types should be examined.

Nick Williams, Systems Architecture Research Centre, City University,
London, EC1V 0HB. UK.

Web: http://web.cs.city.ac.uk/finger?njw
E-mail: njw@cs.city.ac.uk (MIME and ATK)
Work Telephone: +44 71 477 8551
Work Fax: +44 71 477 8587