Re: The spec evolves...

Dan Connolly (connolly@pixel.convex.com)
Fri, 04 Dec 92 18:07:49 CST


>Is there an SGML reason (apart from a W3 reason) not to also recommend
>that we do a
> <A HREF="ftp://wuarchive.wustl.edu:/graphics/gif/f/fishies"
> CONTENTTYPE="image/gif">
> This is a link to a picture of some fishies.</A>
>where the CONTENTTYPE matches the MIME/IANA registry of same? This
>would be a simple enough way to stick in links to graphics.

There's no SGML reason. The reason I didn't generalize to arbitrary
MIME entities is that the A tag has never had those semantics, and
it would be problematic to introduce them now.

Imagine what would happen if you fed that sample to the current linemode
browser: it would gladly ftp to wuarchive and barf gif data on
your screen.

This is not so much of a problem as long as the referent entity
is some subtype of text/* -- that's the reason for the two-level
hierarchy of mime types in the first place.
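
As a rough illustration of that point, here is a minimal sketch of how a
text-only client might decide whether to display an entity or refuse it,
using only the top-level MIME type. The function name can_display_inline
is hypothetical; it is not taken from the linemode browser or any W3 code.

```python
def can_display_inline(content_type: str) -> bool:
    """Return True if a text-only client could reasonably show the entity.

    Assumption: any subtype of text/* is safe to put on the screen, and
    everything else (image/gif, audio/basic, ...) is not.
    """
    # Drop parameters such as "; charset=us-ascii", then split type/subtype.
    bare_type = content_type.split(";")[0].strip().lower()
    major, _, _subtype = bare_type.partition("/")
    return major == "text"
```

With a check like this, a client handed the fishies link would see
"image/gif" and could decline to fetch it, or hand it to some other viewer,
rather than dumping the bytes on the terminal.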

I'm trying to keep up with all sorts of HTML ideas. Some things can be
added to html.dtd without significant changes to W3 code (like adding a
BLOCKQUOTE tag for a new paragraph style). But for things that will
require changes to the architecture, I'm developing a separate DTD from
the descriptive html.dtd.

First, I'm suggesting a change in terminology. The representation
of a node, which used to be called a document, and is sometimes
now called a resource (e.g. Universal Resource Locator), should
be called an Entity. This coincides with the SGML and MIME
usage of the term for "a unit of retrieval."

Then the term "document" is not used for a unit of retrieval.
The WAIS protocol, for example, allows you to retrieve individual
"chunks" -- paragraphs, lines, etc. The term "entity" is well
suited to these chunks.

Instead, a "document" is a collection of entities that share
some context. This context is what the client uses to translate
relative URLs into absolute URLs. So the document that a node
belongs to consists of all the nodes you can reach from that node
by following only local links (i.e. a maximally-connected subgraph
of the web).
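
A minimal sketch of that definition, under some loudly stated assumptions:
a "local link" is taken to be a relative URL (no scheme, no host), links
are followed only in the forward direction, and the web is given as a
plain mapping from each node's absolute URL to the hrefs it contains. The
names is_local, document_of and links are hypothetical, chosen only for
illustration.

```python
from collections import deque
from urllib.parse import urljoin, urlsplit


def is_local(href: str) -> bool:
    """Assumption: a link is local when it is a relative URL."""
    parts = urlsplit(href)
    return not parts.scheme and not parts.netloc


def document_of(start: str, links: dict[str, list[str]]) -> set[str]:
    """Collect every node reachable from start by following only local links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for href in links.get(node, []):
            if not is_local(href):
                continue  # a link out to some other work; not part of this document
            # The node's own URL is the context used to make the link absolute.
            target = urljoin(node, href)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
```

Starting from any node of a work, document_of returns the nodes of that
work and stops at every absolute link, which is the distinction between
local and outside links drawn here.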

This allows the author to differentiate between links among the
nodes of the corpus s/he's writing and links out to other works.