Opening the WAIS document-id syntax

Tim Berners-Lee (timbl)
Thu, 26 Mar 92 15:25:12 GMT+0100


> Date: Tue, 24 Mar 92 09:46:21 PST
> From: Jonny Goldman <jonathan@think.com>

Jonny,

This is relevant to the WAIS-FTP work Jim is doing.

Unfortunately none of the WAIS crowd could get to discussions at the IETF -- though
John Curran represented the WAIS side. Those discussions were very interesting.

The data model of WAIS (documents in databases) could be deconstrained to allow
documents themselves to be or contain lists of documents, and for lists of
documents to point to things other than documents in the same database.

This is the way the second part can work. Normally, a search returns a list of
doc-ids, each one (basically) like

/usr/local/lib/wais/mydatabase/fred/myfile.txt

which is in fact a filename. There's a load of other stuff in there which we can
ignore for now. What a WAIS search needs to be able to do, when you are pointing
to files, is to return a pointer to a file in FTP say. We do that in two steps.
First, we recognise that that id is local to the context of a WAIS server on host
myhost and port myport. When the server returns that string, the client
uses knowledge of the context in which it was quoted to expand that to

wais://myhost.dom.net:myport/usr/local/lib/wais/mydatabase/fred/myfile.txt

This is a reference you can quote to anyone as it makes sense anywhere. No context.
I called it a UDI but we'll have to change the name. Document Access Token maybe.
It's like Brewster's proposal but extendable to other protocols. [Yes, WAIS is a
good protocol but there are others. Including name servers and directories which
will be needed for long-lived but movable documents.]
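
A minimal sketch of the expansion step described above, in Python. The function
name expand_doc_id and the port value used in the usage line are illustrative
assumptions, not part of any WAIS client; the only thing taken from the message
is the shape of the result, wais://host:port followed by the local id.

```python
def expand_doc_id(local_id: str, host: str, port: int) -> str:
    """Qualify a server-local doc-id with the context it was returned in.

    Hypothetical helper: host and port are those of the WAIS server that
    returned local_id in its search results.
    """
    # Local ids look like absolute filenames, so make sure exactly one
    # "/" separates the host:port part from the path.
    if not local_id.startswith("/"):
        local_id = "/" + local_id
    return f"wais://{host}:{port}{local_id}"


# Usage (port 210 is an assumption chosen for illustration):
expanded = expand_doc_id(
    "/usr/local/lib/wais/mydatabase/fred/myfile.txt", "myhost.dom.net", 210
)
```

The expanded string no longer depends on which server it came from, which is
what lets it be quoted to anyone.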

Now suppose one day a server returns a doc-id INCLUDING the protocol, host, etc.
For example, your WAIS FTP engine (like the ARCHIE WAIS) returns what are basically
pointers to files. Just now, because of the constraints of the model, it has to
return a part of a file within the database. Suppose we change that, so that
in your case it just returns a doc-id which specifies anonymous ftp access, like:

file://otherhost.com/pub/doc/mydoc.txt

The client has a general retrieval engine which can accept doc-ids in many domains
-- not just WAIS. That allows it to go out over a different protocol to retrieve
the object.
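
The general retrieval engine described above might be sketched like this in
Python. Everything here is hypothetical: split_doc_id, retrieve, and the two
handlers are illustrative names, and the handlers only describe the access they
would perform rather than opening any network connection. The point is just that
the protocol prefix of the doc-id selects which access method the client uses.

```python
from typing import Callable, Dict, Tuple


def split_doc_id(doc_id: str) -> Tuple[str, str, str]:
    """Split 'protocol://host[:port]/path' into (protocol, host, path)."""
    protocol, sep, rest = doc_id.partition("://")
    if not sep:
        raise ValueError(f"doc-id has no protocol prefix: {doc_id!r}")
    host, slash, path = rest.partition("/")
    return protocol.lower(), host, slash + path


# Hypothetical handlers: a real client would speak the protocol here.
def fetch_wais(host: str, path: str) -> str:
    return f"WAIS retrieve {path} from {host}"


def fetch_file(host: str, path: str) -> str:
    return f"anonymous FTP get {path} from {host}"


HANDLERS: Dict[str, Callable[[str, str], str]] = {
    "wais": fetch_wais,
    "file": fetch_file,
}


def retrieve(doc_id: str) -> str:
    """Dispatch a fully qualified doc-id to the handler for its protocol."""
    protocol, host, path = split_doc_id(doc_id)
    handler = HANDLERS.get(protocol)
    if handler is None:
        raise ValueError(f"no handler for protocol {protocol!r}")
    return handler(host, path)


action = retrieve("file://otherhost.com/pub/doc/mydoc.txt")
```

Adding support for another protocol then means registering one more handler,
without touching the WAIS-specific code.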

This is the way WWW and Gopher work. They are open systems -- you can link into
any other system within reason. That's why the fuss about universal document
identifiers. Maybe the WAIS people would like to incorporate them -- that is, just
make sure that the normal WAIS servers return things which are -- like the one
above -- special cases of the more general syntax.

I haven't had much comment from the WAIS side about the UDIs, but I'd like to have
some. (file://info.cern.ch/pub/www/doc/udi1.ps was background for the IETF
discussions.) We plan a small working group hacking out the details before an RFC
is submitted.

> I like the idea of generalized interfaces, customized servers.

You bet!

- Tim BL