Re: Document ids (was Archie, WWW access directly to files)

Tim Berners-Lee (timbl)
Wed, 19 Feb 92 10:41:35 GMT+0100

Date: Tue, 18 Feb 92 23:39:44 -0500
From: Edward Vielmetti <>

> Throughout the years people have used different ways to
> describe files available for anonymous ftp. that has never
> been standardized ever. no reason to believe it ever will
> be.
Is there a smiley for a big sigh?...

> If we are going to come up with a special kind of document
> that refers to a new document type that has the semantics
> of "pointer to file available for anonymous ftp", then it
> should be assigned a WAIS type tag, described, and specified.
> I'd suggest the tag AFTP. Someone write a spec, we'll all
> write code, & be done with it. (There's plenty of data after
> all.) It would be better to do that rather than to use a TEXT
> type tag and bicker about the format.

This Archie-WAIS-WWW gateway has to get around the fact that the doc-id in
the search response is not the doc-id of the file, it's the id of a
line in the site listing which refers to the file. However, one
wants to jump straight to the file, rather than to the site listing.
For this reason, the gateway throws away the wais doc-id and
generates an id for the file itself from the headline. If the doc-id
itself were that of the file (in any format), that would be cleaner of
course, as the headline could be in any human readable format. [Would
that be easy, Kean?]
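
A minimal sketch of that headline-to-id step, for illustration only. The
headline layout ("host", whitespace, "path") and the "file://host/path"
form of the generated id are assumptions made for this example, not the
format the gateway actually uses; the function name is hypothetical too.

```python
def file_id_from_headline(headline, scheme="file"):
    """Build an id for the file itself from a search-result headline.

    Assumes (hypothetically) that the headline is 'host path'.
    """
    parts = headline.split(None, 1)
    if len(parts) != 2:
        raise ValueError("no host and path in headline: %r" % headline)
    host, path = parts[0], parts[1].strip()
    # Normalise the path so the id always has exactly one leading slash.
    if not path.startswith("/"):
        path = "/" + path
    # Host names are case-insensitive, so fold them to lower case.
    return "%s://%s%s" % (scheme, host.lower(), path)
```

The point of the sketch is that the id depends on the headline text, so
any change to the headline format breaks it. A doc-id that named the
file directly would remove that dependency.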

> I don't think it would be hard for the WWW gateway to WAIS to
> do special things to documents if they had a different type,
> and then use that to convert AFTP type documents to WWW format.
> Ditto gopher, archie, etc. clients.

It would be possible, sure. Do we want to have to access an AFTP type
document just to get a pointer to an FTP site? This takes time, I'd
prefer to skip that step.

> If it's TEXT, on the other hand, it can be *anything*. Please
> don't overload the semantics of the name of the server or the
> accidental formatting of the contents of the document. I would
> like to create AFTP records to stick into many servers.

I agree that overloading the database name is horrible! It's a hack to
show what is possible. You can only do it cleanly if you have
universal document ids of some form or other.

Sure, clients and gateways can convert UDI formats -- avoids the
bickering but not as cool as having a common format. (Need that
smiley again!)

[BTW, if you're going to have an AFTP file format for pointing to
aftp sites, will you also need a GOPH file format for pointing to
gopher sites, and a NEWS file format for pointing to newsgroups...?
Suppose you do have some universal id scheme. Then you could have one
format for a file of pointers. Using the SES filter system, indexing
that file could (if it looked like a README for example) retrieve the
referenced document and index the actual document rather than just
the name.]
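
To illustrate the single-format idea, here is a rough sketch of reading
one file of pointers that mixes ftp, gopher and news references. The
one-id-per-line layout, the "#" comment convention, the example ids and
the function name are all assumptions for the example; the id syntax is
simply whatever Python's urllib.parse accepts, not a proposed standard.

```python
from urllib.parse import urlsplit

def read_pointers(text):
    """Return (scheme, host, path) for each id in a file of pointers.

    Assumes (hypothetically) one id per line, with blank lines and
    lines starting with '#' ignored.
    """
    pointers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = urlsplit(line)
        if not parts.scheme:
            raise ValueError("not a universal id: %r" % line)
        pointers.append((parts.scheme, parts.netloc, parts.path))
    return pointers

# Hypothetical pointer file: three access methods, one format.
example = """\
# pointers to the same kind of thing on different services
ftp://ftp.example.org/pub/doc/README
gopher://gopher.example.org/1/docs
news:comp.infosystems
"""
for scheme, host, path in read_pointers(example):
    print(scheme, host, path)
```

One reader handles all three kinds of pointer, and an indexer could
dispatch on the scheme to fetch the referenced document.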


Tim Berners-Lee
World Wide Web initiative (NeXTMail is ok)
CERN Tel: +41(22)767 3755
1211 Geneva 23, Switzerland Fax: +41(22)767 7155