Re: Draft: Universal Document Identifiers

Tim Berners-Lee (timbl)
Thu, 5 Mar 92 15:25:08 GMT+0100


[Admin: If anyone is missing documents from this discussion which I
have, they are all in a mailbox
file://info.cern.ch/pub/www/doc/udi/discussion.mbox. Some of the
messages were sent to only some of the lists. Also, I mis-spelled
the name of cni-arch.uccvma in my original posting, so some replies
have not gone there. I will not repost them. The original udi paper
is slightly updated now. Same UDI -- no versioning ;-)]

Now, about these USDNs:

> Date: Thu, 5 Mar 92 07:32:50 EST
> From: ses@cmns.think.com

There have been several messages now with a common theme: That what I
called in the udi1 paper a "lasting registered name" is better than
an "address".

Peter Deutsch argues the point at length in
<9203042206.AA12411@expresso.cc.mcgill.ca>, using the term USDN by
analogy with ISBN.

John Curran on <Thu, 27 Feb 92 19:45:42 -0500> argues the same, and
also suggests quoting both registered name and address (which I
wasn't so sure about in case they get out of sync).

I completely agree with Peter and Simon's point of view, and I have
modified the paper to put more emphasis on this. What I obviously
didn't make clear enough is my feeling that:-

1. There may be more than one USDN scheme, just as there are many
physical address schemes.

2. There may be more than two stages: it is an oversimplification to
talk of only a USDN and an address: For example, an ISO standard may
dereference (or as Ed says, "swizzle") to a document produced by the
IETF which may dereference down to a prospero name which may be a
pointer to an FTP file.

3. We can't use USDNs now because they aren't there. We need a
transition strategy.

Therefore, UDIs were supposed to be able to hold _either_ a USDN _or_
a physical address. They weren't intended to get involved with the
discussion of which USDN/ISBN/ISSN/ISDN (?!) scheme is better. So, I
say, by all means define a USDN scheme, then register it as a
possible UDI. If it is good and everybody uses it, everything will end
up with a USDN, and the context will always be USDN documents, so the
usdn: prefix (or whatever) will not in practice be used. I'm all for
the market deciding between protocols.
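
The points above (a UDI may hold either a registered name or a
physical address, and a name may pass through several stages before
it reaches an address) can be sketched roughly as follows. This is
only an illustration: the scheme prefixes, the table entries and the
function names are all hypothetical and are not taken from the udi
paper.

```python
# Illustrative sketch only. Scheme names, table contents and function
# names are assumptions made for this example, not part of any UDI spec.

# Schemes treated here as "physical addresses" (fetchable as they stand).
PHYSICAL_SCHEMES = {"file", "ftp"}

# Hypothetical lookup table: each registered name maps to the next stage.
NAME_TABLE = {
    "iso:example-standard": "ietf:example-doc",
    "ietf:example-doc": "prospero:/example/doc",
    "prospero:/example/doc": "file://ftp.example.org/pub/doc.txt",
}


def split_udi(udi):
    """Split 'scheme:rest' into (scheme, rest)."""
    scheme, sep, rest = udi.partition(":")
    if not sep or not scheme:
        raise ValueError("no scheme prefix in %r" % udi)
    return scheme, rest


def is_physical(udi):
    """True if the UDI's scheme is one we can fetch directly."""
    return split_udi(udi)[0] in PHYSICAL_SCHEMES


def dereference(udi, max_stages=10):
    """Follow a name through as many stages as it takes to reach a
    physical address. Returns the whole chain, starting UDI first."""
    chain = [udi]
    while not is_physical(chain[-1]):
        if len(chain) > max_stages:
            raise RuntimeError("too many stages; possible loop")
        next_udi = NAME_TABLE.get(chain[-1])
        if next_udi is None:
            raise LookupError("cannot resolve %r" % chain[-1])
        chain.append(next_udi)
    return chain
```

With this table, dereference("iso:example-standard") passes through
the ietf: and prospero: names before ending at the file: address,
while a UDI that is already a physical address comes back unchanged
as a one-element chain.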

Simon:

> I'm strongly in favour of the two stage lookup process; X.500 is obvious
> technology, although it is rather heavyweight for personal computers. An
> alternative might be some sort of DNS/archie-like service. These could return
> Tim's UDIs, which could then deliver the goods themselves.

I would say "a server takes x500 UDIs and returns physical UDIs which
deliver the goods themselves", meaning the same thing. (I would
allow it the option of delivering a set of addresses, not just one.)
Yes, x500 is heavyweight so one can have a lighter protocol which
accesses a real x500 engine via a gateway with a large cache.
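
A minimal sketch of such a gateway, assuming the heavyweight
directory can be reached through some function that maps a name to
zero or more addresses. The class name, the stand-in directory and
the example names are all hypothetical; no real x500 client library
is used here.

```python
# Illustrative sketch only. CachingGateway, fake_directory and the
# example names are assumptions; a real gateway would call an actual
# directory service instead of a dictionary.

class CachingGateway:
    """Light front end that caches answers from a heavy name service."""

    def __init__(self, backend):
        self._backend = backend   # callable: name -> iterable of addresses
        self._cache = {}
        self.backend_calls = 0    # counts how often the backend was asked

    def resolve(self, name):
        """Return the set of physical addresses for a name."""
        if name not in self._cache:
            self.backend_calls += 1
            self._cache[name] = frozenset(self._backend(name))
        return self._cache[name]


def fake_directory(name):
    """Stand-in for the real directory engine behind the gateway."""
    table = {
        "x500:/c=CH/o=Example/cn=report": (
            "file://a.example.org/pub/report.txt",
            "file://b.example.org/mirror/report.txt",
        ),
    }
    return table.get(name, ())
```

Asking the gateway twice for the same name reaches the backend only
once. The answer is a set rather than a single address, so a client
can pick whichever copy suits it. This sketch never expires cache
entries, which a real gateway would have to do.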

> Of course, individual information sources should still use local document
> numbers where possible, but should provide a way of mapping from local-id
> to universal-id when needed.

Yes.

> One little question: What should be done about document versions?
> Obviously, different versions of a document should have different
> USDNs, but should there be a simple way to compare USDNs modulo
> versions?

Good point. What about versions which split? A great spin-off of
having versions available is that you can refer to a line number in
them. A line number in a document which is not frozen is useless.
[This solves a recurring problem in hypertext systems, when one wants
to link to part of a document to which one has no write access, and
which may change].
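
One way to picture comparison "modulo versions", and line references
that are only allowed into a frozen version, is the sketch below. The
";version" suffix and the "#line=" form are invented purely for this
example; nothing here is a proposed syntax.

```python
# Illustrative sketch only. The ';version' suffix and '#line=' form are
# assumptions for this example, not a proposed UDI or USDN syntax.
from typing import NamedTuple, Optional


class VersionedName(NamedTuple):
    base: str                 # the name with any version removed
    version: Optional[str]    # None means "whatever is current"


def parse(name):
    """Split 'base;version' into its parts; the version is optional."""
    base, sep, version = name.rpartition(";")
    if not sep:
        return VersionedName(name, None)
    return VersionedName(base, version)


def same_document(a, b):
    """Compare two names modulo version."""
    return parse(a).base == parse(b).base


def line_reference(name, line):
    """Build a reference to a line, but only within a frozen version."""
    if parse(name).version is None:
        raise ValueError("line references need a frozen version")
    return "%s#line=%d" % (name, line)
```

Here "usdn:report;1" and "usdn:report;2" compare equal modulo
version, and a line reference into the unversioned "usdn:report" is
refused. Versions are kept as strings so that a version which splits
could carry a label such as "2.1", though how splits ought to be named
is left open.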

> Here are some suggestions.. Eat hot ASN, Cultural Cringer.
> [...]

We must be careful not to reinvent the wheel: if the USDN problem is
the same as the phone book problem (which it seems to be) then we
should pick up on x500.

An important thing about x.500 is that it was designed to scale (I
hope!). By contrast as Ed says:

| Date: Wed, 04 Mar 92 23:52:05 -0500
| From: Edward Vielmetti <emv@msen.com>
| [...]
| ISBN is hierarchical so you can stamp out your own
| unique ID's; ISSN (international standard serial number) has
| a central cataloging authority.

and I doubt whether either of those will scale to allow document
publishing on the net by every kindergarten child etc etc twice a
minute. That's why I assume x500 is best in theory at least. But tell
me I'm wrong.

Ed also mentions message-ids which are after all unique. The trouble
is, there's no way of looking up where to find them.

Tim

__________________________________________________________
Tim Berners-Lee timbl@info.cern.ch
World Wide Web initiative (NeXTMail is ok)
CERN Tel: +41(22)767 3755
1211 Geneva 23, Switzerland Fax: +41(22)767 7155