URLs for trees.

hallam@dxal18.cern.ch
Thu, 15 Sep 94 16:07:06 +0200


>It is rather unclear to me what you mean by "in the directory
>tree", especially in the context of "all protocols". What are the
>semantics when "/hallam" is not a simple directory, but dynamically
>generated by a script or so? Do you fail, or do you run a local robot
>and pass it all the HTTP headers you got from the client? The latter
>would be neat, but an enormous overhead.

If /hallam is not a simple directory then the script can choose to do
whatever it likes. For some scripts it actually makes sense to allow such
URLs; for example, a mail server has an implicit hierarchy. I don't see this
as a problem, since there are many files for which recursive descent would
not be appropriate - i.e. anything that is not a directory.

I don't want extra methods; they cannot be used without a new browser. I
can ask the M-object to get the URL /hallam/* and it will do it.

There is a problem with the /** convention: how does it combine with
other patterns?

/*.html is all the files with a .html extension

/**.html is what?

/m*.html is all files matching the pattern
/**m*.html is what?

or should we have /*/m*.html? This seems to work better.

So this would mean changing the URI spec to state that /*/ indicates a
recursive descent of directories.
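
To make the proposed /*/ semantics concrete, here is a rough sketch in
modern Python (nothing the server actually runs; the function name and
the exact interpretation are mine) of how such a pattern might be expanded
against a document root:

import fnmatch
import os

def expand_url_pattern(doc_root, url_pattern):
    # "/*/X" is read as "match X recursively in every subdirectory", as
    # proposed above; a plain leading "/" keeps the match to the top
    # level. Both the convention and this helper are illustrative only.
    recursive = url_pattern.startswith("/*/")
    leaf = url_pattern.rsplit("/", 1)[-1]          # e.g. "m*.html"
    if recursive:
        matches = []
        for dirpath, _dirs, files in os.walk(doc_root):
            matches += [os.path.join(dirpath, f)
                        for f in fnmatch.filter(files, leaf)]
        return matches
    return [os.path.join(doc_root, f)
            for f in fnmatch.filter(os.listdir(doc_root), leaf)]

So expand_url_pattern(root, "/*/m*.html") returns every m*.html anywhere in
the tree, while expand_url_pattern(root, "/m*.html") stops at the top level.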

This sort of matching is actually there at the moment in the configuration
side of the server.

>Yeah, like robots and mirrors -- will people really allow this on
>their servers, even if you recursively check modification dates
>against an If-Modified-Since? I certainly won't without an out-of-band
>bilateral agreement with clients.

Absolutely! I was thinking it would be neat to use it to synchronise the files
between all the development platforms here; NFS and RCP have severe
limitations.
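
Per file, that synchronisation is just a conditional GET. A minimal sketch
in modern Python, assuming the If-Modified-Since check mentioned in the
quote (the host, path and helper name are made up for illustration):

import http.client
from email.utils import formatdate

def fetch_if_newer(host, path, local_mtime):
    # Conditional GET: the server replies 304 if the remote copy has not
    # changed since local_mtime (seconds since the epoch), so unchanged
    # files cost one round trip and no body.
    conn = http.client.HTTPConnection(host)
    conn.request("GET", path, headers={
        "If-Modified-Since": formatdate(local_mtime, usegmt=True)})
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    return None if resp.status == 304 else body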

It would not be the sort of thing that you would want everyone doing,
but we have enough security schemes, access controls, etc. We can always
develop some sort of ACL.

I would not want to allow a robot to do a GET on the whole filestore, but I
might allow a HEAD or some sort of analysis to be done. This might well be
cached. It would mean that a robot could get all the information it wants
in a single GET and not need to do multiple GETs.

So the robot would ask :-

GET /*/*?analyse HTTP/1.1
Accept: multipart/mixed

and get back

209 You are pushing it mate, but OK, just this once.
Content-Type: multipart/mixed
Content-Length: 69 (it's long mate)

<list of URCs for all the URLs you care to name>

On the other hand if the robot asks

GET /*/* HTTP/1.1
Accept: multipart/mixed

the response is:

502 Insufficient resources. What do you think I am? Your personal slave?
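
For what it is worth, here is a rough sketch of the robot's side of the
first exchange. The ?analyse query, the wildcard path, the 209 reply and
the multipart body are all just the proposal above, not anything a server
actually implements:

import http.client
from email.parser import BytesParser
from email.policy import default

def fetch_tree_analysis(host, pattern="/*/*"):
    conn = http.client.HTTPConnection(host)
    conn.request("GET", pattern + "?analyse",
                 headers={"Accept": "multipart/mixed"})
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    if resp.status >= 400:
        # e.g. the "insufficient resources" refusal above
        raise RuntimeError("server refused bulk request: %d" % resp.status)
    # Split the multipart/mixed body into one part per document described.
    msg = BytesParser(policy=default).parsebytes(
        b"Content-Type: " + resp.getheader("Content-Type").encode()
        + b"\r\n\r\n" + body)
    return [part.get_payload(decode=True) for part in msg.iter_parts()]

One GET, one response, and the robot has a part per document instead of
hammering the server with individual requests.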

Are we making progress here?

Phill.