> I'd also like to see a "standard" document that a proxy can request
> that will return a list of all documents modified since some date.
> Sites that implement it would see a lot less traffic from proxy
> servers. Sites that don't aren't penalized.
(Incidentally, remote robots would also love such a facility.)
This is quite difficult unless you use a local robot: sorting out the
local virtual paths and the virtual URL spaces provided by CGI scripts
is impossible to do from the configuration files alone.
> This proposal is very similar to the standard "/robots.txt" document
> that robots/spiders/mirrors/etc. use to behave nicely.
For Robot Exclusion this is less of a problem, as you generally
shut out entire URL trees rather than every individual page in a tree,
and you don't care about modification dates.
> The document could be named "/changes.txt" or maybe
> "/cgi-bin/changes". It would probably be computed on-the-fly from a
> document database. I wouldn't expect anybody to do it with a file
> system traversal, but that is certainly possible.
I doubt this would be efficient on-the-fly, unless you have some mechanism
whereby anybody wanting to change a document has to somehow flag this
pro-actively -- which is a lot of administrative overhead.
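To make the cost concrete, here is a rough sketch of the file system
traversal variant. Everything in it is an assumption for illustration:
the function names, the idea that the document root maps one-to-one onto
URL paths (which, as noted above, ignores virtual paths and CGI URL
spaces), and the tab-separated output format. Note that it has to stat
every file under the root each time it is asked.

```python
import os
import time


def changed_since(doc_root, since_epoch):
    """Return sorted (url_path, mtime) pairs for files modified after since_epoch.

    Assumes doc_root maps directly onto the server's URL space, which is
    not true for virtual paths or script-generated URLs.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                mtime = os.stat(full).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > since_epoch:
                rel = os.path.relpath(full, doc_root).replace(os.sep, "/")
                changed.append(("/" + rel, mtime))
    return sorted(changed)


def format_changes(entries):
    """Render entries as 'HTTP-date<TAB>path' lines (a made-up format)."""
    lines = []
    for path, mtime in entries:
        stamp = time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime(mtime))
        lines.append(f"{stamp}\t{path}")
    return "\n".join(lines)
```

A "/cgi-bin/changes" script built this way would call
changed_since(root, cutoff) on every request, so its cost grows with the
size of the tree rather than with the number of changes.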
If you use an unbounded local robot for web maintenance, then you
might as well have it look at and store "Last-Modified"s in an ls-lR
type file. But I wonder how many people would want to do this, and how
> Assuming all of these optimizations:
>   client GETs document, but this is routed via proxy cache
>   if expired(cached_document)
>     proxy GETs/If-Modified "/cgi-bin/changes" at original server
>     if (original_mod_date > cached_mod_date)
>       proxy GETs/If-Modified document at original server
>       put it in cache
>   return cached_document
/cgi-bin/changes would probably change faster and more substantially than
the particular documents you're interested in, so the proxy would keep
re-fetching it. You might well end up negating any positive effect, even
without the server-side problems.
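The quoted flow can be sketched in a few lines to count the requests
involved. This is only a toy model under stated assumptions: FakeOrigin,
fetch_via_proxy, the TTL value, and the dictionary-shaped change list
are all invented here, and the conditional (If-Modified) handling of the
change list itself is left out for brevity.

```python
TTL = 3600  # assumed cache lifetime in seconds (hypothetical value)


class FakeOrigin:
    """In-memory stand-in for the original server (hypothetical)."""

    def __init__(self, docs):
        self.docs = docs  # url -> (body, last_modified)

    def get_changes(self):
        # What "/cgi-bin/changes" might return: url -> last-modified time.
        return {url: mod for url, (_body, mod) in self.docs.items()}

    def get_document(self, url):
        return self.docs[url]


def fetch_via_proxy(url, cache, origin, now):
    """Follow the quoted flow; return (body, requests_made_to_origin)."""
    entry = cache.get(url)
    requests = 0
    if entry is None or now >= entry["expires"]:
        # Extra round trip: consult the site-wide change list first.
        changes = origin.get_changes()
        requests += 1
        cached_mod = entry["modified"] if entry else None
        origin_mod = changes.get(url)
        # Refetch if we have no copy, the list does not mention the URL,
        # or the list says the document is newer than our copy.
        if cached_mod is None or origin_mod is None or origin_mod > cached_mod:
            body, modified = origin.get_document(url)
            requests += 1
            entry = {"body": body, "modified": modified}
        entry["expires"] = now + TTL
        cache[url] = entry
    return entry["body"], requests
```

In this model an expired but unchanged document costs one request (the
change list), while an expired and changed document costs two. A plain
conditional GET on the document would cost one request in both cases, so
any gain has to come from the change list being cheaper to fetch than
the documents it describes.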
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster