Re: Caching Servers Considered Harmful (was: Re: Finger URL)

Steven D. Majewski (sdm7g@elvis.med.virginia.edu)
Mon, 22 Aug 1994 16:17:03 -0400


On Aug 22, 18:08, "Rob Raisch, The Internet Company" wrote:
>
> Because anyone running a caching server runs the dual risk of presenting
> out-of-date information to their users and can be in direct violation of
> international copyright law.
>
> The first point is by far the most important in my mind. As more and
> more professional publishers come online, you will see this becoming
> much more of an issue.
>
> [ ... various remarks about timeliness of information ... ]

But, isn't that what (HTTP) Expires: is for?
If the information may change over time, then it should be marked so.

There are some documents that the provider knows *WILL* be superseded,
although he doesn't know exactly when. It would seem that a reasonable
procedure would be to set Expires: to now + 1 unit, and that the client
or caching server should use "If-Modified-Since:" to check if it
has *actually* expired. ( Or does having no explicit Expires: imply
this ? Whichever, there should probably be a usage note somewhere to
document the proper procedure. )
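
Concretely, the check I have in mind looks something like this
( a minimal sketch in Python; the cache-entry fields and the function
name are my own illustration, but the header names are the real HTTP
ones ):

    import email.utils, http.client, time

    def revalidate(host, path, cached):
        """Conditional GET: ask the origin server whether our cached
        copy ( dated by its Last-Modified header ) is still current."""
        conn = http.client.HTTPConnection(host)
        conn.request("GET", path,
                     headers={"If-Modified-Since": cached["last_modified"]})
        resp = conn.getresponse()
        if resp.status == 304:          # Not Modified: keep the cached copy
            resp.read()
            cached["checked"] = time.time()   # restart the "now + 1 unit" clock
        elif resp.status == 200:        # changed: replace the cached copy
            cached["body"] = resp.read()
            cached["last_modified"] = resp.getheader(
                "Last-Modified", email.utils.formatdate(usegmt=True))
        conn.close()
        return cached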

> Of course, I can mark my information as being uncacheable, but will you
> honor that request? Your interest is to provide content to your users
> with as little impact on your communications resources as possible. I
> believe that your goals and mine are not compatible.

I think both ends should have a compatible goal: to follow a common
protocol. Is the problem here:
(1) that a new feature/header-field needs to be added to the protocol
( i.e. is Expires: being forced to carry too much of a "semantic load"? ),
(2) that a usage clarification about how to properly USE the protocol
needs to be added ( i.e. the semantics of Expires: need defining ), or
(3) that there are just some broken or misconfigured servers out there?
I have listed these in what I think is increasing order of probability,
but I would consider any of them a more reasonable conclusion than that
caching servers should be considered harmful.

> The copyright issue is the more difficult one. In light of the previous
> argument, you are archiving an original work. This is called "copying"
> in copyright law and if it is done without permission, is against the law.

The data is likely to get "copied" numerous times in transit from the
provider to the client. ( And probably cached on the client - and what
if my client provides cross-session global caching ? )
The only technological fix is for the copyrighted data to be encrypted
and viewable only by the authorized client/customer. ( i.e. cached and
encrypted data is useless for another client with a different key.
This could be an argument for (#1) above. Trying to overload too many
functions onto Expires: invites ambiguous and erroneous results. )
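
To make that concrete, here is a toy sketch in Python ( the XOR
"cipher" is a stand-in for a real one, purely to show the point ):
a cache that stores only ciphertext holds nothing usable by a client
with a different key.

    import hashlib

    def keyed_xor(key: bytes, data: bytes) -> bytes:
        """Toy stream cipher: XOR data against a SHA-256-derived
        keystream. Illustration only, not real cryptography."""
        stream, counter = bytearray(), 0
        while len(stream) < len(data):
            stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(b ^ k for b, k in zip(data, stream))

    document = b"Copyrighted text, encrypted per-customer."
    ciphertext = keyed_xor(b"key-for-customer-A", document)  # what a cache holds

    print(keyed_xor(b"key-for-customer-A", ciphertext))  # right key: readable
    print(keyed_xor(b"key-for-some-other", ciphertext))  # wrong key: garbage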

> (I'm ignoring any arguments that copyright law must be redesigned in light
> of digital distribution. I don't think anyone would disagree with this.
> However, I doubt that copyright is going away and in fact, I expect the
> body of law will be strengthened not diluted.)

I would take this, and the implication of legal culpability on the part
of the server for copyright violation, as an argument for specifying
"reasonable behaviour" or semantics more strongly: to distinguish that
the server (and it's maintainer) can't be responsible for things that
the client doesn't tell it! True - RFC's carry no legal weight in any
courts I know of, but specification in an internet standard, plus a
couple of expert witnesses to testify on what the "community" considers
to be prudent behaviour might just make the difference in a court
deciding on what exactly constitutes negligent behaviour.


> I expect that most professional publishers will not serve content to any
> site which caches unless they can enter into a business relationship with
> that site. Unfortunately, this presents a very interesting N by N
> problem, as publishers and caching servers proliferate.

I don't think the WWW is "ready for prime time" commercial use yet -
better authentication, security, encryption, etc. need to be
standardized, implemented and deployed ( i.e. in *common* use )
first. But I think you are wrong in picking caching servers as the
scapegoat that would prevent it. I think, rather, that they are going
to be a useful and (practically) necessary piece of technology to
bring the Web to the (commercial) masses.

-- Steve Majewski (804-982-0831) <sdm7g@Virginia.EDU> --
-- UVA Department of Molecular Physiology and Biological Physics --
-- Box 449 Health Science Center Charlottesville,VA 22908 --
[ "Cognitive Science is where Philosophy goes when it dies ...
if it hasn't been good!" - Jerry Fodor ]