Client Caching -- Was: Re: Network Abuse by Netscape?

John Kilburg (john@cephas.ISRI.UNLV.EDU)
Sun, 30 Oct 1994 19:49:15 -0800


>John Kilburg writes:
>> >You also overlook the fact that Netscape's caching mechanism is far
>> >superior
>> >to that of X Mosaic and similar browsers. Did you know that X Mosaic
>> [...]
>> >Netscape will have a persistent disk cache that will dramatically
>> >decrease the total bandwidth requirements. So if anything, Netscape
>> >helps you as a provider provide more information with lower bandwidth
>> >requirements.
>>
>> Just so you know...
>>
>> Chimera has had a disk cache for several months...works nicely.
>
> As had the emacs browser, as now does Arena. Actually, Arena does the
>caching like I used to, so the possibility of a 'shared' disk cache is
>there. How does chimera do its disk cache (haven't had time to download it
>lately).
>
> I like the way arena does it with
>
>/tmp/protocol/hostname/path/to/file
[...]
> This worked really well - the conversion from URL->cachedname was fast,
>but it still ran into the 8.3 filename limit sometimes. Also, what happens
>when you request URLs in this order:
>
>1. http://www.cs.indiana.edu/hyplan -- actually a directory, should have '/'
> at the end, but suppose the server doesn't send a redirect. So it gets
> cached as /tmp/http/www/cs/indiana/edu/hyplan
>
>2. http://www.cs.indiana.edu/hyplan/wmperry.html, which should be cached as
> /tmp/http/www/cs/indiana/edu/hyplan/wmperry.html, but ..../hyplan is a
> regular file.

Chimera uses the MD5 signature of the URL, truncated to 14 characters,
as the cache filename. Someone else gave me the idea, and at the
time I was worried about weird characters ending up in
filenames, so it seemed like a good idea.
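
A minimal sketch of that naming scheme, not Chimera's actual code. The
function name cache_filename is hypothetical, and the hex rendering of
the digest is an assumption, since the post does not say how the
signature is turned into text before it is cut to 14 characters.

```python
import hashlib


def cache_filename(url: str, length: int = 14) -> str:
    """Return a flat cache filename derived from the URL.

    Assumption: the MD5 digest is written as lowercase hex and then
    truncated; the original post only says "truncated at 14 characters".
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return digest[:length]


if __name__ == "__main__":
    # The two URLs from the quoted example map to unrelated flat names,
    # so neither one has to be a directory on disk.
    for url in (
        "http://www.cs.indiana.edu/hyplan",
        "http://www.cs.indiana.edu/hyplan/wmperry.html",
    ):
        print(url, "->", cache_filename(url))
```

Because every name is a flat string of hex digits, the file-versus-
directory clash in the quoted hyplan example should not arise, and no
character from the URL can reach the filesystem.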

Using MD5 may sound like overkill, but it was really simple to do. I
have yet to see a cache name collision (as far as I know) and no one has
complained. It seems fast enough. I also figured that MD5 will end
up in the code anyway, to be used for one of the authentication schemes.

Chimera limits the size of the cache (default 4MB). When space is
needed, it removes the least recently accessed cache files. If a
cache file is old (over 4 hours...a number I pulled out of thin air),
Chimera retrieves the document from the source again. This is
bogus but it is good enough. I read someplace that someone was using:

life_time_of_the_document = current_time - last_change_time

This seems like a good idea. How is it handled by emacs-w3 and Arena?
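
The size limit, the fixed 4-hour cutoff, and the lifetime formula above
can be sketched as follows. This is an illustration only: all names are
hypothetical, the cache is modelled as an in-memory table rather than
real files, and current_time in the formula is assumed to mean the time
the document was retrieved.

```python
MAX_CACHE_BYTES = 4 * 1024 * 1024  # default 4MB limit from the post
MAX_AGE_SECONDS = 4 * 60 * 60      # the 4-hour cutoff from the post


def is_stale(fetched_at, now, max_age=MAX_AGE_SECONDS):
    """Fixed-age rule: refetch once the cached copy is older than max_age."""
    return now - fetched_at > max_age


def heuristic_lifetime(fetched_at, last_change_time):
    """life_time_of_the_document = current_time - last_change_time.

    Assumption: current_time is the moment of retrieval, so a document
    that had not changed for a week is trusted for about a week.
    """
    return max(0.0, fetched_at - last_change_time)


def is_stale_heuristic(fetched_at, last_change_time, now):
    """Lifetime rule: refetch once the estimated lifetime has elapsed."""
    return now - fetched_at > heuristic_lifetime(fetched_at, last_change_time)


def evict_lru(entries, needed, limit=MAX_CACHE_BYTES):
    """Pick cache files to remove so that `needed` more bytes fit.

    entries maps filename -> (size_in_bytes, last_access_time).
    Returns the filenames to delete, least recently accessed first.
    """
    total = sum(size for size, _ in entries.values())
    victims = []
    # Walk the entries from oldest access to newest.
    for name, (size, _) in sorted(entries.items(), key=lambda kv: kv[1][1]):
        if total + needed <= limit:
            break
        victims.append(name)
        total -= size
    return victims
```

With the lifetime rule, a page that changes often is rechecked often,
while a page that has sat untouched for months is rarely refetched.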

Chimera also puts MIME fields at the top of the cache file to provide
information about the contents.
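
One way such a cache file could be laid out, as a sketch. The post does
not describe the actual layout, so the format here (header lines, a
blank line, then the document body, as in a mail message) and the
function names pack_entry and unpack_entry are assumptions.

```python
def pack_entry(fields, body):
    """Build cache file contents: MIME-style fields, blank line, body.

    fields is a dict such as {"Content-Type": "text/html"}; body is bytes.
    """
    header = "".join(f"{name}: {value}\r\n" for name, value in fields.items())
    return header.encode("ascii") + b"\r\n" + body


def unpack_entry(data):
    """Split cache file contents back into (fields, body)."""
    if data.startswith(b"\r\n"):
        # No header fields were stored at all.
        return {}, data[2:]
    # The first blank line ends the header; later ones belong to the body.
    head, _, body = data.partition(b"\r\n\r\n")
    fields = {}
    for line in head.decode("ascii").split("\r\n"):
        if line:
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
    return fields, body
```

Keeping the fields in the same file as the body means the content type
can be recovered without guessing from the hashed filename, which
carries no extension.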

-john