> 1. takes a base URL,
> 2. retrieves all URLs in the base document, but
> 3. does not go outside the server (e.g. restricts the set of
> allowed URLs),
> 4. keeps a minimum time between HEADs/GETs,
> 5. runs under Unix (preferably SunOS 4.1 - I have ported software
> to hp-ux/solaris 2.x/dec osf/4.3bsd/aix/ultrix/sgi/linux)

I had better clarify (4): I would like to retrieve all URLs from
a site but, as (4) says, keep a minimum time between two
GETs so as to avoid overloading the server.
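
To make requirements (1)-(4) concrete, here is a rough sketch of the kind
of tool I mean. It is not taken from any of the programs listed in the
answers. All names in it (crawl, extract_links, http_get, min_interval,
max_pages) are my own assumptions for illustration, it only issues GETs
(no HEADs), and "same server" is taken to mean "same host:port as the
base URL".

```python
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (fragments removed) linked from html."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, h)).url for h in parser.hrefs]


def http_get(url):
    """Default fetcher (an assumption): plain GET, decoded leniently."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(base_url, fetch=http_get, min_interval=5.0, max_pages=100,
          sleep=time.sleep, clock=time.monotonic):
    """Visit pages reachable from base_url without leaving its server.

    At least min_interval seconds pass between the starts of two
    consecutive requests (requirement 4).
    """
    host = urlparse(base_url).netloc
    queue, seen, visited = [base_url], {base_url}, []
    last = None  # start time of the previous request, if any
    while queue and len(visited) < max_pages:
        url = queue.pop(0)
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        try:
            html = fetch(url)
        except OSError:  # urllib's URLError/HTTPError are OSError subclasses
            continue
        visited.append(url)
        for link in extract_links(url, html):
            # Requirement 3: only follow links on the base URL's server.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

The fetch, sleep and clock arguments are there only so the loop can be
tried against canned pages without touching a real server; with the
defaults, crawl("http://some.host/") would do real GETs, five seconds
apart.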

Answers to the query:

A). with
the "www-list" scripts (from

B). (MOMspider)
(from (Joshua Polterock))

C). with the
explore script ( (Cookie Monster))

D). Simon Spero <> has a set of programs
for benchmarking.

E). (Robert S. Thau) has written a logfile replay program that
runs on SunOS, reports the mean latency for every 100 transactions,
and handles multiple outstanding requests. Found at

F). www2dot from (Reinier Post); it might not
fulfil requirement (4). Contact Reinier Post. Based on libwww2.

BTW, I will probably try (C). For those interested, I'm running
a gateway (CGI based), which generates HTML pages on the fly.
I'm interested in the above to profile the gateway (written in C).

