Don't worry, this is not a debate about "are robots good or bad".
Recognising that robots exist and will never go away, I have
set up a page devoted to gathering as much info about active
robots as possible: http://web.nexor.co.uk/mak/doc/robots/robots.html
(Please use this exact URL for all accesses).
It contains codes of practice for robot writers, a list of all known
robots in use, and most importantly a proposed standard that will
allow WWW server maintainers to indicate if they want robots to access
their server, and if so which parts.
This proposed standard doesn't require any server/client/protocol
changes, and can provide a partial solution to problems caused by
robots. I am inviting comments on it, but I do hope we can keep the
discussion focused, and not degenerate into a "robots are good/bad"
discussion that won't be resolved.
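To make the idea concrete, here is a minimal sketch of how a robot might honour such an exclusion file before fetching a page. This post does not spell out the file's name or syntax, so the sketch assumes the "User-agent" / "Disallow" record format that the exclusion standard is commonly known by, and it uses Python's standard urllib.robotparser module. The rules, the robot name "ExampleBot", and the host www.example.org are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion rules a server maintainer might publish.
# (Assumed format: "User-agent" / "Disallow" records.)
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(rules_text):
    """Parse exclusion rules from a string; no network access is needed."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def allowed(parser, agent, url):
    """Return True if the named robot may fetch the given URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(EXAMPLE_RULES)

# A well-behaved robot checks each URL before requesting it.
for url in (
    "http://www.example.org/index.html",
    "http://www.example.org/private/notes.html",
):
    verdict = "fetch" if allowed(parser, "ExampleBot", url) else "skip"
    print(verdict, url)
```

Note that nothing changes on the server or in the protocol: the maintainer only publishes a text file, and compliance is entirely up to the robot, which is why this is a partial solution.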
Robots are one of the few aspects of the web that cause operational
problems and grief for people. At the same time they do provide very
useful services. This standard should minimise the problems and may
well maximise the benefits, so I think we need to sort this out as
soon as possible. The major robot writers are in favour of this idea,
so I don't see any fundamental problems.
PS: I do hope this gets out; www-talk has been empty for days...
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster