Traversal program

Lou Montulli (montulli@stat1.cc.ukans.edu)
Mon, 12 Apr 93 17:01:08 CDT

Next message: Lou Montulli: "HTML to Postscript"
Previous message: murphy@dccs.upenn.edu: "Where is the WWW client for Macintosh these days?"

Last week there was quite a bit of talk about traversing the Web
to compile a list of all Web documents.

I have tweaked my traversal program a little for WWW documents
and it looks like it will work.

The traversal program only attempts to follow http: links and
keeps a list of all links as they are traversed.
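
For illustration only (this is not the actual traversal program), the
behaviour described above might be sketched in Python roughly as follows.
The names LinkParser, traverse, fetch and max_pages are all hypothetical;
fetch is assumed to be any function that takes a URL and returns the
document text, or None on failure. The max_pages cap is an added
assumption, meant to keep a run from wandering forever in a dense subtree.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the document title and (href, link text) pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._href = None      # href of the anchor currently open, if any
        self._text = []        # text pieces seen inside that anchor
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []
        elif tag == "title":
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "title":
            self._in_title = False


def traverse(start_url, fetch, max_pages=100):
    """Breadth-first walk that only follows http: links.

    Returns a list of (url, title, link_name) tuples, where link_name is
    the text of the link through which the document was first reached.
    """
    seen = {start_url}
    queue = deque([(start_url, "")])
    records = []
    while queue and len(records) < max_pages:
        url, link_name = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable document: skip it
        parser = LinkParser()
        parser.feed(html)
        parser.close()
        records.append((url, parser.title.strip(), link_name))
        for href, text in parser.links:
            # Resolve relative links and drop any #fragment.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).scheme != "http":
                continue  # ignore gopher:, ftp:, news:, etc.
            if target not in seen:
                seen.add(target)
                queue.append((target, text))
    return records
```

Passing fetch in as a parameter means the walk can be tried against a
small in-memory table of pages before it is ever pointed at the network.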

The question I have now is: Should I run it? I'm not entirely
sure what it will do. It will certainly put a big load on
the network. Will it get bogged down in some incredibly dense
subtree? What day of the week would be the best time to run?

Currently the URL, document title, and the link name that referenced
the document are saved in a tab-delimited format. Are there
any tabs in document titles?
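
One way to sidestep the tab question, sketched here in Python under
assumptions of my own (the function names clean_field, format_record and
parse_record are hypothetical, and the three-column layout is just the
one described above), is to collapse any tabs or newlines inside a field
to a single space before writing the record:

```python
def clean_field(value):
    """Collapse every run of whitespace (tabs, newlines, spaces) to one space."""
    # str.split() with no arguments splits on any whitespace run.
    return " ".join(value.split())


def format_record(url, title, link_name):
    """Build one tab-delimited line: URL, document title, link name."""
    return "\t".join(clean_field(f) for f in (url, title, link_name))


def parse_record(line):
    """Split a line written by format_record back into its three fields."""
    url, title, link_name = line.rstrip("\n").split("\t")
    return url, title, link_name
```

Because no field can contain a tab after cleaning, each line always
splits back into exactly three columns. The cost is that the original
whitespace inside a title is not preserved.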

What does everyone else think?

:lou

-- 
  **************************************************************************
  *           T H E   U N I V E R S I T Y   O F   K A N S A S              *
  *         Lou  MONTULLI @ Ukanvax.bitnet         			   *
  *                         Kuhub.cc.ukans.edu               	           *
  *  Nothing difficult,     Ukanaix.cc.ukans.edu    ACS Computing Services *
  *   is ever easy!         	913/864-0436	       Lawrence, KS 66044  *
  *					         			   *
  *  For how we live is so different from how we ought to live that he who *
  *  studies what ought to be done rather than what is done will learn the *
  *  way to his downfall rather than to his preservation.  -Machiavelli    *
  **************************************************************************
