WAIS indexing with URLs

Kevin 'Kev' Hughes (kevinh@pulua.hcc.hawaii.edu)
Fri, 26 Nov 93 04:02:43 HST


Working from Marc's early documentation, I've put together the following
script using the URL support in freeWAIS 2.0.2. However, using the feature
seems to break the -nocontents flag, so I've commented out the lines that
index images, code, and other non-text files. (Indexing gets painfully slow
if you have a lot of images.)
IMHO it doesn't work all that well. I suppose I prefer seeing the full URL,
but I'd still want it to refer to a real URL, not this WAIS docid stuff.

For now, searches of the index go through

http://www.ncsa.uiuc.edu:8001/www.hcc.hawaii.edu:2010/index

-- Kevin

----

#! /bin/csh

# Tree to index, index location, indexer binary, and base URL.
set rootdir = /www
set index = /usr/local/etc/http/index
set indexprog = /usr/local/etc/http/waisindex
set url = http://www.hcc.hawaii.edu

cd $rootdir
set num = 0
# du prints $rootdir last; tail -r (BSD) reverses the list so it comes first.
foreach pathname (`du $rootdir | cut -f2 | tail -r`)
    echo "Current pathname is: $pathname"
    # The first pass starts a new index with -export; later passes append (-a).
    if ($num == 0) then
        set exportflag = "-export"
    else
        set exportflag = "-a"
    endif
    $indexprog -d $index $exportflag -t URL $rootdir $url $pathname/*.html
    $indexprog -d $index -a -t URL $rootdir $url $pathname/*.txt
    # Disabled: -nocontents seems to break when used with -t URL.
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.ps
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.gif
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.au
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.hqx
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.xbm
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.mpg
    # $indexprog -d $index -a -t URL $rootdir $url $pathname/*.c
    @ num++
end
echo "$num directories were indexed."
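
----

To make the two arguments after -t URL concrete: the assumption here is that
waisindex strips the first one ($rootdir) from the front of each file path and
puts the second ($url) in its place. The sketch below mimics that mapping in
plain sh (not csh, and not freeWAIS code). The helper name path_to_url and the
sample file /www/docs/intro.html are hypothetical, used only for illustration.

```sh
#!/bin/sh
# Hypothetical helper, not part of freeWAIS: mimics the assumed effect of
# "-t URL <strip-prefix> <url-prefix>" on a single file path.
path_to_url() {
    strip=$1    # leading path to remove, e.g. the script's $rootdir
    prefix=$2   # URL to put in its place, e.g. the script's $url
    file=$3     # full path of the indexed file
    case $file in
        "$strip"/*) printf '%s%s\n' "$prefix" "${file#"$strip"}" ;;
        *)          printf '%s\n' "$file" ;;   # outside the tree: left as-is
    esac
}

# Sample path is made up; prints http://www.hcc.hawaii.edu/docs/intro.html
path_to_url /www http://www.hcc.hawaii.edu /www/docs/intro.html
```

If the real behaviour matches this assumption, $rootdir has to be the same
prefix that appears in the paths passed to waisindex, which is why the script
walks `du $rootdir` rather than relative paths.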