Re: CGI server indexing with WAIS

Rob McCool (robm@ncsa.uiuc.edu)
Fri, 17 Dec 1993 05:03:33 -0600


/*
* Re: CGI server indexing with WAIS by Tony Sanders (sanders@BSDI.COM)
* written on Dec 16, 6:14pm.
*
* > Hi, gang, I've taken Tony Sanders' PERL script which uses freeWAIS to search
* > an index of a server and ported it to CGI. What this means is that if you
* > index your HTTP server with wais, you can use this script to search it.
* Neat, please mail me a copy (of your wais.pl).

Attached. I'm not a serious perl user so my techniques may not be the best,
but it appears to work.

* I'll pick up 1.0 soon and take a peek at it. Hopefully sometime early
* next year I'll have more free time to hack on Plexus and make it
* CGI compliant. Great work BTW. I'm glad we did this early on
* so folks will have plug-and-play servers for many things.

I'm hoping it takes off. I wish I had some time to go and update the
documentation; I'm getting truckloads of confused questions because the CGI
spec assumes a fair bit of previous knowledge.

* One of my first projects when I get back is to make a perl package
* of support functions for people to use to write CGI compliant scripts
* in perl (you probably have the same thing for C). Basically this
* just means packaging up a lot of the functions I already have and
* making some minor changes.
*/

I just got a reference to a perl CGI library from someone, but I haven't had
a chance to look at it yet. It's at http://www.bio.cam.ac.uk/cgi-src/cgi-lib.pl
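
For anyone wondering what a CGI support function looks like, here is a
minimal sketch. It is not taken from cgi-lib.pl or from Tony's package; the
name decode_isindex is made up for illustration. It assumes the usual CGI
convention for ISINDEX queries: words are joined with "+", special characters
are sent as %XX escapes, and a query containing "=" is form data rather than
a search. The attached wais.pl doesn't need this itself, since the server
hands it the already-decoded words in @ARGV.

```perl
#!/usr/bin/perl
# Hypothetical helper for illustration; not part of cgi-lib.pl or wais.pl.
use strict;
use warnings;

# Split a raw ISINDEX query string into decoded search words.
sub decode_isindex {
    my ($raw) = @_;
    return () unless defined $raw && length $raw;
    return () if $raw =~ /=/;                    # form-style query, not ISINDEX
    my @words = split /\+/, $raw;                # "+" separates the words
    for (@words) {
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;     # undo %XX escapes
    }
    return @words;
}

my @words = decode_isindex("environment+AND+cgi");
print join(" ", @words), "\n";
```

Run by hand, this prints the three words separated by spaces, which is the
same form wais.pl builds in $pquery.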

--Rob

#!/usr/local/bin/perl
#
# wais.pl -- WAIS search interface
#
# $Id$
#
# Tony Sanders <sanders@bsdi.com>, Nov 1993
#
# Example configuration (in local.conf):
# map topdir wais.pl &do_wais($top, $path, $query, "database", "title")
#

$waisq = "/usr/local/bin/waisq";
$waisd = "/u/Web/wais-sources";
$src = "www";
$title = "NCSA httpd documentation";

sub send_index {
    print "Content-type: text/html\n\n";

    print "<HEAD>\n<TITLE>Index of ", $title, "</TITLE>\n</HEAD>\n";
    print "<BODY>\n<H1>", $title, "</H1>\n";

    print "This is an index of the information on this server. Please\n";
    print "type a query in the search dialog.\n<P>";
    print "You may use compound searches, such as: <CODE>environment AND cgi</CODE>\n";
    print "<ISINDEX>";
}

sub do_wais {
    # local($top, $path, $query, $src, $title) = @_;

    # With no search words on the command line, send the search form instead.
    unless (@ARGV) { &send_index; return; }
    local(@query) = @ARGV;
    local($pquery) = join(" ", @query);

    print "Content-type: text/html\n\n";

    # Forking open: the child execs waisq, the parent reads its output.
    open(WAISQ, "-|") || exec ($waisq, "-c", $waisd,
                               "-f", "-", "-S", "$src.src", "-g", @query);

    print "<HEAD>\n<TITLE>Search of ", $title, "</TITLE>\n</HEAD>\n";
    print "<BODY>\n<H1>", $title, "</H1>\n";

    print "Index \`$src\' contains the following\n";
    print "items relevant to \`$pquery\':<P>\n";
    print "<DL>\n";

    local($hits, $score, $headline, $lines, $bytes, $type, $date);
    while (<WAISQ>) {
        /:score\s+(\d+)/ && ($score = $1);
        /:number-of-lines\s+(\d+)/ && ($lines = $1);
        /:number-of-bytes\s+(\d+)/ && ($bytes = $1);
        /:type "(.*)"/ && ($type = $1);
        /:headline "(.*)"/ && ($headline = $1); # XXX
        /:date "(\d+)"/ && ($date = $1, $hits++, &docdone);
    }
    close(WAISQ);
    print "</DL>\n";

    if ($hits == 0) {
        print "Nothing found.\n";
    }
    print "</BODY>\n";
}

sub docdone {
    if ($headline =~ /Search produced no result/) {
        print "<HR>";
        print $headline, "<P>\n<PRE>";
        # the following was &'safeopen
        open(WAISCAT, "$waisd/$src.cat") || die "$src.cat: $!";
        while (<WAISCAT>) {
            # NOTE: $top is never set in this CGI port, so it expands to "".
            s#(Catalog for database:)\s+.*#$1 <A HREF="/$top/$src.src">$src.src</A>#;
            s#Headline:\s+(.*)#Headline: <A HREF="$1">$1</A>#;
            print;
        }
        close(WAISCAT);
        print "\n</PRE>\n";
    } else {
        print "<DT><A HREF=\"$headline\">$headline</A>\n";
        print "<DD>Score: $score, Lines: $lines, Bytes: $bytes\n";
    }
    $score = $headline = $lines = $bytes = $type = $date = '';
}

eval '&do_wais';
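
The record parsing in do_wais can be tried without waisq installed. The
sketch below is an assumption-laden illustration, not part of the original
script: parse_hits is a made-up name, and the sample text is invented to
match the regexes above rather than captured from a real waisq run. It
follows the same convention as the loop in do_wais, where a :date line
closes one record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample text, shaped only to match the regexes in wais.pl.
# It is NOT real waisq output.
my $sample = <<'END';
 :score 1000
 :number-of-lines 42
 :number-of-bytes 1234
 :type "HTML"
 :headline "/docs/cgi.html"
 :date "931217"
 :score 500
 :number-of-lines 10
 :number-of-bytes 300
 :type "HTML"
 :headline "/docs/env.html"
 :date "931216"
END

# Collect one hash per record; a :date line ends the current record.
sub parse_hits {
    my ($text) = @_;
    my (@hits, %cur);
    for (split /\n/, $text) {
        $cur{score}    = $1 if /:score\s+(\d+)/;
        $cur{lines}    = $1 if /:number-of-lines\s+(\d+)/;
        $cur{bytes}    = $1 if /:number-of-bytes\s+(\d+)/;
        $cur{headline} = $1 if /:headline "(.*)"/;
        if (/:date "(\d+)"/) {
            push @hits, +{ %cur };   # copy the finished record
            %cur = ();               # start the next one clean
        }
    }
    return @hits;
}

for my $h (parse_hits($sample)) {
    print "$h->{headline}: score $h->{score}, $h->{lines} lines, $h->{bytes} bytes\n";
}
```

Returning a list of records instead of printing inside the loop makes the
parsing easy to check on its own, at the cost of holding all hits in memory.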