Re: Client-side searching proposal

John Franks (john@math.nwu.edu)
Thu, 26 Jan 1995 03:13:35 +0100


In article <199501252331.AA08751@mail.crl.com> you write:
>
>
>The WN server does a server-side interpretation of URL parameters,
>
> /dir/foo.txt;lines=10-20
> ^
>
>would be interpreted as a request for lines 10-20 of "foo.txt".
>(See <URL:http://hopf.math.nwu.edu/docs/range.html>).
>

Actually the WN server does quite a bit more in the way of server
side implementations of the ideas in this thread.

/dir/foo;bytes=111-200

will return bytes 111-200 of any file foo. There are lots of interesting
things one can do with this. (Before you ask, a range from an HTML document
is returned as text/plain.)
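
As a rough illustration of the range parameters above, here is a minimal
sketch in Python. It is not WN's code. The names parse_range and byte_range
are hypothetical, and the sketch assumes that ranges are 1-based and
inclusive, which this message does not actually state.

```python
import re

# Hypothetical pattern for ";bytes=M-N" and ";lines=M-N" URL parameters.
# The syntax WN really accepts may be richer than this.
_PARAM = re.compile(
    r"^(?P<path>[^;]*);(?P<kind>bytes|lines)=(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_range(url_path):
    """Split '/dir/foo;bytes=111-200' into (path, kind, start, end).

    Returns None when the path carries no range parameter.
    """
    m = _PARAM.match(url_path)
    if m is None:
        return None
    return m["path"], m["kind"], int(m["start"]), int(m["end"])


def byte_range(data, start, end):
    """Return bytes start..end of data.

    Assumption: offsets are 1-based and inclusive.
    """
    return data[start - 1:end]
```

A server along these lines would call parse_range on the request path,
read the file, and send the slice back as text/plain.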

But the thing that is close to what is being discussed here is

/dir/foo.html;mark=71,22,30#WN_mark

This returns file foo.html with a tag like <a name=WN_mark>xxxxxxx</a>
surrounding bytes 22-30 of line 71, so that the browser goes to the
correct line and highlights the selected bytes. (It's a little more
complicated than this if the bytes are inside an anchor).
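
The marking step can be sketched roughly as follows, again in Python and
again not WN's actual code. The function name mark_bytes is hypothetical.
The sketch assumes 1-based, inclusive line and byte numbers, and it ignores
the harder case where the bytes fall inside an existing anchor.

```python
def mark_bytes(doc, line_no, start, end, name=b"WN_mark"):
    """Wrap bytes start..end of line line_no in <a name=...>...</a>.

    Assumptions: doc is a bytes object with "\n" line endings, and all
    three numbers are 1-based and inclusive. No attempt is made to
    handle a match that sits inside another anchor or inside a tag.
    """
    lines = doc.split(b"\n")
    line = lines[line_no - 1]
    lines[line_no - 1] = (
        line[:start - 1]
        + b"<a name=" + name + b">"
        + line[start - 1:end]
        + b"</a>"
        + line[end:]
    )
    return b"\n".join(lines)
```

Called with line 71 and bytes 22-30, this would produce the kind of
document that a request for /dir/foo.html;mark=71,22,30#WN_mark returns,
and the #WN_mark fragment then sends the browser to the inserted anchor.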

This is used for full text searches of multi-file HTML documents. A search
returns a list of all the lines with a match, with the matching words
highlighted and linked to their location in the document. The search is done by
the server (there is no way for the client to search the multiple files
of the document) and the list of matches is constructed on the fly.
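
A server-side search of this kind might look roughly like the sketch
below. This is an illustration only, not how WN is implemented. The name
search_files is hypothetical, the files are passed in as a dictionary of
path to text for simplicity, and only the first match on each line is
reported.

```python
def search_files(files, word):
    """Build a list of (link, line) pairs for every line containing word.

    files maps a URL path to the text of that file (an assumption made
    to keep the sketch self-contained). Each link uses the
    ;mark=line,start,end#WN_mark form, with 1-based inclusive numbers
    assumed. Only the first match on a line is linked.
    """
    hits = []
    for path, text in files.items():
        for line_no, line in enumerate(text.split("\n"), start=1):
            col = line.find(word)
            if col == -1:
                continue
            start = col + 1          # convert 0-based index to 1-based
            end = col + len(word)    # inclusive end position
            link = f"{path};mark={line_no},{start},{end}#WN_mark"
            hits.append((link, line))
    return hits
```

The resulting list is what would be turned into the page of matches, one
entry per matching line, each linked to the marked version of its file.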

To see this in ACTION take a look at the index to the HTML 2.0
Specification which is at

<http://hopf.math.nwu.edu/html2.0/docindex.html>

and look up an index item like ACTION.

I think a client-side mechanism like this to mark the first occurrence
of a particular pattern is a great idea. It has some advantages and
some disadvantages over a server-side version like the one in WN.

The big advantage is that clients have already canonicalized the text
for searching, so you don't have to worry about what to do if the match
is in an anchor or in a tag, or *is* a tag.
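
To make the point about canonicalized text concrete, here is a small
Python sketch that searches only the text a client would display. It is
an assumption about how a client might do this, not a description of any
real browser. The names TextOnly and first_match are hypothetical; the
parser is the standard library's html.parser.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the character data of a document, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def first_match(html, pattern):
    """Return the offset of pattern in the displayed text, or -1.

    Tags and attributes are discarded before searching, so a match can
    span an anchor boundary and can never land inside a tag.
    """
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks).find(pattern)
```

With this approach a word split across an anchor still matches, and an
attribute such as action="x" in a form tag is never mistaken for a hit.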

A client-side limitation is that I don't know a good way to return
*all* the matches for a pattern in a document. But that might
be doable. A more serious limitation is that there is no good way to
search multiple files in a single conceptual HTML document. And that is
what the user really wants to do. This is something that I think
WN handles well.

John Franks