Re: The future of meta-indices/libraries

Steve Waterbury (waterbug@epims1.gsfc.nasa.gov)
Thu, 17 Mar 1994 20:33:36 --100


Stan Letovsky writes:

> How could topic indexes be maintained? There are sociological and
> technological components to the answer. The sociological answer is
> that every topic has a curator (group), and an associated network
> host....

This sounds like a good concept, although the actual
implementation will no doubt be determined by the interplay of
the sociological and technological components!

> The astute reader should now ask, How is this different from simply
> having keywords associated with documents ...?
> The answer is that the crucial flaw with that scheme is that
> there is no mechanism for coordination of keyword assignments. You say
> tomato and I say tomatoe .... A topic-curator system would allow
> this problem to be partitioned among a responsible community in a
> nonburdensome manner.

"Curators" of sorts already exist in some areas: standards groups.
IEC TC3 is creating an international standard dictionary of "data
elements" used in the description of electronics, for example.

The important point is that the terms need to be "standardized". The
next important point is that context is often critical to a term's
meaning -- which means that at _least_ an elementary form of
"semantic model" will be needed -- something like an
"entity-relationship" model, in which the terms have their proper
context, and on which the relationships between topic-servers in
different domains can be properly understood to enable cross-domain
queries (okay, kind of wild, but you know it will happen ...).
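
To make the "terms in context" idea concrete, here is a minimal
sketch, not a real standard dictionary. It assumes each term is
recorded together with its domain and mapped to a shared concept
identifier; the table entries, the concept names, and the function
name equivalents() are all hypothetical, chosen only for illustration.

```python
# Hypothetical dictionary entries: (domain, term) -> concept identifier.
# The same word ("lead") maps to different concepts in different domains.
TERM_TO_CONCEPT = {
    ("electronics", "lead"): "conductor-terminal",
    ("electronics", "pin"): "conductor-terminal",
    ("chemistry", "lead"): "element-Pb",
    ("metallurgy", "Pb"): "element-Pb",
}


def equivalents(domain, term, table=TERM_TO_CONCEPT):
    """Return the other (domain, term) pairs that share a concept with the given term."""
    concept = table.get((domain, term))
    if concept is None:
        return []  # term is unknown in this domain
    return sorted(
        key for key, other in table.items()
        if other == concept and key != (domain, term)
    )
```

A cross-domain query would then expand a term through such a table
before asking the topic-servers in the other domains, so that "lead"
in chemistry is never confused with "lead" in electronics.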

Anyway, thanks for sharing that vision, Stan. Great minds rant
alike!

Incidentally, I'm still busily implementing my own pet version. It
uses non-HTML SGML tags to identify the data in a document that
needs to be indexed. Special "agents" would be told which sites to
visit, and would pull the info out of documents carrying the tags
they are looking for. That info would be brought back to a local
database, where the URLs/URNs would be stored along with the indexed
attribute data, so that local queries could be done and the relevant
docs summoned from wherever they live.
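
The indexing step described above might look roughly like the
following sketch. It is an illustration under stated assumptions,
not the actual agent: the tag names "partnumber" and "failuremode"
are hypothetical stand-ins for whatever meta-data tags a document
really carries, the table layout is invented, and fetching documents
from remote sites is left out entirely.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical tag names; the real meta-data tags may differ.
INDEXED_TAGS = {"partnumber", "failuremode"}


class TagExtractor(HTMLParser):
    """Collect the text found inside each of the wanted tags."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.current = None  # wanted tag we are currently inside, if any
        self.buffer = []     # text pieces seen inside that tag
        self.found = []      # (tag, value) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            self.found.append((tag, "".join(self.buffer).strip()))
            self.current = None


def make_db():
    """Create an in-memory database with an assumed (url, tag, value) layout."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE attrs (url TEXT, tag TEXT, value TEXT)")
    return db


def index_document(db, url, text, wanted=INDEXED_TAGS):
    """Store one row per wanted tag instance found in the document text."""
    parser = TagExtractor(wanted)
    parser.feed(text)
    parser.close()
    db.executemany(
        "INSERT INTO attrs (url, tag, value) VALUES (?, ?, ?)",
        [(url, tag, value) for tag, value in parser.found],
    )
    return parser.found


def find_urls(db, tag, value):
    """Local query: which documents carry this tag with this value?"""
    rows = db.execute(
        "SELECT DISTINCT url FROM attrs WHERE tag = ? AND value = ?",
        (tag, value),
    )
    return [row[0] for row in rows]
```

Only the tagged values and the document's URL end up in the local
database; the document itself stays where it lives and is fetched
by URL once a local query has picked it out.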

If anyone is curious, I have put a real Failure Analysis Report
into the format I have in mind:

http://epims1.gsfc.nasa.gov/fa/fa_82713.html

Check the HTML source for the SGML meta-data tags that would be
pulled out (with their instance data) by such an indexing agent.

This scheme is probably best adapted to engineering/scientific data,
but might be useful for other forms also.

BTW, if anyone from Stanford or Lockheed is listening, I would be
very interested in your thoughts, and whether you have any agent
software available or adaptable to this.

Steve Waterbury
WWW Virtual Library: Engineering.
oo _\o
\/\ \
/
____________________________________________ oo ____________
"Sometimes you're the windshield; sometimes you're the bug."