Nick Arnett/Multimedia Computing Corp.
Thu, 2 Jun 1994

At 12:46 PM 6/2/94 -0500, Daniel W. Connolly wrote:

>(Have you seen Topic from Verity? It's a nice engine for doing
>just this sort of thing... it's natural-language based.)

Oh, yes, indeed I have. I'm familiar with the commercial products that do
things like this. In fact, a lot of my ideas and strategies have come out
of kicking things around with some people who were involved in Topic's
creation.
One of my best friends from high school, many ages ago, manages the
Minerva project at Booz-Allen & Hamilton, which bought ADS, the company
from which Verity spun out.

>>In any event, I *really* would like to see explicit support for this sort
>>of thing in HTML.
>Why? It seems like a separate data format might work better for
>your app.

But the whole idea is to use the Web as a communications mechanism. I
don't particularly want to be off on my own island.

>I can imagine a set of navigation mechanisms based on this sort
>of information. I don't see that HTML is the most convenient
>representation of the information, though.

Heck, no, but the Web's momentum is hard to ignore. I'm really happy to
work in an environment in which dozens of people and companies are trying
to build the best servers and browsers. I don't think there is anything

>If you've got a specific way to attack the resource discovery problem
>with these semantic tags, I'd be very interested.

I do, but it's still in the idea and prototype stage. And I'm not ready to
share it in public, except to say that it's collaborative and uses an
organic model.

>But if you just want a general mechanism to express machine-readable
>semantic information, you're barking up a non-existent tree. From
>what I've learned from the knowledge-representation researchers,
>the best way to exchange arbitrary semantic information between
>domains is to write it out in the natural language. Then the
>domain-specific parser extracts the parts it's interested in.

Ugh. Although I'll admit that that's true, there are some promising lines
of research into things such as merging pre-existing taxonomies into
semantic networks and such.

But where I'm really headed is the creation of an environment in which many
natural language processors, er, I mean people, are involved in the creation
and maintenance of a web of information. That oversimplifies radically,
though. But it's why the Web and HTML are critical -- they're the means to
allow really large numbers of people to participate. Check out my essay,
"Mendicant Sysops in CyberSpace," for a bit of the philosophy behind this.

Perhaps a concrete example will help. My main practical project right now
is the establishment of a net-based library for Sarajevo. Its creation and
maintenance will involve ex-Yugoslavians who are scattered all over the
world, thanks to the diaspora. I'm prototyping tools that will allow these
people to collaborate on structuring and maintaining a set of information
resources that reflects their collective points of view. One of the areas
in which they'll collaborate is in determining how the documents in the
library are tagged for various purposes.

I should emphasize that I don't expect people to collaborate as one big
homogenized group, but as many subcommunities whose work as a whole is
interconnected. Nor do I expect that more than 5-10 percent of the library
"members" will get involved in the collaborative process. But when you're
talking about reaching thousands of people, even if only a few percent get
involved, that's a lot of "natural language processing" power.

Bill Gates talks about "Information at Your Fingertips." I think that "A
Librarian at Your Fingertips" is much more interesting. Not that I think
Bill doesn't understand...


