Re: The Superhighway Steamroller

Nathan Torkington (Nathan.Torkington@vuw.ac.nz)
Tue, 28 Jun 1994 15:46:20 +1200


Plenty of replies here ... I'm not Michael Hart's best friend, but I
do think he's been given a bum rap. --Nat

Jeff Barry writes:

> Many textual scholars can point out that texts which have been
> printed are not always accurate.
> [...]
> I can't speak for the academic library community, but I haven't ever
> spoken to a colleague who seems to have a high opinion of Mr. Hart
> and his project.

You do seem to rather miss the point. Project Gutenberg arranges the
production and distribution of electronic texts to real people as well
as to academics and librarians :-). I really don't care, when I read
books, if a semicolon or a comma were misused, or if two paragraphs
are accidentally merged into one. I don't even care if a couple of
irrelevant words are missed out --- it's the *message* I'm after.

Sure, the current Project Gutenberg texts are never going to do
wonders for academics. But they made my life a hell of a lot easier
when I did English 105, when I was studying short stories, when I
needed to quote the Bible to someone.

There is nothing stopping anyone from producing ``definitive'' works
from Project Gutenberg texts. The bulk of the entry and formatting
has already been done for you. You can even *sell* the results, and
so long as you don't call it a Project Gutenberg text, nobody cares.

> IMHO, it seems that Hart is driven by his own egoistic pursuits.

Speaking as someone who spent an entertaining day or two in Michael's
company, the last thing he is driven by is his ego.

Willem Scholten <willem@futureinfo.com> writes:

> I would argue that reading Alice in Wonderland as put in ASCII by the
> Gutenberg project, versus laid out properly with mark-up (HTML, SGML,
> PS formatting, whatever) conveys a much more powerful message.

Yup. No disagreement there. That's why Project Gutenberg doesn't say
``you can't mark up or add value to this text''. However, you also
need a graphical terminal to read PostScript, and a specific set of
programs to read an HTML/SGML text. You need nothing but your local
equivalent of more(1) to read plain ASCII. Lowest common denominator.

> The ASCII text files as produced by Gutenberg are basically useless
> for any type of reasonable distribution.

A bold claim, but you don't back it up. I FTPed and line printed one
of Shakespeare's plays and annotated the print-out in preparation for
an English essay. Perfectly usable to me, and it saved me $9 as well
as giving me something *sizeable* to annotate.

Note that I'm not saying raw ASCII is the only way. There are scripts
available by FTP from ftp.ncsa.uiuc.edu, written by me, that convert some PG
texts to motorised books in HTML. I also don't agree with Michael
that WWW is a wasteful or negative technology. I wrote the original
FAQ, remember! :) I'm just saying that ASCII has its place.

nicka@mccmedia.com (Nick Arnett) writes:

> Of late, I've become much more interested in this because we're
> looking at employing refugees, students and others to do much the
> same for the Sarajevo library project. So I'm paying a bit more
> attention, but a little voice has told me not to solicit advice from
> P.G.

``Only the stupid person won't learn from another's mistakes'' :-)

You might not agree with the goals or mechanisms of Project Gutenberg,
but at the very basic level PG has a lot of experience in the scanning
and OCR field. I'd look closely at PG before you start your own
project --- there are a lot of important lessons about the quality and
control of volunteer labour to be learned.

Cheers;

Nat
(I don't want a flame war)